Amylase and the Power of Molecular Biology

While preparing for a presentation recently, I came across two papers (here and here--they are old enough to be available for free). I think they are rather interesting, so I'm going to do my best to tell you about them in layman's terms, although it will help if you have a basic grasp of biology.

Amylase is an enzyme, made in the pancreas of all vertebrates, that helps to break down complex sugars such as starch. It is also made in the salivary glands of some mammals. The human genes that code for amylase have an interesting story to tell.

This image is a schematic representation of the DNA just in front of the human amylase genes:

The first thing to note is that there are five genes: AMY1A, AMY1B, AMY1C, AMY2A, and AMY2B. The AMY1 genes are all expressed in the saliva. Only one representation accompanies them because they are all essentially the same. The AMY2 genes are expressed in the pancreas. However, the DNA in front of the two AMY2 genes differs, so each gets its own schematic. Let's look at each structure in more detail.

AMY2B: The arrow at a right angle shows where transcription starts, which means this is where the cellular machinery starts reading the genetic code for making the protein. It is labeled "Pan," indicating that this occurs in the pancreas. To the left is a shaded box labeled "gamma-actin". This is a pseudogene that inserted just in front of the amylase gene. (Pseudogenes are copies of genes that are re-inserted into the genome, but lose coding function due to mutation.) The letters and arrows underneath represent primers used in PCR--they need not concern us, so ignore them.

AMY1A/B/C: This schematic is similar to the one I just described. In this case, transcription starts a little upstream, and the "Sal" indicates expression in the saliva. The shaded box again indicates the presence of the gamma-actin pseudogene, but it has been interrupted by the insertion of an endogenous retrovirus. As part of their replication cycle, retroviruses (such as HIV) integrate their genomes into the host genome, and the integrated copy is flanked by Long Terminal Repeats (LTRs). If the integration occurs in a germline cell, the viral genome will be passed on to progeny as part of the host genome. In other words, this viral genome is now part of your DNA. Over time, mutations occur which can disable the viral proteins. Evidently a retrovirus, whose proteins have likely lost function, integrated in the middle of the gamma-actin pseudogene.

AMY2A: This schematic is similar to the first two (this gene is expressed in the pancreas) except that most of the retrovirus is gone--there is just one LTR in its place. The two LTRs are made by the virus, and are identical to each other at integration. Because of this, sometimes homologous recombination between the integrated LTRs occurs which results in the removal of most of the viral genome, leaving only a solo LTR behind.
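For readers who think better in code, here is a toy sketch of how LTR recombination turns an AMY1-style arrangement into an AMY2A-style one. It treats the DNA as a simple list of labeled pieces. The function name and the labels are my own inventions for illustration, and "gag", "pol", and "env" are just the standard retroviral genes standing in for "the rest of the viral genome"; none of this is taken from the papers.

```python
def recombine_ltrs(locus):
    """Collapse everything from the first LTR to the last LTR into one solo LTR.

    `locus` is a list of labels read left to right along the DNA.
    This is a cartoon of homologous recombination, not a real model.
    """
    ltr_positions = [i for i, element in enumerate(locus) if element == "LTR"]
    if len(ltr_positions) < 2:
        # Nothing to recombine: zero or one LTR present.
        return list(locus)
    first, last = ltr_positions[0], ltr_positions[-1]
    # Keep what lies outside the two LTRs and leave a single LTR between.
    return locus[:first] + ["LTR"] + locus[last + 1:]


# Hypothetical labels: the pseudogene is split in two by the integrated virus.
amy1_like = [
    "actin_pseudogene_left",
    "LTR", "gag", "pol", "env", "LTR",
    "actin_pseudogene_right",
    "amylase_gene",
]

amy2a_like = recombine_ltrs(amy1_like)
print(amy2a_like)
# ['actin_pseudogene_left', 'LTR', 'actin_pseudogene_right', 'amylase_gene']
```

The point of the sketch is only that the solo LTR sits exactly where the full virus used to be, still inside the pseudogene, which is the arrangement drawn for AMY2A.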

Now it is rather unlikely that a pseudogene would independently insert (5 times) in the same place in front of multiple amylase genes. Again, it is unlikely that a retrovirus would independently integrate (4 times) into the same place in multiple actin pseudogenes. The most likely explanation for the pattern we see here is that there were duplication events within the genome.

This figure shows a proposed sequence of events:

An actin pseudogene inserted in front of the original amylase gene. This new formation was duplicated within the genome. One of those copies is what we know as AMY2B. The other went through additional modifications. First was a deletion of part of the pseudogene (look at the first figure and you'll see that the pseudogene box is longest in AMY2B). Then the retrovirus inserted into the pseudogene and this formation was duplicated. In one case, the LTRs recombined to remove the viral genome, giving us AMY2A. In the other case there were two more duplications to give AMY1A/B/C. Using sequence differences between the five genes, estimates of the times of these events are given (mya = million years ago).

How do we know that the amylase gene did not start out with the actin pseudogene in front of it? Well, the pseudogene is not present in other animals such as rodents or New World Monkeys (i.e., those of the Americas). This is shown in the next figure:

Squirrel monkeys are New World Monkeys and do not make amylase in their saliva--only the pancreas--which is why it is being compared with human AMY2B. There is no sign of the actin pseudogene in front of their amylase gene.

Now it gets interesting: New World Monkeys lack the pseudogene and do not make amylase in their saliva; Old World Monkeys have the pseudogene (one copy has a deletion) and do make amylase in their saliva; Apes and Humans also have the pseudogene, as well as the retrovirus, and they both make amylase in their saliva. This is represented in the following figure:

Noted in the figure is a portion of the retrovirus which is sufficient to target gene expression to the salivary glands (this has been shown experimentally in mice). However, it cannot account for all salivary amylase production, because the Old World Monkeys also make amylase in their saliva.
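The comparison across the three groups can be written out as a small presence/absence table. The sketch below is just my own bookkeeping of what the text above says, with dictionary keys and a function name I made up. The entry saying Old World Monkeys lack the retrovirus is my reading of the text rather than something I am quoting from the papers.

```python
# Presence/absence of each feature, as described in the post.
lineages = {
    "New World Monkeys": {"pseudogene": False, "retrovirus": False, "salivary_amylase": False},
    "Old World Monkeys": {"pseudogene": True, "retrovirus": False, "salivary_amylase": True},
    "Apes and Humans": {"pseudogene": True, "retrovirus": True, "salivary_amylase": True},
}


def tracks_salivary_expression(feature):
    """Return True if the feature is present exactly where salivary amylase is."""
    return all(
        traits[feature] == traits["salivary_amylase"]
        for traits in lineages.values()
    )


for feature in ("pseudogene", "retrovirus"):
    print(feature, tracks_salivary_expression(feature))
# pseudogene True
# retrovirus False
```

With only three rows this shows a pattern, not a mechanism. It simply makes explicit why the retroviral element alone cannot be the whole story: one group makes salivary amylase without it.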

These data are most consistent with the following scenario: Ancestral primates (and other mammals) simply had an amylase gene that was expressed in the pancreas. After New World Monkeys and Old World Monkeys diverged, the gamma-actin pseudogene inserted in front of the amylase gene and a duplication occurred. In one of the copies a deletion occurred in the pseudogene. After the Apes and Old World Monkeys diverged, a retrovirus integrated into the pseudogene in Apes, leaving Humans and Apes with similar structure in their amylase genes. (Look carefully and you can map the information in figures 2 and 4 onto one another.)

Remember I said that some other mammals also make amylase in their saliva. This appears to have arisen independently of the primate case, because the regulatory DNA responsible is totally different.

There are several lessons in all of this. First, none of this makes sense outside of the context of common descent. Second, it is an example of gene duplication--the genealogy of genes within the genome--and of the ability of duplicated genes to take on slightly different functions as a result. (For another example, see here.) And third, it shows how pseudogenes and retroviruses can interact with and shape genomes, and that they are powerful markers of relationships.

Postscript: Just for fun, I went to the website that hosts the human genome, which is freely accessible. I was able to zoom in on the part of chromosome #1 that contains the amylase genes. On the left you can see a larger map of the chromosome. The amylase genes are on the upper arm of the chromosome, toward the middle (centromere). On the right you can see the designation of the amylase genes I discussed. If I were to zoom in closer, I could look at the actual DNA sequence. Interestingly, there is something called "AMYP1". This is an amylase pseudogene--i.e., a non-functional copy of one of the amylase genes, which means it is presumably "junk DNA".

[This is a cross-post from LDS Science Review.]


Jared, this is outstanding stuff. Thanks for taking the time to show it in detail to us. These non-coding, non-functional, "messy" accidental features have always seemed to me to be the strongest indicators that the human body arose from common descent, and was not simply created separately using similar functional parts as those found in other animals.

I have read of this concept, but never walked through such a detailed example before; so I thank you again. Coming face-to-face with such data is clearly an important step for anyone to recognize the need for a rethinking of traditional ideas. 

Posted by Christian Y. Cardall

5/25/2005 06:06:00 AM  

sir can u please mail me about 15 pages for production of amylase n pectinase.it's my school assignment n i m not getting it anywhere. so sir if u can mail it plz send it before 31 december. i'll always b grateful 2 u - NEHA SRIVASTAVA

Posted by NEHA

12/29/2005 10:32:00 PM  


