In a paper published in the Sept. 1 issue of the journal Nature, the Chimpanzee Sequencing and Analysis Consortium, which is supported in part by the National Human Genome Research Institute (NHGRI), one of the National Institutes of Health (NIH), describes its landmark analysis comparing the genome of the chimp (Pan troglodytes) with that of human (Homo sapiens).
"The sequencing of the chimp genome is a historic achievement that is destined to lead to many more exciting discoveries with implications for human health," said NHGRI Director Francis S. Collins, M.D., Ph.D. "As we build upon the foundation laid by the Human Genome Project, it's become clear that comparing the human genome with the genomes of other organisms is an enormously powerful tool for understanding our own biology."
The chimp sequence draft represents the first non-human primate genome and the fourth mammalian genome described in a major scientific publication. A draft of the human genome sequence was published in February 2001, a draft of the mouse genome sequence was published in December 2002 and a draft of the rat sequence was published in March 2004. The essentially complete human sequence was published in October 2004.
"As our closest living evolutionary relatives, chimpanzees are especially suited to teach us about ourselves," said the study's senior author, Robert Waterston, M.D., Ph.D., chair of the Department of Genome Sciences of the University of Washington School of Medicine in Seattle. "We still do not have in our hands the answer to a most fundamental question: What makes us human? But this genomic comparison dramatically narrows the search for the key biological differences between the species."
The 67 researchers who took part in the Chimpanzee Sequencing and Analysis Consortium share authorship of the Nature paper. Most of the work of sequencing and assembling the chimp genome was done at the Broad Institute of the Massachusetts Institute of Technology and Harvard University, Cambridge, Mass., and the Washington University School of Medicine in Saint Louis. In addition to those centers, the consortium included researchers from institutions elsewhere in the United States, as well as Israel, Italy, Germany and Spain.
The DNA used to sequence the chimp genome came from the blood of a male chimpanzee named Clint at the Yerkes National Primate Research Center in Atlanta. Clint died last year from heart failure at the relatively young age of 24, but two cell lines from the primate have been preserved at the Coriell Institute for Medical Research in Camden, N.J.
The consortium found that the chimp and human genomes are very similar and encode very similar proteins. The DNA sequence that can be directly compared between the two genomes is almost 99 percent identical. When DNA insertions and deletions are taken into account, humans and chimps still share 96 percent of their sequence. At the protein level, 29 percent of genes code for the same amino acid sequences in chimps and humans. In fact, the typical human protein has accumulated just one unique change since chimps and humans diverged from a common ancestor about 6 million years ago.
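The difference between the two similarity figures comes down to which positions are counted. The sketch below is only an illustration of that idea, not the consortium's method: it takes a made-up pairwise alignment (the sequences, the function name percent_identity and the count_gaps flag are all hypothetical) and computes identity once over the directly comparable positions and once with insertion and deletion positions counted as differences.

```python
def percent_identity(seq_a: str, seq_b: str, count_gaps: bool) -> float:
    """Percent of alignment columns in which the two sequences agree.

    seq_a and seq_b are the two rows of a pairwise alignment (equal length),
    with '-' marking a position affected by an insertion or deletion.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        has_gap = a == "-" or b == "-"
        if has_gap and not count_gaps:
            continue  # skip columns that cannot be compared base to base
        compared += 1
        if not has_gap and a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment, NOT real human or chimp sequence.
HUMAN_TOY = "ACGTACGTAC--GTACGTAC"
CHIMP_TOY = "ACGTACCTACGGGTAC-TAC"

# Identity over directly comparable positions only (substitutions).
substitution_only = percent_identity(HUMAN_TOY, CHIMP_TOY, count_gaps=False)
# Identity when insertion/deletion positions also count as differences.
with_indels = percent_identity(HUMAN_TOY, CHIMP_TOY, count_gaps=True)

print(f"comparable positions only: {substitution_only:.1f}%")
print(f"indels counted:            {with_indels:.1f}%")
```

On any alignment that contains gaps, the second figure is lower than the first, which mirrors the drop from almost 99 percent to 96 percent described above.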
To put this into perspective, the number of genetic differences between humans and chimps is roughly one-sixtieth of that seen between human and mouse and about one-tenth of that seen between mouse and rat. On the other hand, the number of genetic differences between a human and a chimp is about 10 times greater than between any two humans.
The researchers discovered that a few classes of genes are changing unusually quickly in both humans and chimpanzees compared with other mammals. These classes include genes involved in perception of sound, transmission of nerve signals, production of sperm and cellular transport of electrically charged molecules called ions. Researchers suspect the rapid evolution of these genes may have contributed to the special characteristics of primates, but further studies are needed to explore the possibilities.
The genomic analyses also showed that humans and chimps appear to have accumulated more potentially deleterious mutations in their genomes over the course of evolution than have mice, rats and other rodents. While such mutations can cause diseases that may erode a species' overall fitness, they may have also made primates more adaptable to rapid environmental changes and enabled them to achieve unique evolutionary adaptations, researchers said.
Despite the many similarities found between human and chimp genomes, the researchers emphasized that important differences exist between the two species. About 35 million DNA base pairs differ between the shared portions of the two genomes, each of which, like most mammalian genomes, contains about 3 billion base pairs. In addition, there are another 5 million sites that differ because of an insertion or deletion in one of the lineages, along with a much smaller number of chromosomal rearrangements. Most of these differences lie in what is believed to be DNA of little or no function. However, as many as 3 million of the differences may lie in crucial protein-coding genes or other functional areas of the genome.
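These counts can be checked against the similarity figures with simple arithmetic. The sketch below is a back-of-the-envelope calculation only: it uses the rounded numbers quoted in this release, treats the full 3 billion base pairs as the comparable portion (an assumption, since the shared portion is somewhat smaller), and assumes the 3 million possibly functional differences are counted out of the combined substitution and insertion/deletion sites. The variable names are illustrative.

```python
# Rounded figures quoted in the release.
GENOME_SIZE_BP = 3_000_000_000      # approximate size of each genome
SUBSTITUTIONS = 35_000_000          # single base pairs that differ
INDEL_SITES = 5_000_000             # insertion/deletion sites
POSSIBLY_FUNCTIONAL = 3_000_000     # upper bound on functional differences

# Assumption: the whole genome is treated as directly comparable.
substitution_rate = SUBSTITUTIONS / GENOME_SIZE_BP
identity_percent = 100.0 * (1.0 - substitution_rate)

# Assumption: the functional differences are a subset of all differing sites.
total_sites = SUBSTITUTIONS + INDEL_SITES
functional_share = POSSIBLY_FUNCTIONAL / total_sites

print(f"substitution divergence: {100.0 * substitution_rate:.2f}%")
print(f"implied identity:        {identity_percent:.2f}%")
print(f"possibly functional:     {100.0 * functional_share:.1f}% of differing sites")
```

Under these assumptions the substitution divergence comes out a little above 1 percent, consistent with the "almost 99 percent identical" figure, and at most about one differing site in thirteen would fall in a functional region.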
"As the sequences of other mammals and primates emerge in the next couple of years, we will be able to determine what DNA sequence changes are specific to the human lineage. The genetic changes that distinguish humans from chimps will likely be a very small fraction of this set," said the study's lead author, Tarjei S. Mikkelsen of the Broad Institute of MIT and Harvard. Among the genetic changes that researchers will be looking for are those that may be related to the human-specific features of walking upright on two feet, a greatly enlarged brain and complex language skills.
Although the statistical signals are relatively weak, a few classes of genes appear to be evolving more rapidly in humans than in chimps. The single strongest outlier involves genes that code for transcription factors, which are molecules that regulate the activity of other genes and that play key roles in embryonic development.
A small number of other genes have undergone even more dramatic changes. More than 50 genes present in the human genome are missing or partially deleted from the chimp genome. The corresponding number of gene deletions in the human genome is not yet precisely known. For genes with known functions, potential implications of these changes can already be discerned.
For example, the researchers found that three key genes involved in inflammation appear to be deleted in the chimp genome, possibly explaining some of the known differences between chimps and humans in respect to immune and inflammatory response. On the other hand, humans appear to have lost the function of the caspase-12 gene, which produces an enzyme that may help protect other animals against Alzheimer's disease.
"This represents just the tip of the iceberg when it comes to exploring the genomic roots of our biological differences," said one of the study's co-authors, LaDeana W. Hillier of the Genome Sequencing Center at Washington University School of Medicine. "As more is learned about other functional elements of the genome, we anticipate that other important differences outside of the protein-coding genes will emerge."
Armed with the chimp sequence, researchers also scanned the entire human genome for deviations from normal mutation patterns. Such deviations may reveal regions of "selective sweeps," which occur when a mutation arises in a population and is so advantageous that it spreads throughout the population within a few hundred generations and eventually becomes "normal."
The researchers found six regions in the human genome that have strong signatures of selective sweeps over the past 250,000 years. One region contains more than 50 genes, while another contains no known genes and lies in an area that scientists refer to as a "gene desert." Intriguingly, this gene desert may contain elements regulating the expression of a nearby protocadherin gene, which has been implicated in patterning of the nervous system. A seventh region with moderately strong signals contains the FOXP2 and CFTR genes. FOXP2 has been implicated in the acquisition of speech in humans. CFTR, which codes for a protein involved in ion transport and, if mutated, can cause the fatal disease cystic fibrosis, is thought to be the target of positive selection in European populations.
The chimp and human genome sequences, along with those of a wide range of other organisms such as mouse, honey bee, roundworm and yeast, can be accessed through the following public genome browsers: GenBank (www.ncbi.nih.gov/Genbank) at NIH's National Center for Biotechnology Information (NCBI); the UCSC Genome Browser (www.genome.ucsc.edu) at the University of California at Santa Cruz; the Ensembl Genome Browser (www.ensembl.org) at the Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute; the DNA Data Bank of Japan (www.ddbj.nig.ac.jp); and EMBL-Bank (www.ebi.ac.uk/embl/index.html) at the European Molecular Biology Laboratory's Nucleotide Sequence Database.
NHGRI is one of 27 institutes and centers at the NIH, an agency of the Department of Health and Human Services. The NHGRI Division of Extramural Research supports grants for research and for training and career development at sites nationwide. Additional information about NHGRI can be found at its Web site, www.genome.gov.
Journal
Nature