The map has 2,100 unique landmarks - three times as many as any previous X chromosome map. If it were a road map from St. Louis to San Francisco, it would show a marker every mile.
The researchers also located hot spots for genes and detected a large region where the DNA remains intact as it passes from one generation to another.
The map is speeding up the search for disease genes on X, which is associated with many inherited disorders. "And the completion of a map with this level of detail has made X one of the earliest chromosomes for DNA sequencing - the next phase of the Human Genome Project," says David Schlessinger, Ph.D., director of the Center for Genetics in Medicine and principal investigator for the X project. Ramaiah Nagaraja, Ph.D., research instructor in molecular microbiology, is the paper's lead author.
Chromosome X determines gender - women have two copies and men have one X and one Y. X's DNA is one long double helix - 160 million nucleotide base pairs. On average, the new map has a landmark every 75,000 base pairs. The national goal for chromosome mapping is one landmark every 100,000 base pairs.
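The spacing figure can be checked with simple arithmetic. The short sketch below uses only the numbers quoted in the article and assumes, for illustration, that the landmarks are spread evenly along the chromosome; the constant and function names are made up for this example. The result comes out near 76,000 base pairs, consistent with the rounded figure of 75,000.

```python
# Figures quoted in the article (approximate).
CHROMOSOME_LENGTH_BP = 160_000_000  # length of chromosome X in base pairs
LANDMARK_COUNT = 2_100              # unique landmarks on the new map
GOAL_SPACING_BP = 100_000           # stated national goal: one landmark per 100,000 bp


def average_spacing(length_bp: int, landmarks: int) -> float:
    """Mean distance between landmarks, assuming they are evenly spread."""
    return length_bp / landmarks


spacing = average_spacing(CHROMOSOME_LENGTH_BP, LANDMARK_COUNT)
meets_goal = spacing <= GOAL_SPACING_BP

print(f"Average spacing: {spacing:,.0f} base pairs")
print(f"Denser than the national goal: {meets_goal}")
```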
Whereas someone mapping a road could drive along the route and record landmarks in sequence, the researchers had a much more difficult task. They started with more than 5,000 fragments from seven different libraries of human DNA. They then identified unique landmarks on the fragments. If two pieces contained the same landmark, they knew these fragments must overlap. By painstakingly aligning all the pieces of DNA, they mapped the entire length of X.
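The overlap rule described above (two fragments that share a landmark must overlap) can be sketched in a few lines of Python. This is a toy illustration, not the researchers' software: the fragment and landmark names are hypothetical, and grouping with a union-find structure is simply one convenient way to collect fragments that are linked by chains of overlaps.

```python
from itertools import combinations


def find_overlaps(fragments):
    """Return pairs of fragment names that share at least one landmark."""
    pairs = []
    for a, b in combinations(sorted(fragments), 2):
        if fragments[a] & fragments[b]:  # non-empty set intersection
            pairs.append((a, b))
    return pairs


def group_into_contigs(fragments):
    """Group fragments connected by chains of overlaps (simple union-find)."""
    parent = {name: name for name in fragments}

    def find(x):
        # Follow parent links to the group's representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in find_overlaps(fragments):
        parent[find(a)] = find(b)

    groups = {}
    for name in fragments:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical fragments, each listed with the landmarks detected on it.
fragments = {
    "fragA": {"lm1", "lm2"},
    "fragB": {"lm2", "lm3"},
    "fragC": {"lm3", "lm4"},
    "fragD": {"lm9"},
}

overlaps = find_overlaps(fragments)
contigs = group_into_contigs(fragments)
print(overlaps)
print(contigs)
```

In this toy data set, fragA, fragB and fragC chain together through shared landmarks, while fragD stands alone until a fragment bridging it to the others is found.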
A method for cloning large pieces of DNA made this jigsaw puzzle manageable. In the 1980s, David T. Burke, then a graduate student in the laboratory of Maynard V. Olson, Ph.D., invented the yeast artificial chromosome or YAC. This construct contains a segment of, say, human DNA and structures that make it behave like a yeast chromosome. As yeast cells divide, they copy the artificial chromosome over and over, generating sufficient DNA for analysis. "Because each YAC can contain hundreds of thousands of base pairs, a reasonable number of YACs fit along a chromosome," says Schlessinger, who also is a professor of molecular microbiology, genetics and medicine. "Before YACs, we could clone less than one-tenth as much DNA in a single piece."
Finding features that could act as landmarks was another key development. In 1990, Olson and Eric D. Green, then an M.D./Ph.D. student, unveiled a strategy to use the polymerase chain reaction (PCR) - an enzymatic method for copying specific DNA sequences - to locate short, unique segments within YACs. These snippets of about 300 base pairs - called sequence-tagged sites (STSs) - could act as landmarks on chromosome maps the way highway exits and rest areas punctuate road maps, the researchers reasoned.
"You get small fragments of X and sequence them to find out which are unique," Schlessinger says. "Then you use these STSs as primers for PCR so you can determine which YACs contain that sequence. The cleverness of this system is that it automatically gives you the landmarks and the map at the same time."
Rather than conducting a random search for YACs that contained the same STS, Schlessinger and colleagues used a technique called chromosome walking to systematically work their way along the DNA and align sequential fragments. "We started out with a selection of YACs and made STSs from the ends of those clones," Schlessinger explains. "Then we kept screening all the other clones to find the next one. More than 1,500 screenings were required."
They had to develop new software to order and store this vast amount of data. Philip P. Green, Ph.D., devised several programs, including SEGMAP, which has proved particularly valuable. "Every night, SEGMAP regenerated our map based on the day's new information," Schlessinger explains. "It showed us the order of the YACs and their markers, telling us which data were ambiguous and which were certain."
The project's completion has permitted the first comparison between a physical map and a genetic map of a chromosome. Genetic maps are constructed by studying the passage of traits from one generation to another. The closer two genes are on a chromosome, the less likely they are to get separated as chromosomes swap genetic material during egg and sperm formation. Distances on genetic maps can differ greatly from those on physical maps, however, because some regions of chromosomes recombine more often than others.
The genetic map of X has a few hundred markers. When the researchers compared it with their physical map of X, they found a short area in the middle of the genetic map that corresponds to a much longer stretch - 17 million base pairs - of the physical map. "So this region is uneventful on the genetic map, whereas it contains a whole bunch of markers on the physical map," Schlessinger says. "But we don't know why the X chromosome should have this large area of poor recombination."
He speculates that the answer may involve the X inactivation locus, which in women turns off most of the genes on one copy of X, leaving the other to direct biological activities. The region of low recombination is on X's long arm, beginning near the X inactivation locus and ending at a distinctive region that also is seen on the Y chromosome.
The researchers were able to determine how the chemical composition of X varies along the chromosome because the 2,100 STSs provide a representative sample of X's DNA. Four types of nucleotides form the building blocks of DNA - A, C, G and T - and any sample contains as much A as T and as much C as G. For some unknown reason, regions that are rich in genes have a higher G+C content than noncoding regions. "We found a region near the end of the long arm that is very rich in G+C," Schlessinger says. "Four other regions also had a high percentage. So the map has given us an early estimate of the relative density of genes across the chromosome."
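The G+C calculation behind this kind of estimate is straightforward. The sketch below shows how sampled sequences could be turned into a per-region G+C profile; the region labels and sequences are invented for illustration, and averaging the per-sequence fractions is an assumption of this example rather than the method reported in the paper.

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of G and C among the A, C, G and T bases in a sequence."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    gc_count = sum(1 for base in bases if base in "GC")
    return gc_count / len(bases)


def gc_by_region(sts_records):
    """Average G+C fraction of the sampled sequences in each region.

    sts_records is a list of (region_label, sequence) pairs.
    """
    per_region = {}
    for region, sequence in sts_records:
        per_region.setdefault(region, []).append(gc_fraction(sequence))
    return {region: sum(vals) / len(vals) for region, vals in per_region.items()}


# Invented sample data: two regions, two short sequences each.
records = [
    ("region_1", "ATATGC"),
    ("region_1", "AATTAT"),
    ("region_2", "GGCCAT"),
    ("region_2", "GCGCGC"),
]

profile = gc_by_region(records)
for region, fraction in sorted(profile.items()):
    print(f"{region}: {fraction:.1%} G+C")
```

Because gene-rich regions tend to have higher G+C content, a region with a higher fraction in such a profile is a candidate for higher gene density.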
The project also enabled Schlessinger and colleagues to locate several disease genes as YACs containing the relevant regions of X became available. They found the gene for an overgrowth disorder called Simpson-Golabi-Behmel syndrome and a gene for ectodermal dysplasia, which impairs the development of hair follicles, teeth and sweat glands. They also were part of an international team that tracked down the gene for fragile X syndrome, the second most common cause of mental retardation. And they have mapped and are analyzing genes that prematurely halt ovarian function.
The X project began in 1987 after the invention of the YAC raised the possibility of large-scale human genome mapping. This prospect prompted the James S. McDonnell Foundation to establish the Center for Genetics in Medicine with a $1.8 million grant. "The foundation funding supported the pilot studies that proved our mapping techniques would work," Schlessinger says.
In 1990, the National Center for Human Genome Research (NCHGR) made the facility one of the first four federally funded genome centers. NCHGR has supported the center with two consecutive four-year grants totaling $27.7 million. The grants financed the organization of DNA libraries and YACs, fostered technological developments that are widely used around the world, and funded the mapping of both X and chromosome 7. Eric Green began a high-resolution map of 7 at Washington University and completed it at the National Institutes of Health, but the data are not yet published.
Schlessinger's associates at Applied Biosystems Division, Perkin-Elmer, in San Francisco and at Washington University's Genome Sequencing Center now are sequencing portions of chromosome X, using materials and markers from the mapping project. "This will determine the entire nucleotide sequence of X and locate all the genes along the chromosome," Schlessinger says.
Nagaraja R, MacMillan S, Kere J, Jones C, Cox S, Schmatz M, Terrell J, Shomaker M, Jermak C, Hott C, Masisi M, Mumm S, Srivastava A, Pilia G, Featherstone T, Mazzarella R, Kesterson S, McCauley B, Railey B, Burough F, Nowotny V, D'Urso M, States D, Brownstein B, Schlessinger D. 1997. X chromosome map at 75 Kb STS resolution, revealing extremes of recombination and GC content. Genome Research, 7(3): 210-222.