Feature Story | 31-Jul-2002

The color of genomes

DOE/Pacific Northwest National Laboratory

New visualization techniques developed by scientists at Pacific Northwest National Laboratory allow researchers to compare and analyze genomes using a powerful tool that computers cannot replace--the human brain.

A color-coded graphic representation of genomes lets people easily identify similarities and differences. This approach may help identify individual genes responsible for certain properties and characteristics.

"Our visualization method allows us to look at whole genomes, while most methods look at just part," said Pak Chung Wong, who leads the project. While developing the technique, Wong and his colleagues Harlan Foote, Kwong Kwok Wong and Jim Thomas compared different strains of bacteria to determine what made them unique.

The researchers began by assigning a color to each of the four nucleotides in DNA--adenine, cytosine, guanine and thymine. They created a graphic representation of the genome where each colored pixel represented a single nucleotide in the sequence.
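That first step, one colored pixel per nucleotide, can be sketched in a few lines of Python. This is only an illustration: the release does not say which colors the team chose, so the palette, the gray fallback for unrecognized symbols and the function name `sequence_to_pixels` are all assumptions.

```python
# Hypothetical palette: the article does not state the actual colors used.
NUCLEOTIDE_COLORS = {
    "A": (0, 200, 0),    # adenine  -> green (assumed)
    "C": (0, 0, 255),    # cytosine -> blue (assumed)
    "G": (255, 200, 0),  # guanine  -> yellow (assumed)
    "T": (255, 0, 0),    # thymine  -> red (assumed)
}
UNKNOWN_COLOR = (128, 128, 128)  # gray for any other symbol (assumed)


def sequence_to_pixels(sequence):
    """Return one RGB tuple per nucleotide, in sequence order."""
    return [NUCLEOTIDE_COLORS.get(base, UNKNOWN_COLOR) for base in sequence.upper()]


pixels = sequence_to_pixels("ACGTTGCA")
print(len(pixels), pixels[0])
```

Each entry in `pixels` corresponds to exactly one position in the genome, which is why a single-nucleotide change shows up as a single changed pixel.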

"It didn't work," Wong said. He explained that individual nucleotides change so frequently that there was too much "noise" to recognize patterns.

So, Wong and his colleagues began taking steps to reduce the noise, such as arranging the data in different curves. "The biological community has been mapping genomes from left to right because we read that way," Wong said. "The curves we use fold a one-dimensional genome sequence into a two-dimensional image and allow neighboring nucleotides to appear near each other."
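To give a sense of how a curve can fold a one-dimensional sequence into a two-dimensional image, here is a minimal sketch using a Hilbert curve. The release does not name the curves the team used, so the Hilbert curve is an assumption chosen because consecutive positions along it always land in adjacent cells. The function names `hilbert_d2xy` and `fold_sequence` are hypothetical.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order square grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so the sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def fold_sequence(sequence, order):
    """Place each item of a 1-D sequence on a 2-D grid along the curve.

    Cells beyond the end of the sequence stay None.
    """
    side = 1 << order
    if len(sequence) > side * side:
        raise ValueError("sequence does not fit on a grid of this order")
    grid = [[None] * side for _ in range(side)]
    for index, item in enumerate(sequence):
        x, y = hilbert_d2xy(order, index)
        grid[y][x] = item
    return grid


for row in fold_sequence("ACGTACGTACGTACGT", order=2):
    print(" ".join(row))
```

Compared with wrapping the sequence row by row, a layout like this keeps stretches of neighboring nucleotides together as compact patches rather than long thin lines.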

The researchers also apply other digital image processing techniques that make distinguishing features more apparent. For example, they apply smoothing filters to the image and adjust its color, contrast and intensity.
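As a rough illustration of what a smoothing filter does to such an image, the sketch below averages each pixel with its immediate neighbors. The release does not specify which filters the researchers used, so the simple box (mean) filter, the single-channel image and the function name `box_smooth` are assumptions.

```python
def box_smooth(image, radius=1):
    """Return a copy of a 2-D intensity grid with each cell replaced by the
    mean of its neighborhood (cells near the edge use a smaller neighborhood)."""
    height = len(image)
    width = len(image[0])
    smoothed = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            total = 0.0
            count = 0
            for rr in range(max(0, r - radius), min(height, r + radius + 1)):
                for cc in range(max(0, c - radius), min(width, c + radius + 1)):
                    total += image[rr][cc]
                    count += 1
            smoothed[r][c] = total / count
    return smoothed


# A single bright pixel (isolated "noise") is spread out and dimmed.
noisy = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
for row in box_smooth(noisy):
    print(row)
```

Averaging like this suppresses isolated single-pixel changes while leaving larger uniform regions intact, which is the kind of noise reduction the article describes.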

The technique is also useful for comparing multiple genomes. People can easily spot interesting areas and then use other methods to study them in more detail. By contrast, purely computational methods require each data set to be compared fully against every other data set, an expensive approach that generates pages and pages of data.

Beyond genomes, the researchers have shown how their methods could be applied to other sequential data, such as brain wave data from an electroencephalogram, or EEG.

A patent is pending on the visualization technique, and PNNL is interested in identifying a company to develop user interfaces and a software product based on this approach.

###
