News Release

Making sense of the genome

Peer-Reviewed Publication

BMC (BioMed Central)

Almost every week we hear of a new genome sequence being completed, yet turning sequence information into knowledge about what individual genes do remains very difficult. An article published in Journal of Biology this week should simplify this task: it describes a new online tool that dramatically improves predictions of how individual genes are regulated.

Dr. Wyeth Wasserman and his team have created a powerful new two-step method for identifying which regulators of gene expression, called transcription factors, control individual genes. The new method is far more selective than its predecessors, reducing the number of biologically irrelevant transcription factors identified in a search by 85%. The researchers have now made the tool available through an easy-to-use website called ConSite.

This web-based tool will be particularly helpful in analysing genes whose coding sequences give no clues as to their function. Around 30% of the predicted human genes contain no recognisable domains. By knowing which transcription factors control the expression of a particular gene, scientists can get an idea of which processes the gene is involved in. This is because transcription factors are themselves tightly controlled to ensure that a gene is only expressed when and where it is needed, and a great deal is already known about which events activate which transcription factors.

"Knowledge of the identity of a mediating transcription factor can give important insights into the function of a gene," according to the authors of the article.

Transcription factors act by binding to specific sequences in a regulatory region located in the DNA upstream of the coding region, but they can tolerate a large amount of variation in these sequences. As a result, searching an upstream regulatory region for transcription factor binding sites identifies a large number of candidate sites, most of which are biologically irrelevant.

The researchers successfully increased the signal-to-noise ratio of such searches by using a powerful combination of two methods. Firstly, the regulatory sequences are scanned for binding sites, but only for those that are known to be biologically active. For this comparison, Wasserman's team compiled a searchable database of 108 transcription factor binding profiles from the relevant literature. The sites listed originate from mammals, insects and nematodes, and all are supported by good experimental evidence. These experiments provide essential information about the in vivo properties necessary for binding that are not contained in the sequence alone.
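
As a rough illustration of what scanning a sequence with a binding profile can look like, the following Python sketch scores every window of a sequence against a small position weight matrix and keeps the windows above a relative score cut-off. It is a minimal sketch, not the ConSite implementation: the count matrix, the function names, the pseudocount and the 0.8 cut-off are all assumptions made up for this example.

```python
import math

# Hypothetical count matrix for a 4-bp site (one dict per position).
# The counts are invented for illustration; this is not a ConSite profile.
TOY_COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]


def to_log_odds(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, min_relative_score=0.8):
    """Return (start, site, relative_score) for each window scoring at or above the cut-off."""
    width = len(matrix)
    best = sum(max(column.values()) for column in matrix)
    worst = sum(min(column.values()) for column in matrix)
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(column[base] for column, base in zip(matrix, window))
        relative = (score - worst) / (best - worst)
        if relative >= min_relative_score:
            hits.append((start, window, round(relative, 3)))
    return hits


if __name__ == "__main__":
    profile = to_log_odds(TOY_COUNTS)
    for hit in scan("TTACGTGGACGAAA", profile):
        print(hit)
```

Lowering the cut-off in this toy example also reports the near-match ACGA, which shows on a small scale why tolerant binding sites produce many candidate hits in a single sequence.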

Secondly, the researchers use an alignment tool to compare the regulatory sequences of the same gene from two different species, and check which sites are conserved across evolution. "The most valuable information in the search for regulatory regions in genomic sequences is conservation. If a region is found to be conserved between a human genomic sequence and an orthologous genomic sequence from a distantly related organism, it is extremely likely to have a biological role," write the authors.
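
The conservation step can be pictured with a second short Python sketch. It assumes the two orthologous regulatory sequences have already been aligned (gaps written as "-"), marks alignment columns that fall inside windows of high identity, and keeps only the candidate sites that lie entirely within those columns. The function names, the window size and the identity threshold are hypothetical choices for this example, and the release does not specify which alignment tool or thresholds ConSite uses.

```python
def conserved_columns(aligned_a, aligned_b, window=10, min_identity=0.7):
    """Return the set of alignment columns covered by at least one conserved window."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # A column counts as a match only if both sequences carry the same base (no gap).
    matches = [
        a == b and a != "-"
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    ]
    conserved = set()
    for start in range(len(matches) - window + 1):
        identity = sum(matches[start:start + window]) / window
        if identity >= min_identity:
            conserved.update(range(start, start + window))
    return conserved


def column_of_each_base(aligned):
    """Map each ungapped sequence position to its column in the alignment."""
    return [column for column, char in enumerate(aligned) if char != "-"]


def keep_conserved_hits(hits, aligned_a, conserved):
    """Keep (start, width) hits on sequence A whose bases all fall in conserved columns."""
    columns = column_of_each_base(aligned_a)
    kept = []
    for start, width in hits:
        site_columns = columns[start:start + width]
        if len(site_columns) == width and all(c in conserved for c in site_columns):
            kept.append((start, width))
    return kept


if __name__ == "__main__":
    # Toy alignment: the first ten columns are identical, the rest diverge.
    seq_a = "ACGTACGTACTTTTTTTTTT"
    seq_b = "ACGTACGTACGGGGGGGGGG"
    conserved = conserved_columns(seq_a, seq_b, window=5, min_identity=0.8)
    candidate_hits = [(2, 4), (12, 4)]  # (start, width) in sequence A coordinates
    print(keep_conserved_hits(candidate_hits, seq_a, conserved))
```

In this toy case only the candidate site in the identical stretch survives, which is the filtering effect the two-species comparison is meant to provide.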

To test their two-step method, the researchers used it to identify the transcription factors that bind to the upstream regulatory regions of 14 well-studied genes. Using human and mouse sequences, the researchers found that all of the transcription factors identified did have a biological role and only a few of the physiologically relevant regulators were missed. A second test showed that the evolutionary distance between the two input sequences was vital in determining the effectiveness of the combined method.

This tool is now available to all scientists free of charge via the ConSite website (http://www.phylofoot.org/). Any scientist with a gene of interest will be able to enter its regulatory sequence into the ConSite tool, with or without the regulatory sequence of an orthologous gene, and will receive a list of probable regulators.

###

This article is available under embargo to members of the press at: http://jbiol.com/press

Upon publication on 22nd May at 13:00 GMT, this article will be freely available online, in accordance with BioMed Central's policy of open access to research articles: http://jbiol.com/content/2/2/13

Identification of conserved regulatory elements by comparative genome analysis. Boris Lenhard, Albin Sandelin, Luis Mendoza, Pär Engström, Niclas Jareborg, and Wyeth W Wasserman. Journal of Biology 2003, 2:13 (published 22 May 2003).

Please publish the URL in any news report so that your readers will be able to read the original paper.

For further information about this research, contact one of the authors, Dr. Wyeth Wasserman, at wyeth@cmmt.ubc.ca.

Alternatively, contact Gemma Bradley by email at press@biomedcentral.com or by phone on 44-207-323-0323 x2331.

Journal of Biology (http://jbiol.com) is published by BioMed Central (http://www.biomedcentral.com), an independent online publishing house committed to providing immediate free access to peer-reviewed biological and medical research. This commitment is based on the view that open access to research is essential to the rapid and efficient communication of science. In addition to open-access original research, BioMed Central also publishes reviews and other subscription-based content.

