Scientists from the Structural Computational Biology Group at the Spanish National Cancer Research Centre (CNIO), led by Alfonso Valencia, together with French and American researchers, have recently published two articles in the journal Nucleic Acids Research (NAR) that introduce two new databases for studying the human genome.
Eukaryotic organisms can generate several proteins from the information contained in a single gene. This is possible partly thanks to alternative splicing, a process that selectively joins some exons (the regions of genes that code for proteins) and not others, in order to produce the proteins needed at each moment.
The articles published by Valencia's team study the transfer of this information, which is carried by the RNAs, the intermediary molecules between genes and proteins. The results will be used to understand the genome, the way it functions and the role of some of its variants in the origin of human illnesses such as cancer.
An illustrative example of the relationship between RNAs and illness is chronic lymphocytic leukemia. Researchers from the Chronic Lymphocytic Leukemia Spanish Consortium (CLL-ICGC), of which Valencia's team forms part, have observed an accumulation of mutations in the genes responsible for the splicing process. These observations suggest that alterations in these mechanisms might be the cause of the disease.
COMPILING FUNCTIONAL DATA
One of the articles published in NAR makes the APPRIS database available to the public. APPRIS includes an integrated computational system that identifies the alternative splicing protein variants that are most relevant for cells.
This new database has brought together functional variants of 85% of the human genome, which "turns it into a powerful tool for analysing specific mutations in protein variants related to illness", says Michael Tress, the lead author of the article.
The APPRIS system is part of the ENCODE international project, in which more than 400 scientists from 32 laboratories in the UK, the US, Singapore, Japan, Switzerland and Spain have taken part.
A CATALOGUE OF MORE THAN 16,000 CHIMERAS
The process of splicing becomes more complex with chimeric RNAs, which are produced by the joining of exons from different genes. The second article published in NAR describes the ChiTaRS database, which brings together more than 16,000 chimeric RNAs from humans, mice and the fruit fly Drosophila melanogaster.
Furthermore, ChiTaRS relates some of these chimeric RNAs to chromosome alterations that are present in different types of cancer.
The entries in ChiTaRS are incorporated into the universal UniProt Knowledgebase system (UniProtKB), which contains a broad catalogue of information on proteins from laboratories around the world.
"Chimeric RNAs and proteins have become a powerful tool for researchers over the past few years, as they can be used as new cancer markers, as well as possible targets for the generation of new drugs," says Milana Frenkel-Morgenstern, the first author of the study. This new catalogue will also help to further the understanding of the evolution of chimeric RNAs in eukaryotes and their functions in organisms.
Reference articles:
APPRIS: annotation of principal and alternative splice isoforms. Jose Manuel Rodriguez, Paolo Maietta, Iakes Ezkurdia, Alessandro Pietrelli, Jan-Jaap Wesselink, Gonzalo Lopez, Alfonso Valencia, Michael L. Tress. doi:10.1093/nar/gks1058.
ChiTaRS: a database of human, mouse and fruit fly chimeric transcripts and RNA-sequencing data. Milana Frenkel-Morgenstern, Alessandro Gorohovski, Vincent Lacroix, Mark Rogers, Kristina Ibanez, Cesar Boullosa, Eduardo Andres Leon, Asa Ben-Hur, Alfonso Valencia. doi:10.1093/nar/gks1041.
Journal
Nucleic Acids Research