
New Software Improves Accuracy Of Amino Acid Sequence Identification

Ohio University

ATHENS, Ohio -- Researchers at Ohio University have developed computer software that identifies sequences of amino acids in proteins more accurately than current identification software programs. The software could aid scientists working to isolate genes in the body, a process that includes identifying proteins by their amino acid sequences.

One of the problems with current sequence identifiers is that they have a misidentification rate of between 13 and 21 percent. To verify results, scientists must reconfigure the amino acid sequence by hand, a time-consuming process.

The new software, developed by researchers in the Center for Intelligent Chemical Instrumentation and the Center for Research and Technology at Ohio University, is at least twice as accurate as conventional methods, said Peter Harrington, associate professor of chemistry and an inventor of the software.

"Our main goal was to reduce the misidentification rate because that would minimize the amount of time it takes to identify an amino acid sequence," Harrington said.

The software is an "expert system" -- a computer program that combines human intelligence with an automated system. Harrington interviewed biochemists to find out how they manually determine the sequence of amino acids in a protein and how they account for problems that can lead to misidentification.

"We took the same rules they apply and encoded them into the software program," Harrington said. "The result is a program that appears to have a lower misclassification rate than any other software."

Proteins in the body are made up of thousands of amino acids, and the sequence of the amino acids determines the protein type. Currently, scientists identify proteins by isolating smaller amino acid sequences within a protein. To do this, they chemically remove the last amino acid in a chain of about 20 and analyze it using a special instrument. Amino acids travel through this instrument at different speeds, and software in the instrument identifies the amino acids, and from them the sequence, based on their different migration times.

Misidentification can occur when two different amino acids travel at close speeds or when the chemical cleavage of the last amino acid is incomplete. Humans examining these results could see the mistake, but current computer software programs are not designed to detect these errors, Harrington said.
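The first failure mode described above, two amino acids traveling at close speeds, can be illustrated with a minimal Python sketch. This is not the Ohio University software. The reference migration times, the `margin` threshold and the function name `identify_peak` are hypothetical, chosen only for illustration. The sketch matches an observed migration time to the nearest reference and flags the call as ambiguous when a second amino acid lies almost as close.

```python
# Hypothetical reference migration times in minutes (illustrative values only,
# not taken from any real instrument).
REFERENCE_TIMES = {
    "Asp": 5.0,
    "Glu": 6.2,
    "Ser": 7.1,
    "Thr": 7.3,  # deliberately close to Ser to show an ambiguous case
    "Gly": 8.4,
}


def identify_peak(observed_time, references=REFERENCE_TIMES, margin=0.3):
    """Return (best_match, is_ambiguous) for one observed migration time.

    The call is flagged ambiguous when the runner-up reference is less than
    `margin` minutes farther from the observation than the best match.
    """
    ranked = sorted(references, key=lambda name: abs(references[name] - observed_time))
    best, runner_up = ranked[0], ranked[1]
    gap = abs(references[runner_up] - observed_time) - abs(references[best] - observed_time)
    return best, gap < margin


print(identify_peak(5.1))   # clear call: Asp has no close neighbor
print(identify_peak(7.12))  # Ser is nearest, but Thr is close enough to flag
```

A purely numerical identifier would report only the nearest match. The ambiguity flag marks the cases a human reviewer would want to inspect.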

"The current software is based on numerical calculations," Harrington said. "What we tried to do was to make the identifications the way a human does visually, as opposed to the way a computer does numerically. By combining the two approaches, we reduced misidentification."
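The idea of pairing a numerical step with a human-style rule can be sketched in Python as follows. This is an assumption-laden illustration, not the researchers' rule set: the peak heights, the `carryover_ratio` threshold and the function `call_sequence` are all hypothetical. The numerical step picks the tallest peak in each cycle. The rule step then checks whether that peak merely repeats the previous call, which could happen if the previous amino acid was not fully cleaved, and prefers a strong second peak in that case.

```python
def call_sequence(cycles, carryover_ratio=0.6):
    """Call one amino acid per cycle from {amino_acid: peak_height} dicts.

    Numerical step: pick the tallest peak.
    Rule step (hypothetical): if the tallest peak repeats the previous call
    and another peak reaches at least `carryover_ratio` of its height, treat
    the repeat as carryover from incomplete cleavage and pick the other peak.
    """
    calls = []
    for peaks in cycles:
        ranked = sorted(peaks, key=peaks.get, reverse=True)
        choice = ranked[0]
        if calls and choice == calls[-1] and len(ranked) > 1:
            second = ranked[1]
            if peaks[second] >= carryover_ratio * peaks[choice]:
                choice = second
        calls.append(choice)
    return calls


# Made-up peak heights: in cycle 2, leftover Ala is still the tallest peak.
cycles = [
    {"Ala": 10, "Gly": 2},
    {"Ala": 6, "Gly": 5},
    {"Gly": 3, "Ser": 8},
]
print(call_sequence(cycles))  # ['Ala', 'Gly', 'Ser']
```

Without the rule step, the second cycle would be called as Ala again. A genuine repeat, where no second peak comes close, is still called as a repeat.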

The next step will be to program the software to read more data and to process different types of information about amino acids, which should improve its accuracy further, Harrington said.

The research was published in a recent issue of the journal Computer Applications in the Biosciences. Other researchers involved in the project include Peter Johnson, professor of chemistry, Elaine Saulinskas, a research technician in chemistry, and Lijuan Hu, a graduate student in chemistry, all at Ohio University.

Contact: Peter Harrington, 614-593-2099. Written by Kelli Whitlock, 614-593-0383.

