A Tipping Point for Function Prediction

There comes a tipping point in systems-biology studies of gene function where knowing some genes’ functions can, using a computational approach, help home in on the functions of other genes. That point has already been reached for yeast and C. elegans but is just now being reached for systems where functional information is sparser, such as plants and humans.


This functional network of Arabidopsis genes shows the top 10% of the functional links identified by AraNet. Each line represents the connection between two genes and is colored to reflect the likelihood score for a relationship between the paired genes’ functions: Red means a high score, blue is low. For example, the red area in the middle top of the figure represents the ribosomal complex, while the large blue cluster to the right represents the phosphatases, which have a weak relationship to one another although they share enough biological behavior to be linked. Image courtesy of Sue Rhee, Edward Marcotte and Insuk Lee.

“There are still a lot of plant genes with unknown functions,” says Sue Rhee, PhD, in the plant biology department at the Carnegie Institution for Science. “We need more sophisticated ways to characterize what these genes are doing.”


So she and her colleagues, including Edward Marcotte, PhD, at the University of Texas, Austin, and Insuk Lee, PhD, at Yonsei University, South Korea, modified the C. elegans and yeast algorithm for use in systems with less complete data. This produced a rational approach to predicting gene function in Arabidopsis thaliana, a plant widely studied by plant geneticists. Dubbed AraNet, the work was published in the February 2009 issue of Nature Biotechnology. Marcotte and Lee are currently using the same approach to study gene function in humans.


“The idea is that we’re making functional links between genes based on their behavior in a lot of different assays,” Rhee says. Those assays include microarray analyses, protein-protein interactions and inferences from animal orthologs, for a total of 24 different data sets.


The researchers started by analyzing pairs of genes with known function in order to set a baseline score for inferring related function. They then looked at about 27,000 Arabidopsis genes, most of which are uncharacterized, to identify possible gene-gene associations among them. “By then asking ‘what are the functions of the neighboring genes?’ we can try to infer the functions of the uncharacterized genes,” Rhee says. When her team experimentally tested the predictions for three uncharacterized genes, two of the three had functions that were predicted by the network.
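The idea of asking what a gene’s neighbors do can be illustrated with a small Python sketch. This is not AraNet’s actual scoring method, only a simple neighbor-voting scheme under stated assumptions: the gene names, link scores and function labels below are hypothetical, and each candidate function is ranked by the summed scores of the links that connect the query gene to neighbors annotated with that function.

```python
from collections import defaultdict


def predict_functions(links, annotations, gene, top_n=3):
    """Rank candidate functions for `gene` by neighbor voting.

    links: dict mapping (gene_a, gene_b) pairs to a link score (hypothetical values).
    annotations: dict mapping characterized genes to a set of known functions.
    Returns up to `top_n` (function, summed_score) pairs, best first.
    """
    votes = defaultdict(float)
    for (a, b), score in links.items():
        if a == b or gene not in (a, b):
            continue  # skip self-links and links that do not touch the query gene
        neighbor = b if a == gene else a
        # Each annotated neighbor "votes" for its functions, weighted by link score.
        for function in annotations.get(neighbor, ()):
            votes[function] += score
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Toy network: geneX is uncharacterized; all names and scores are made up.
links = {
    ("geneA", "geneX"): 3.0,
    ("geneB", "geneX"): 2.5,
    ("geneC", "geneX"): 0.5,
    ("geneA", "geneB"): 4.0,
}
annotations = {
    "geneA": {"ribosome biogenesis"},
    "geneB": {"ribosome biogenesis"},
    "geneC": {"phosphatase activity"},
}

print(predict_functions(links, annotations, "geneX"))
```

In this toy network, geneX is linked most strongly to two ribosome-related genes, so that function ranks first. A gene whose neighbors are all uncharacterized gets an empty list, which corresponds to the case Rhee describes of unknown genes connected only to other unknown genes.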


Rhee is interested in using inferences from AraNet to narrow down the candidate genes involved in complex traits. Although she’ll be doing this work in plants, Rhee says the approach will be applicable to all organisms. She’s also curious about uncharacterized genes that are connected only to other uncharacterized genes. “Perhaps we can use the network to characterize some undiscovered processes,” she says.


Ideally, Rhee says, researchers will combine AraNet’s predicted functions with their own know-how to design the most informative experiments. It’s like rational drug design, she says: “You’re using all the available information to be as systematic as possible in designing your experiments. This is a good application of systems biology.”


