Help: Quest for Orthologs - Challenge 6
This service is developed by the Quest for Orthologs consortium and maintained by OpenEBench. Proteins and functional modules are evolutionarily conserved even between distantly related species, and allow knowledge transfer between well-characterized model organisms and human. The underlying biological concept is called ‘Orthology’ and the identification of gene relationships is the basis for comparative studies.
The content of phylogenomic databases differs in many ways, such as the number of species, taxonomic range, sampling density, and applied methodology. What is more, phylogenomic databases differ in their concepts, making a comparison difficult – for the benchmarking of analysis results as well as for the user community to select the most appropriate database for a particular experiment.
The Quest for Orthologs (QfO) is a joint effort to benchmark, improve and standardize orthology predictions through collaboration, the use of shared reference datasets, and evaluation of emerging new methods.
Summary of service
The identification of orthologs is an important cornerstone for many comparative, evolutionary and functional genomics analyses. Yet, the true evolutionary history of genes is generally unknown. Because of the wide range of possible applications and taxonomic interests, benchmarking of orthology predictions remains a difficult challenge for methods developers and users.
This community-developed web service aims to simplify and standardize orthology benchmarking. For users, the benchmarks provide a way to identify the most effective methods for the problem at hand.
How does it work?
An orthology method developer should first infer orthologs on the reference proteome dataset. The service assesses the induced pairwise orthologous relations. The method developer must therefore provide the predictions in a format from which the pairwise orthologous predictions can be extracted unambiguously.
Once the predictions have been uploaded, the service checks that they only involve proteins from valid reference proteomes. Benchmarks are then selected and run in parallel. Finally, statistical analyses of method accuracy are performed on each benchmark dataset. The raw data and summary results, in the form of precision-recall curves, are stored and provided to the submitter.
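As a rough illustration of the two steps described above (restricting predictions to valid reference proteins, then scoring them), here is a minimal Python sketch. It is not the service's implementation: the function names, the protein identifiers, and the use of a plain set of reference pairs are all assumptions made for this example, and the actual benchmarks may score predictions differently.

```python
def filter_valid_pairs(pairs, valid_ids):
    """Keep only pairs whose two members are distinct reference proteins.

    Pairs are stored as frozensets so that (a, b) and (b, a) count once.
    """
    return {
        frozenset((a, b))
        for a, b in pairs
        if a in valid_ids and b in valid_ids and a != b
    }


def precision_recall(predicted, reference):
    """Compute precision and recall of predicted pairs against a reference set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical identifiers, for illustration only.
valid_ids = {"A1", "A2", "B1", "B2"}
predictions = [("A1", "B1"), ("A2", "B2"), ("A1", "X9")]  # X9 is not a reference protein
reference = {frozenset(("A1", "B1")), frozenset(("A2", "B1"))}

filtered = filter_valid_pairs(predictions, valid_ids)
precision, recall = precision_recall(filtered, reference)
```

In this toy case the pair involving X9 is discarded, one of the two remaining predictions matches the reference, and one of the two reference pairs is recovered.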
Protein Reference Dataset
Orthology inference is most often based on molecular protein sequences. For a comparison of different orthology prediction methods, a common set of sequences must be established. Therefore, only identical proteins are mapped to each other.
To make comparisons of methods easier, the orthology research community agreed in 2009 to establish a common QfO reference proteome dataset. The initial dataset of 2011 is still available, but its further use is discouraged. Please consider using the newest available dataset.
NEW Aug 2018: We have deployed the QfO reference proteome dataset of 2018. Be among the first to try it out and upload predictions for this new dataset.
Our benchmarks assess orthology on the basis of protein pairs. Therefore, we ask our users to upload their predictions in a format from which we can extract pairwise relations in an unambiguous manner. We support:
- a simple text file with two tab-separated columns, each line giving a pair of orthologous proteins by their IDs
- OrthoXML v0.3, which allows for nested orthologGroups and paralogGroups.
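To show how pairwise relations can be read out of nested groups, here is a minimal Python sketch. It assumes the usual reading of the nesting: genes in different children of an orthologGroup are orthologs of each other, while genes in different children of a paralogGroup are not. The example document, the helper names `local` and `induced_pairs`, and the numeric geneRef ids are illustrative assumptions. Mapping geneRef ids back to protein identifiers is left out, and the service's own extraction may differ in detail.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed document: only the groups section is shown.
EXAMPLE = """
<orthoXML xmlns="http://orthoXML.org/2011/" version="0.3">
  <groups>
    <orthologGroup id="1">
      <geneRef id="1"/>
      <paralogGroup>
        <geneRef id="2"/>
        <geneRef id="3"/>
      </paralogGroup>
    </orthologGroup>
  </groups>
</orthoXML>
"""


def local(tag):
    """Strip the XML namespace from a tag name."""
    return tag.rsplit("}", 1)[-1]


def induced_pairs(group):
    """Return (genes, ortholog_pairs) for an orthologGroup or paralogGroup element."""
    child_sets = []  # one set of gene ids per direct child
    pairs = set()
    for child in group:
        name = local(child.tag)
        if name == "geneRef":
            child_sets.append({child.get("id")})
        elif name in ("orthologGroup", "paralogGroup"):
            sub_genes, sub_pairs = induced_pairs(child)
            child_sets.append(sub_genes)
            pairs |= sub_pairs
    # Only an orthologGroup relates genes across its children as orthologs.
    if local(group.tag) == "orthologGroup":
        for left, right in itertools.combinations(child_sets, 2):
            pairs |= {frozenset((a, b)) for a in left for b in right}
    genes = set().union(*child_sets)
    return genes, pairs


root = ET.fromstring(EXAMPLE)
pairs = set()
for section in root:
    if local(section.tag) == "groups":
        for group in section:
            pairs |= induced_pairs(group)[1]

result = sorted(tuple(sorted(p)) for p in pairs)
```

In this example, gene 1 is paired with genes 2 and 3, while 2 and 3 sit in the same paralogGroup and are therefore not reported as an orthologous pair.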
For both formats, we expect you to submit your predictions in a single file. This file may also be compressed with gzip or bzip2. In that case, it needs to have the proper filename extension (.gz or .bz2).
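A minimal Python sketch of preparing such a file is given below: it writes pairs as two tab-separated columns and selects the compression from the filename extension. The helper names, the file name, and the protein identifiers are placeholders chosen for this example, not names used by the service.

```python
import bz2
import gzip
import os
import tempfile


def open_predictions(path, mode="rt"):
    """Open a plain, gzip-compressed or bzip2-compressed text file by extension."""
    if path.endswith(".gz"):
        return gzip.open(path, mode, encoding="utf-8")
    if path.endswith(".bz2"):
        return bz2.open(path, mode, encoding="utf-8")
    return open(path, mode, encoding="utf-8")


def write_pairs(path, pairs):
    """Write one orthologous pair per line as two tab-separated ids."""
    with open_predictions(path, "wt") as handle:
        for first, second in pairs:
            handle.write(f"{first}\t{second}\n")


def read_pairs(path):
    """Read the pairs back, skipping empty lines."""
    with open_predictions(path, "rt") as handle:
        return [tuple(line.rstrip("\n").split("\t")) for line in handle if line.strip()]


# Placeholder identifiers; real submissions use ids from the reference proteomes.
pairs = [("PROT_A1", "PROT_B1"), ("PROT_A2", "PROT_B2")]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "predictions.tsv.gz")
    write_pairs(path, pairs)
    round_trip = read_pairs(path)
```

Reading the file back before uploading is a cheap way to check that every line has exactly two columns and that the extension matches the compression actually used.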
How to cite the orthology benchmark service
Adrian M Altenhoff, Brigitte Boeckmann, Salvador Capella-Gutierrez, Daniel A Dalquen, Todd DeLuca, Kristoffer Forslund, Jaime Huerta-Cepas, Benjamin Linard, Cécile Pereira, Leszek P Pryszcz, Fabian Schreiber, Alan Sousa da Silva, Damian Szklarczyk, Clément-Marie Train, Peer Bork, Odile Lecompte, Christian von Mering, Ioannis Xenarios, Kimmen Sjölander, Lars Juhl Jensen, Maria J Martin, Matthieu Muffato, Toni Gabaldón, Suzanna E Lewis, Paul D Thomas, Erik Sonnhammer, Christophe Dessimoz. 2016. Standardized benchmarking in the quest for orthologs.