|
Prediction of Interologs
We downloaded all experimentally identified protein-protein interactions from BIOGRID (http://www.thebiogrid.org/), INTACT (http://www.ebi.ac.uk/intact/site/index.jsf), MINT (http://mint.bio.uniroma2.it/mint/Welcome.do), and HPRD (http://www.hprd.org/). Most of the interactions reported in these databases come from Homo sapiens and experimental model organisms such as Rattus norvegicus, Drosophila melanogaster, Saccharomyces cerevisiae, Caenorhabditis elegans, Arabidopsis thaliana, and Escherichia coli K12 (hereafter called reference organisms). We then prepared a set of non-redundant interactions for each reference organism by merging the different PubMed IDs reported for the same interaction into a single entry.
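The merging step can be illustrated with a minimal Python sketch. The record layout (two protein identifiers plus a PubMed ID) and the function name `merge_interactions` are assumptions made for illustration; the database export formats are not described here.

```python
from collections import defaultdict

def merge_interactions(records):
    """Collapse duplicate records into one entry per unordered protein pair.

    records: iterable of (protein_a, protein_b, pubmed_id) tuples
             (hypothetical layout, assumed for this sketch).
    Returns a dict mapping a sorted (protein, protein) tuple to the
    sorted list of PubMed IDs that report the interaction.
    """
    merged = defaultdict(set)
    for a, b, pmid in records:
        pair = tuple(sorted((a, b)))  # treat A-B and B-A as the same interaction
        merged[pair].add(pmid)
    return {pair: sorted(pmids) for pair, pmids in merged.items()}

# Toy records with made-up identifiers
records = [
    ("P1", "P2", "111"),
    ("P2", "P1", "222"),  # same pair, reversed order, different publication
    ("P1", "P3", "111"),
    ("P1", "P2", "111"),  # exact duplicate
]
non_redundant = merge_interactions(records)
```

Sorting the pair before using it as a key is one simple way to make the result independent of the order in which a database lists the two partners.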
We transferred the interactions from each reference organism to Mus musculus on the basis of orthology relationships, defined as the best hits of bi-directional BLASTP searches against all proteins with an E-value cut-off of 1e-4. When the same mouse interolog was predicted from more than one reference organism, the interolog from the evolutionarily most closely related species was selected.
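The orthology assignment and the transfer of interactions can be sketched as follows. This is an illustration under stated assumptions, not the pipeline actually used: BLASTP output is assumed to be already parsed into (query, subject, E-value) tuples, hits at or below the cut-off are assumed to pass, the cut-off is read as 1e-4, and all function names are hypothetical.

```python
def best_hits(hits, max_evalue=1e-4):
    """Return {query: best subject}, keeping only hits at or below max_evalue."""
    best = {}
    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue  # hit does not pass the cut-off
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(ref_vs_mouse, mouse_vs_ref, max_evalue=1e-4):
    """Return {reference protein: mouse ortholog} for bi-directional best hits."""
    forward = best_hits(ref_vs_mouse, max_evalue)
    reverse = best_hits(mouse_vs_ref, max_evalue)
    return {ref: mouse for ref, mouse in forward.items()
            if reverse.get(mouse) == ref}

def transfer_interologs(interactions, orthologs):
    """Map each reference interaction to mouse when both partners have orthologs."""
    interologs = set()
    for a, b in interactions:
        if a in orthologs and b in orthologs:
            interologs.add(tuple(sorted((orthologs[a], orthologs[b]))))
    return interologs

# Toy data with made-up identifiers (y* = reference proteins, m* = mouse proteins)
ref_vs_mouse = [("yA", "mA", 1e-50), ("yA", "mX", 1e-10),
                ("yB", "mB", 1e-30), ("yC", "mC", 1e-2)]
mouse_vs_ref = [("mA", "yA", 1e-48), ("mB", "yB", 1e-29),
                ("mC", "yC", 1e-2), ("mX", "yA", 1e-9)]
orthologs = reciprocal_best_hits(ref_vs_mouse, mouse_vs_ref)
interologs = transfer_interologs([("yA", "yB"), ("yA", "yC")], orthologs)
```

In this toy example the yA-yC interaction is not transferred because the yC/mC hit fails the E-value cut-off, so yC has no mouse ortholog.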
Co-occurrence of mouse orthologs in predicted bacterial operons
We predicted operons in the genomes of 186 different bacterial species. When the bacterial orthologs of two mouse proteins were found together in more than one predicted operon, we considered the two mouse proteins to interact.
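The co-occurrence rule can be sketched in Python as below. The operon representation (a list of gene identifiers per operon), the gene-to-mouse ortholog mapping, and the function name `cooccurring_pairs` are assumptions for illustration; the operon prediction itself is not shown.

```python
from collections import Counter
from itertools import combinations

def cooccurring_pairs(operons, gene_to_mouse, min_operons=2):
    """Return mouse protein pairs whose orthologs share at least min_operons operons.

    operons: iterable of operons, each a list of bacterial gene IDs.
    gene_to_mouse: dict mapping a bacterial gene ID to its mouse ortholog.
    min_operons=2 encodes "more than one predicted operon".
    """
    counts = Counter()
    for operon in operons:
        # Mouse proteins with at least one ortholog in this operon
        mouse = {gene_to_mouse[g] for g in operon if g in gene_to_mouse}
        for pair in combinations(sorted(mouse), 2):
            counts[pair] += 1
    return {pair for pair, n in counts.items() if n >= min_operons}

# Toy data with made-up identifiers: three operons from different genomes
operons = [["b1", "b2", "b3"], ["c1", "c2"], ["d1", "d3"]]
gene_to_mouse = {"b1": "M1", "b2": "M2", "b3": "M3",
                 "c1": "M1", "c2": "M2",
                 "d1": "M1", "d3": "M3"}
predicted = cooccurring_pairs(operons, gene_to_mouse)
```

Each operon contributes at most one count per mouse protein pair, so paralogs within a single operon do not inflate the tally.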
Phylogenetic profiles
The interactions predicted by both approaches may include false positives, either because an interaction is not evolutionarily conserved or because the original interaction was itself a false positive. We therefore filtered likely false positives from the predicted interactions using phylogenetic profiles.
We first built a model by training a Support Vector Machine (SVM) on the phylogenetic profiles of positive and negative protein-protein interaction datasets. The model was then used to classify the interactions predicted by the two methods above as true or false positives.
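As an illustration of the kind of input such a classifier needs, the sketch below builds a binary phylogenetic profile for each protein and combines two profiles into one feature vector per protein pair. The pair encoding (0 = homolog in neither protein, 1 = in one, 2 = in both, per genome), the `presence` mapping, and the function names are assumptions for this sketch; the encoding actually used for the SVM is not specified here.

```python
def phylogenetic_profile(protein, genomes, presence):
    """Return a binary vector: 1 if the protein has a homolog in a genome, else 0.

    genomes: ordered list of genome names (fixes the vector layout).
    presence: dict mapping a protein to the set of genomes containing a homolog.
    """
    found = presence.get(protein, set())
    return [1 if genome in found else 0 for genome in genomes]

def pair_features(profile_a, profile_b):
    """Combine two profiles into one symmetric feature vector (assumed encoding)."""
    return [a + b for a, b in zip(profile_a, profile_b)]

# Toy data with made-up genome and protein names
genomes = ["g1", "g2", "g3", "g4"]
presence = {"M1": {"g1", "g2"}, "M2": {"g1", "g2", "g4"}, "M3": {"g3"}}

features_12 = pair_features(phylogenetic_profile("M1", genomes, presence),
                            phylogenetic_profile("M2", genomes, presence))
features_13 = pair_features(phylogenetic_profile("M1", genomes, presence),
                            phylogenetic_profile("M3", genomes, presence))
```

Feature vectors of this kind, labelled with the positive and negative training pairs, could then be passed to any SVM implementation. Proteins with similar profiles, such as M1 and M2 here, yield vectors dominated by 0s and 2s, whereas dissimilar profiles yield mostly 1s.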
|