Complex-Complex Interaction Resource

 

GENERAL INFO AND CREDITS:

This is a standalone FileMaker Pro Runtime solution with information on the human endogenous complexome that was created to be a tangible part of the following publication:

Malovannaya et al., Analysis of the Human Endogenous Coregulator Complexome, Cell - 27 May 2011 (Vol. 145, Issue 5, pp.787-799) (original submission November 2010)

Authors/Affiliations

Anna Malovannaya (1,2,*), Rainer B. Lanz (1,*), Sung Yun Jung (1,2), Yaroslava Bulynko (1), Nguyen T. Le (2), Doug W. Chan (1,2), Chen Ding (2), Yi Shi (2), Nur Yucer (2), Giedre Krenciute (2), Beom-Jun Kim (2), Chunshu Li (2), Rui Chen (3), Wei Li (1), Yi Wang (1,2), Bert W. O’Malley (1,$), Jun Qin (1,2,$)

(1) Department of Molecular and Cellular Biology, (2) Center for Molecular Discovery, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, (3) Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA

*equal first author contribution; $equal last author contribution

Anna Malovannaya, Rainer B. Lanz, and Jun Qin directly contributed to the development of the software solution presented here and of the in-house databases that support the infrastructure of this project.

This software is intended to demonstrate a straightforward method by which a query for a protein produces networks of interacting proteins grouped in minimal endogenous modules (MEMOs).

__________________________________________________________

DISCLAIMER:

1. The CCI networks are provided as is and represent our best attempt (to date) in deconvoluting high-order complex-complex interactions. Many protein associations within these networks are expected to be indirect.

2. Our reciprocity constraint necessarily raises false-negative rates for transient interactions, but it is significantly better at excluding cross-reactivity - an issue particularly prevalent in HT-IP/MS data.

3. The logic of Near Neighbor Network (3N) analysis is directional. We evaluate the most PROMINENT steady-state associations for EACH protein of interest, even at the CCI level. This means that "A-to-B" and "B-to-A" associations are not equivalent. They should not be mistaken for evidence of direct binding between A and B, which is always non-directional. Put simply, just because protein A always associates with B does not mean that B is always found with A - in fact, the fraction of B that interacts with A could be very small, in which case A is likely to fall out of the CCI network of B. This accounts for the 'directionality' of CCIs in our resource. Only protein subunits within MEMOs - and not higher-order associations - are likely to always co-bind.
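
The asymmetry described in point 3 can be illustrated with a small numerical sketch. Everything in it is hypothetical: the counts, the function name fraction_associated, and the 0.5 cutoff are invented for illustration and are not the scoring or filtering used to build the CCI networks.

```python
# Hypothetical illustration only: the counts and the 0.5 cutoff are invented,
# and this is NOT the scoring used to build the CCI networks.

def fraction_associated(co_observed: int, total_observed: int) -> float:
    """Fraction of one protein's observations in which the partner is also seen."""
    if total_observed <= 0:
        raise ValueError("total_observed must be positive")
    return co_observed / total_observed

# Made-up counts: A is rare, B is abundant, and they are seen together 95 times.
total_a = 100
total_b = 1000
co_ab = 95

a_to_b = fraction_associated(co_ab, total_a)  # share of A that is found with B
b_to_a = fraction_associated(co_ab, total_b)  # share of B that is found with A

CUTOFF = 0.5  # arbitrary threshold chosen for this sketch
b_in_network_of_a = a_to_b >= CUTOFF  # True: B is prominent from A's point of view
a_in_network_of_b = b_to_a >= CUTOFF  # False: A is a minor partner of B

print(f"A-to-B: {a_to_b:.3f} (kept: {b_in_network_of_a})")
print(f"B-to-A: {b_to_a:.3f} (kept: {a_in_network_of_b})")
```

With these invented numbers, B would appear in a network queried with A, while A would not appear in a network queried with B, even though the same 95 co-observations underlie both directions.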

__________________________________________________________

INSTALLATION/OPENING:

This software is packaged as two platform-specific (Macintosh and Windows) FileMaker Pro (FMP) 11 runtime solutions and is distributed free to scientists in academia and industry. There is no install/uninstall process per se - the files can simply be opened/run. The part of this software that pertains to the underlying framework of FMP Runtime applications is copyrighted by FileMaker Inc. and associates (for details, see the FMP_Acknowledgments.pdf files included with the Demo materials). Below are the directions and recommendations for opening these files.

__________

For Macintosh/Apple users:

1. Open 'CCIResource_RT4Mac_v1_1' Folder. It contains 3 files (CCIResource_4Mac; doNOTuse_SupportingFile.USR; FMP_Acknowledgments.pdf) and an Extensions Folder with supporting libraries - do NOT delete any files.

2. Choose and open the CCIResource_4Mac file. By default, there is no log-in window; however, should a log-in prompt appear, you can log in using 'guest' and 'password' as name and password, respectively.

__________

For PC/Windows users:

1. Open 'CCIResource_RT4Win_v1_1' Folder. It contains multiple files and folders with supporting data - do NOT delete any files. To avoid navigating through the supporting files in the 4Win folder each time, we recommend creating a shortcut to 'CCIResource_4Win.exe' and placing it in a convenient location.

2. Find and open the 'CCIResource_4Win.exe' file (the first launch is slower than subsequent runs).

3. When asked to enter your user name, any name can be used. This is an inherent prompt of the FMP Runtime for Windows; it will not appear again in subsequent runs.

4. By default, there is no log-in window; however, should a log-in prompt appear, you can log in using 'guest' and 'password' as name and password, respectively.

__________________________________________________________

USER INSTRUCTIONS:

Most features of the resource have in-solution mouse-over annotations, which appear when the pointer is held over the bottom right corner of column headers and data fields. Many data fields are hyperlinked to related information. Mouse-over annotations can be turned ON or OFF; this option is provided on all solution screens (top left corner).

____________

1. The first layout lists all human protein-coding genes (in accordance with the NCBI gene_info dataset). Its purpose is to let users find their protein of interest by descriptive keyword, Gene Symbol, or Gene ID. Simply type a gene name or keyword in the search box and click on the ‘Find’ button; the search results will be presented as a list of proteins matching the query. If there are no matches, you will have the option to be redirected to the NCBI website to find the proper name for your protein of interest. If you do not see your protein of interest in the query results, we advise that you find the official NCBI nomenclature and retry your search.

2. Proteins are grouped in MEMOs that are coded as follows:
- AM = approved module
- PM = provisional module
- TM = temporary module (not enough data; equal to individual proteins)
This annotation is followed by a 6-digit serial number for a given MEMO. Proteins that belong to the same MEMO have identical MEMO identifiers. Clicking on a MEMO number will show all MEMO subunits in the same layout.

3. The right-hand-side column indicates whether an interaction network is available for viewing. CCI (complex-complex interaction) networks are coded as follows:
- 'view' links to an available CCI network
- 'MEMO only' means that no proteins other than MEMO subunits pass cE-Filters; in the case of 'singleton' MEMOs, this means that no interactions were found for that protein in our analysis
- 'non-reciprocal' are proteins that have only been found with one antibody (their networks could not be determined with certainty)
- 'suppressed (precipitation)' means that this protein is frequently found as a non-specific precipitant (these networks are not shown)
- 'suppressed (misidentification)' means that this protein is often wrongly assigned by MS (these networks cannot be determined)
- 'not in HT-IP/MS' proteins are not covered in the current dataset
Clicking on 'View' in the right-hand-side column will produce the interaction network for the protein of interest in a second 'CCI layout'.

4. The CCI display screen shows the most certain/predominant interactions for a given protein of interest. The display cells show MS identification counts (average per antibody repeats) for the corresponding gene products (listed in the left-side column). MEMOs are separated from each other by grey underlines. By default, CCI networks are sorted by CCI Rank. Since these 'interactions' are derived from affinity purification data, many protein associations are expected to be indirect in the context of protein complexes.

5. Users also have the option to sort results by the individual similarity scores (MM, JM, JE, JT, and RT). To this end, choosing "Sort by: protein" orders protein-protein interactions based on protein scores, regardless of MEMO assignments. Choosing "Sort by: MEMO" orders CCIs (complex-complex interactions) by average metrics for all MEMO subunits, and then by the corresponding protein scores. After choosing "protein" or "MEMO" in "Sort by", click on the metrics column headers (MM, JM, JE, JT, or RT) to sort. CCI Rank is essentially a combination score of MM and Jaccard indices, which respects MEMO assignments by definition; CCI Ranks cannot be sorted by protein.

6. The right-hand side of the display shows a summary of the significant (qValue <= 0.25) copy number amplifications and deletions as reported by the Broad Institute Tumorscape study (Reference: "The landscape of somatic copy-number alteration across human cancers." Beroukhim, R. et al. Nature 2010 vol. 463 (7283) pp. 899-905; PMID:20164920). This information, when overlaid on the CCI networks derived by our study, was used as a basis for Figure 6.

7. To search for another protein, click the ‘Return’ button, which will bring you back to the 'Find' screen.
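
For readers who export or re-implement the tables, the difference between the two sort modes in step 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the logic of the FileMaker solution itself: the rows, protein names, scores, identifier formatting (prefix immediately followed by six digits), function names, and the descending sort direction are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: protein names, MEMO identifiers, and scores are invented.
# The "AM" + 6-digit formatting of the identifier is an assumption.
rows = [
    {"protein": "P1", "memo": "AM000001", "score": 0.90},
    {"protein": "P2", "memo": "AM000002", "score": 0.80},
    {"protein": "P3", "memo": "AM000001", "score": 0.40},
    {"protein": "P4", "memo": "AM000002", "score": 0.70},
]

def sort_by_protein(rows):
    """Order rows by their own score, ignoring MEMO membership."""
    # Descending order is an assumption (higher score listed first).
    return sorted(rows, key=lambda r: r["score"], reverse=True)

def sort_by_memo(rows):
    """Order MEMOs by the average score of their subunits, then subunits by score."""
    scores_per_memo = defaultdict(list)
    for r in rows:
        scores_per_memo[r["memo"]].append(r["score"])
    memo_avg = {memo: mean(scores) for memo, scores in scores_per_memo.items()}
    # The MEMO identifier in the key keeps subunits together if two averages tie.
    return sorted(rows, key=lambda r: (-memo_avg[r["memo"]], r["memo"], -r["score"]))

print([r["protein"] for r in sort_by_protein(rows)])  # ['P1', 'P2', 'P4', 'P3']
print([r["protein"] for r in sort_by_memo(rows)])     # ['P2', 'P4', 'P1', 'P3']
```

In this made-up example, P1 has the highest individual score and leads the protein-level sort, but its MEMO has the lower average (0.65 versus 0.75), so the MEMO-level sort lists the other MEMO first and keeps each MEMO's subunits adjacent.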

__________________________________________________________