Current protein-protein interaction (PPI) databases suffer from human-made biases and a lack of context. This is problematic for all downstream applications, such as drug discovery, the study of cell differentiation and communication, and disease mechanism mining. CoBiNet aims to tackle these issues in three work packages (WPs) to provide a bias-aware and context-specific PPI database.
In this video, Konstantin Pelz from TUM gives a brief overview of the CoBiNet project:
Work-package specific information
WP1: Non-interactions
Repeatedly testing protein interactions with the same bait protein increases the risk of false positives. Most bait-prey studies test baits against unknown prey populations, meaning that their results only suggest potential interactions. Aggregating data from overlapping experiments introduces a systematic bias: a large number of reported interactions for a protein may reflect repeated testing (and the false positives that come with it) rather than true biological interactions. Our goal is to create a high-confidence negatome, i.e. a curated dataset of protein pairs that have been shown not to interact. This project aims to construct such a resource by using experimental data to infer reliably non-interacting pairs. The negatome can then be used to estimate the false positive ratio of experiments and to train and score machine learning models without this systematic bias.
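To make the last point concrete, the following is a minimal sketch of how a negatome could be used to estimate the false positive ratio of a single experiment. It assumes that the set of pairs the experiment actually tested is known, which is often not the case for bait-prey screens. The function name `estimate_false_positive_ratio` and the data layout (pairs as tuples of protein identifiers) are hypothetical and are not taken from the CoBiNet code base.

```python
def estimate_false_positive_ratio(reported, tested, negatome):
    """Fraction of tested negatome pairs that the experiment reported as interacting.

    All three arguments are iterables of (protein_a, protein_b) tuples.
    Pairs are treated as unordered, so ("A", "B") equals ("B", "A").
    Returns None if the experiment tested no negatome pair.
    """
    reported_pairs = {frozenset(pair) for pair in reported}
    tested_pairs = {frozenset(pair) for pair in tested}
    negatome_pairs = {frozenset(pair) for pair in negatome}

    # Known non-interacting pairs that this experiment actually tested.
    tested_negatives = tested_pairs & negatome_pairs
    if not tested_negatives:
        return None

    # Known non-interacting pairs that were nevertheless reported as interacting.
    false_positives = tested_negatives & reported_pairs
    return len(false_positives) / len(tested_negatives)


# Toy data with made-up protein names (assumption, for illustration only).
negatome = [("A", "B"), ("C", "D"), ("E", "F")]
tested = [("B", "A"), ("C", "D"), ("A", "C"), ("E", "G")]
reported = [("A", "B"), ("A", "C")]
ratio = estimate_false_positive_ratio(reported, tested, negatome)
```

In this toy example, two negatome pairs were tested and one of them was reported as an interaction, so the estimate depends only on pairs for which a reliable negative label exists.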
WP2: Tissue-specific interactions
Current PPI databases are unaware of the underlying context given by, for example, the tissue or the cell line. This leads to many false positive interactions that are not present in the actual context, which is especially problematic in personalized medicine. To address this, we are developing novel deep learning methods that use graph representation learning to learn context-specific protein embeddings. We aim to use these embeddings in a second, downstream deep learning model to predict context-specific PPIs.
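The two-stage idea described above can be illustrated with a deliberately simple sketch. It assumes that a first stage has already produced one embedding per protein and context, stored in a dictionary keyed by `(protein, context)`, and it replaces the downstream deep learning model with a plain logistic function of the dot product. Both the data layout and the scoring function `score_interaction` are hypothetical stand-ins, not the CoBiNet architecture.

```python
import math


def score_interaction(embeddings, protein_a, protein_b, context):
    """Score a protein pair in one context from context-specific embeddings.

    embeddings maps (protein, context) to a list of floats.
    The score is sigmoid(dot(u, v)), a stand-in for a trained downstream model.
    """
    u = embeddings[(protein_a, context)]
    v = embeddings[(protein_b, context)]
    dot = sum(x * y for x, y in zip(u, v))
    return 1.0 / (1.0 + math.exp(-dot))


# Toy embeddings with made-up values (assumption, for illustration only).
toy_embeddings = {
    ("P1", "liver"): [1.0, 2.0],
    ("P2", "liver"): [1.0, 1.0],
    ("P1", "brain"): [1.0, -1.0],
    ("P2", "brain"): [-1.0, 1.0],
}
liver_score = score_interaction(toy_embeddings, "P1", "P2", "liver")
brain_score = score_interaction(toy_embeddings, "P1", "P2", "brain")
```

The point of the sketch is only that the same protein pair can receive different scores in different contexts because the embeddings, and not the scoring function, carry the context.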
WP3: Isoform-specific interactions
Current PPI databases are blind to alternative splicing (AS). To take AS into account, we need to look at PPIs at a finer level and study domain-domain interactions (DDIs). To this end, we created DIGGER [DIGGER webpage, DIGGER paper], a resource that extends PPIs with DDIs to explain the effects of AS on the respective interactions. Our current focus is to obtain more high-confidence DDIs in order to increase the power of DIGGER. For that, we are exploring different directions, ranging from simple statistical inference to structural predictions such as those from AlphaFold 3, to gain insight into the interaction patterns of domains and proteins.
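As a rough illustration of how DDIs can explain the effect of AS on a PPI, the sketch below checks which known DDIs still support an interaction between two isoforms, given the domains each isoform retains. It is a simplified toy and not DIGGER's actual implementation. The function name `supporting_ddis` and the domain labels `D1` to `D3` are hypothetical.

```python
def supporting_ddis(domains_a, domains_b, ddis):
    """Return the known domain-domain interactions linking two isoforms.

    domains_a and domains_b are the sets of domains present in each isoform.
    ddis is an iterable of (domain_x, domain_y) tuples, treated as unordered.
    An empty result means no known DDI supports the interaction.
    """
    known = {frozenset(pair) for pair in ddis}
    return {
        (da, db)
        for da in domains_a
        for db in domains_b
        if frozenset((da, db)) in known
    }


# Toy data with made-up domain labels (assumption, for illustration only).
known_ddis = [("D1", "D2")]
canonical_isoform = {"D1", "D3"}
spliced_isoform = {"D3"}  # alternative splicing removed domain D1
partner = {"D2"}

canonical_support = supporting_ddis(canonical_isoform, partner, known_ddis)
spliced_support = supporting_ddis(spliced_isoform, partner, known_ddis)
```

Here the canonical isoform keeps a DDI with its partner, while the spliced isoform loses the interacting domain and with it the only known DDI, which suggests that the PPI may be lost for that isoform.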