This repository has been archived by the owner on Nov 8, 2021. It is now read-only.

From SATAY data to the prediction of the interaction map #18

Open
leilaicruz opened this issue Jul 23, 2020 · 6 comments

Comments

@leilaicruz
Member

This issue is for collecting ideas on how to quantify, from the SATAY data output, the changes in the interaction map of budding yeast after a single mutation is introduced.

@leilaicruz leilaicruz created this issue from a note in SATAY-analysis-workflow-board (In progress) Jul 23, 2020
@leilaicruz leilaicruz self-assigned this Jul 23, 2020
@leilaicruz leilaicruz added the enhancement (New feature or request) and help wanted (Extra attention is needed) labels Jul 23, 2020
@leilaicruz
Member Author

Idea 1: The figure below 👇 describes a possible feature matrix for the wild-type (WT) strain, used to train and test a regression model that predicts, in this case, the number of synthetic lethals (SL) per gene as a proxy for the interaction map.

The goal is to build a model that reaches more than 75% accuracy, so that it can acceptably predict the number of SL for the genes of the analysed mutants. Every mutant (genetic background) has its own feature matrix, which is fed into the model to make the prediction.

In this model I take the number of SL per gene as the output variable representing the interaction map. Other possible outputs are:

  • total number of interactions
  • ???
  • ???

[Figure: from-satay-to-change-of-the-interaction-map]
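
A minimal sketch of how Idea 1 could be prototyped, using only the Python standard library. Everything in it is an assumption made for illustration: the two features (insertions per kb and reads per insertion), the toy numbers, and the choice of an ordinary least-squares fit. "Accuracy" is not defined for a regression, so the sketch reads the 75% target as R² > 0.75 on held-out genes, which is only one possible interpretation.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(features: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Least-squares fit via the normal equations; returns [intercept, coef_1, ...]."""
    rows = [[1.0] + [float(v) for v in f] for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs: Sequence[float], features: Sequence[Sequence[float]]) -> List[float]:
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], f)) for f in features]


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy WT feature matrix: one row per gene, columns are hypothetical features
# [insertions per kb, reads per insertion]. The values are made up.
wt_train_x = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0], [6.0, 2.0]]
# Toy SL counts per gene, constructed as 10 - 1.0*x1 + 0.5*x2 (noise-free on purpose).
wt_train_y = [10.0, 8.5, 9.0, 7.5, 7.5, 5.0]

# Held-out WT genes used to check the fit.
wt_test_x = [[2.0, 3.0], [5.0, 1.0], [4.0, 4.0]]
wt_test_y = [9.5, 5.5, 8.0]

coefs = fit_ols(wt_train_x, wt_train_y)
r2 = r_squared(wt_test_y, predict(coefs, wt_test_x))
# R^2 is close to 1.0 here only because the toy data are noise-free.
print("held-out R^2:", round(r2, 3), "meets 0.75 target:", r2 > 0.75)
```

With real data, the same `predict` call would take the feature matrix of a mutant background, and a library such as scikit-learn would be a more practical choice than the hand-written solver.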

@EKingma
Collaborator

EKingma commented Jul 23, 2020

Nice idea! I was wondering, though: what data would you use to build your model in this case?

@leilaicruz
Member Author

In principle, I would generate, together with the student, 20-30 single mutants and run SATAY on them to obtain this type of data 🙏🤞

@EKingma
Collaborator

EKingma commented Jul 24, 2020

Still, I don't really understand what you mean by "this type of data". What data would you use to train/test the model?

@leilaicruz
Member Author

The idea is to validate how well the model built from the WT data can be extended to predict the number of SL per gene (or another proxy for the interaction map) in a different genetic background. More importantly, we want to know how well we can predict the number of SL per gene in WT using SATAY data.

@leilaicruz
Member Author

leilaicruz commented Jul 24, 2020

One important thing to take into account is that the feature matrix should be built so that every feature carries information that, in this case, differs between genetic backgrounds.

Here, the features related to the functional properties of each gene are only available for the WT background. For the mutants we do not actually know (there is no database with that information) how the functions of the remaining genes change, so those columns would stay constant across the mutants and hence would not contribute any new insight from the data.

To guide the thinking, I will reflect on:

  1. What else can we extract from the SATAY experiment that changes with the background, like the insertions and the reads?
  2. How can we model the functional analysis of each gene in a different genetic background, so that it can be integrated into the features as information that changes per genetic background?
  3. What else can we use as an output for the model that is known and relevant to the interaction map?
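
The concern about columns that stay constant across backgrounds can be checked mechanically. The sketch below is only an illustration under assumptions: the nested-dictionary layout, the background and gene names, the feature names (`insertions`, `reads`, and `n_go_terms` as a stand-in for a functional-annotation feature) and all values are hypothetical. It also assumes every gene and feature is present in every background.

```python
from typing import Dict, List

# background -> gene -> feature name -> value (all values are made up)
FeatureTable = Dict[str, Dict[str, Dict[str, float]]]

features_by_background: FeatureTable = {
    "WT": {
        "geneA": {"insertions": 120, "reads": 3400, "n_go_terms": 5},
        "geneB": {"insertions": 15, "reads": 200, "n_go_terms": 12},
    },
    "mutant_1": {
        "geneA": {"insertions": 95, "reads": 2100, "n_go_terms": 5},
        "geneB": {"insertions": 40, "reads": 900, "n_go_terms": 12},
    },
}


def background_invariant_features(table: FeatureTable) -> List[str]:
    """Return the features whose value is identical in every background for every gene."""
    backgrounds = list(table)
    reference = table[backgrounds[0]]
    feature_names = sorted(next(iter(reference.values())))
    invariant = []
    for name in feature_names:
        unchanged = all(
            table[bg][gene][name] == reference[gene][name]
            for bg in backgrounds[1:]
            for gene in reference
        )
        if unchanged:
            invariant.append(name)
    return invariant


all_features = ["insertions", "reads", "n_go_terms"]
constant = background_invariant_features(features_by_background)
informative = [f for f in all_features if f not in constant]
print("constant across backgrounds:", constant)  # ['n_go_terms']
print("vary with background:", informative)      # ['insertions', 'reads']
```

Features flagged as constant are the ones that cannot help a model distinguish one genetic background from another, which is the situation described above for the functional-property columns.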

@leilaicruz leilaicruz removed the help wanted (Extra attention is needed) label Jul 27, 2020