Biomedical research relies on knowledge of the binding interactions between proteins of interest to create novel molecular targets for therapeutic purposes. Many of these interactions remain unknown, yet understanding them could have significant medical applications in cell signaling and immunological defense. Furthermore, there is evidence that machine learning and peptide microarrays can be used to make reliable predictions of where proteins could interact with each other without definitive knowledge of the interactions. In this case, a neural network was used to predict the unknown binding interactions of TNFR2 onto LT-α and TRAF2, and of PD-L1 onto CD80, based on binding data from a sampling of protein-peptide interactions on a microarray. Establishing the accuracy and reliability of these predictions will require future research to confirm the interactions of these proteins, but the knowledge gained from these methods and predictions could inform rational and structure-based drug design.
Firstly, a biodosimetry method was developed using random forests (RF) to determine the absorbed radiation dose from gene expression measured in blood samples of potentially exposed individuals. To improve its prediction accuracy, day-specific models were built to handle the day interaction effect, and a nested modeling technique was proposed. The nested models can fit this complex data, which has large variability and non-linear relationships.
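The day-specific modeling idea can be illustrated with a minimal sketch, assuming scikit-learn's `RandomForestRegressor` and synthetic stand-in data. The function names `fit_day_specific_models` and `predict_dose`, the variable names, and the simulated relationship between expression and dose are all hypothetical. The nested modeling technique is not shown, since its details are not given here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def fit_day_specific_models(X, dose, day, n_estimators=100, seed=0):
    """Fit one random forest regressor per sampling day (hypothetical helper)."""
    models = {}
    for d in np.unique(day):
        mask = day == d
        rf = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
        rf.fit(X[mask], dose[mask])
        models[int(d)] = rf
    return models


def predict_dose(models, X, day):
    """Route each sample to the model trained on its own sampling day."""
    pred = np.empty(len(X))
    for d in np.unique(day):
        mask = day == d
        pred[mask] = models[int(d)].predict(X[mask])
    return pred


# Synthetic stand-in for gene expression data (not real measurements).
rng = np.random.default_rng(0)
n_samples, n_genes = 120, 5
X = rng.normal(size=(n_samples, n_genes))
day = rng.integers(1, 4, size=n_samples)  # days 1, 2, 3 after exposure
# Assumed toy relationship: the expression-to-dose mapping changes with day.
dose = np.abs(X[:, 0]) * day + rng.normal(scale=0.1, size=n_samples)

models = fit_day_specific_models(X, dose, day)
pred = predict_dose(models, X, day)
```

Fitting a separate model per day lets each forest learn its own expression-to-dose mapping, which is one simple way to handle an interaction between sampling day and gene expression.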
Secondly, a panel of biomarkers was selected using a data-driven feature selection method as well as hand-picking, taking prior knowledge and other constraints into account. To incorporate domain knowledge, a method called Know-GRRF was developed based on guided regularized RF. It incorporates domain knowledge as a penalty term that regulates the selection of candidate features in RF, which adds flexibility to data-driven feature selection and can improve the interpretability of models. Know-GRRF showed significant improvement in cross-species prediction when cross-species correlation was used to guide the selection of biomarkers. On simulated datasets, the method is also competitive with existing methods when intrinsic data characteristics are used as an alternative to domain knowledge.
Lastly, a novel non-parametric method, RFerr, was developed to generate prediction intervals using RF regression. This method is applicable to any predictive model and was shown to have better coverage and precision than existing methods on the real-world radiation dataset, as well as on benchmark and simulated datasets.
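To make the notion of a non-parametric prediction interval from RF regression concrete, here is a minimal sketch of one generic approach: shifting the point prediction by empirical quantiles of the out-of-bag residuals. This is not necessarily the RFerr procedure, whose details are not given here. The function name `oob_prediction_interval`, the `alpha` parameter, and the synthetic data are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def oob_prediction_interval(rf, y_train, X_new, alpha=0.1):
    """Return (lower, point, upper) using out-of-bag residual quantiles.

    Generic illustration only; assumes rf was fitted with oob_score=True.
    """
    residuals = y_train - rf.oob_prediction_
    lo, hi = np.quantile(residuals, [alpha / 2, 1 - alpha / 2])
    point = rf.predict(X_new)
    return point + lo, point, point + hi


# Synthetic regression data (not the radiation dataset).
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(300, 3))
y = X[:, 0] ** 2 + rng.normal(scale=0.3, size=300)

rf = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)

X_new = rng.uniform(-2, 2, size=(50, 3))
lower, point, upper = oob_prediction_interval(rf, y, X_new)
```

Coverage (the fraction of true values falling inside the interval) and precision (the interval width) are the two properties on which interval methods of this kind are typically compared.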