The first topic is an embedded feature selection algorithm, the Expectation-Selection-Maximization (ESM) model, which automatically selects features while optimizing the parameters of a Gaussian mixture model. I introduce a relevancy index (RI) that reveals each feature's contribution to the clustering process and thereby assists feature selection. I demonstrate the efficacy of ESM on two synthetic datasets, four benchmark datasets, and an Alzheimer’s Disease dataset.
The second topic extends the ESM algorithm to handle mixed data types. The Gaussian mixture model is generalized to the Generalized Model of Mixture (GMoM), which handles not only continuous features but also binary and nominal features.
The last topic is uncertainty quantification (UQ) of feature selection. A new algorithm, termed ESOM, is proposed that takes variance information into account while conducting feature selection. In addition, a set of outliers is generated during the feature selection process to infer the uncertainty in the input data. Finally, the selected features and detected outlier instances are evaluated by visual comparison.
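As a point of reference for the ESM description above, the following is a minimal sketch of the standard expectation-maximization loop for a one-dimensional Gaussian mixture. It is not the ESM algorithm itself: the selection step and the relevancy index are not specified here and are deliberately left out. The function names, the toy data, and the initial values are all assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, n_iter=50, min_var=1e-6):
    """Plain EM for a 1-D Gaussian mixture (illustrative; no feature selection)."""
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate weights, means, and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)  # floor keeps the variance positive
    return means, variances, weights


# Toy data (made up for illustration): two well-separated groups.
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
means, variances, weights = em_gmm_1d(
    data, means=[0.0, 6.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
```

On this toy input the two component means settle near the centers of the two groups. An embedded selection method would presumably add a per-feature step between the E-step and M-step, but how ESM does so is not described in this summary.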
The lack of lipidome analytical tools has limited our ability to gain new knowledge about lipid metabolism in microalgae, especially for membrane glycerolipids. An electrospray ionization mass spectrometry-based lipidomics method was developed for Nannochloropsis oceanica IMET1, which resolved 41 membrane glycerolipid molecular species belonging to eight classes. Changes in membrane glycerolipids under nitrogen deprivation and high-light (HL) conditions were uncovered. The results showed that the amounts of plastidial membrane lipids, including monogalactosyldiacylglycerol and phosphatidylglycerol, and of the extraplastidic lipids diacylglyceryl-O-4′-(N,N,N-trimethyl)homoserine and phosphatidylcholine decreased drastically under HL and nitrogen deprivation stresses. Algal cells accumulated considerably more digalactosyldiacylglycerol and sulfoquinovosyldiacylglycerols under these stresses. The genes encoding enzymes responsible for the biosynthesis, modification, and degradation of glycerolipids were identified by mining a time-course global RNA-seq data set. This analysis suggested that the reduction in lipid content under nitrogen deprivation is not attributable to retarded biosynthesis, at least at the gene expression level: most genes involved in glycerolipid biosynthesis were unaffected by nitrogen supply, and several were significantly up-regulated. Additionally, a conceptual eicosapentaenoic acid (EPA) biosynthesis network is proposed based on the lipidomic and transcriptomic data, which highlights the import of EPA from cytosolic glycerolipids to the plastid for synthesizing EPA-containing chloroplast membrane lipids.
Background: Genetic profiling represents the future of neuro-oncology but suffers from inadequate biopsies in heterogeneous tumors like glioblastoma (GBM). Contrast-enhanced MRI (CE-MRI) targets the enhancing core (ENH) but yields adequate tumor in only ~60% of cases. Further, CE-MRI poorly localizes infiltrative tumor within the surrounding non-enhancing parenchyma, or brain-around-tumor (BAT), despite the importance of characterizing this tumor segment, which universally recurs. In this study, we use multiple texture analysis and machine learning (ML) algorithms to analyze multi-parametric MRI and produce new images indicating tumor-rich targets in GBM.
Methods: We recruited primary GBM patients undergoing image-guided biopsies and acquired pre-operative MRI: CE-MRI, Dynamic-Susceptibility-weighted-Contrast-enhanced-MRI, and Diffusion Tensor Imaging. Following image coregistration and region of interest placement at biopsy locations, we compared MRI metrics and regional texture with histologic diagnoses of high- vs low-tumor content (≥80% vs <80% tumor nuclei) for corresponding samples. In a training set, we used three texture analysis algorithms and three ML methods to identify MRI-texture features that optimized model accuracy to distinguish tumor content. We confirmed model accuracy in a separate validation set.
Results: We collected 82 biopsies from 18 GBMs throughout ENH and BAT. The MRI-based model achieved 85% cross-validated accuracy in distinguishing high- from low-tumor content in the training set (60 biopsies, 11 patients). The model achieved 81.8% accuracy in the validation set (22 biopsies, 7 patients).
Conclusion: Multi-parametric MRI and texture analysis can help characterize and visualize GBM’s spatial histologic heterogeneity to identify regional tumor-rich biopsy targets.
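The labeling rule from the Methods (≥80% vs <80% tumor nuclei) and the accuracy figure from the Results can be illustrated with a short sketch. The helper names, the example fractions, and the example predictions are hypothetical and are not study data; only the 80% threshold and the 22-biopsy validation set size come from the text above.

```python
def label_tumor_content(nuclei_fraction, threshold=0.80):
    """Label a biopsy 'high' if its tumor nuclei fraction is >= threshold, else 'low'."""
    return "high" if nuclei_fraction >= threshold else "low"


def accuracy(predicted, actual):
    """Fraction of predictions that match the histologic labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up tumor nuclei fractions, for illustration only (not study data).
fractions = [0.95, 0.80, 0.79, 0.40]
labels = [label_tumor_content(f) for f in fractions]

# Hypothetical model output for the same four samples.
predictions = ["high", "high", "high", "low"]
example_accuracy = accuracy(predictions, labels)

# Arithmetic check: which whole numbers of correct calls out of 22
# round to the reported 81.8% validation accuracy?
matching_counts = [c for c in range(23) if round(100 * c / 22, 1) == 81.8]
```

The last line shows that 18 correct classifications out of 22 is the only count consistent with 81.8%. The cross-validated training figure is not decomposed this way here, since it may be an average over folds rather than a single ratio.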