Search Content

The role of mutations in protein structural dynamics and function: a multi-scale computational approach

Description

Proteins are a fundamental unit in biology. Although proteins have been extensively studied, there is still much to investigate. The mechanism by which proteins fold into their native state, how evolution shapes structural dynamics, and the dynamic mechanisms of many diseases are not well understood. In this thesis, protein folding…

Proteins are a fundamental unit in biology. Although proteins have been extensively studied, there is still much to investigate. The mechanism by which proteins fold into their native state, how evolution shapes structural dynamics, and the dynamic mechanisms of many diseases are not well understood. In this thesis, protein folding is explored using a multi-scale modeling method including (i) geometric constraint based simulations that efficiently search for native like topologies and (ii) reservoir replica exchange molecular dynamics, which identify the low free energy structures and refines these structures toward the native conformation. A test set of eight proteins and three ancestral steroid receptor proteins are folded to 2.7Å all-atom RMSD from their experimental crystal structures. Protein evolution and disease associated mutations (DAMs) are most commonly studied by in silico multiple sequence alignment methods. Here, however, the structural dynamics are incorporated to give insight into the evolution of three ancestral proteins and the mechanism of several diseases in human ferritin protein. The differences in conformational dynamics of these evolutionary related, functionally diverged ancestral steroid receptor proteins are investigated by obtaining the most collective motion through essential dynamics. Strikingly, this analysis shows that evolutionary diverged proteins of the same family do not share the same dynamic subspace. Rather, those sharing the same function are simultaneously clustered together and distant from those functionally diverged homologs. This dynamics analysis also identifies 77% of mutations (functional and permissive) necessary to evolve new function. In silico methods for prediction of DAMs rely on differences in evolution rate due to purifying selection and therefore the accuracy of DAM prediction decreases at fast and slow evolvable sites. Here, we investigate structural dynamics through computing the contribution of each residue to the biologically relevant fluctuations and from this define a metric: the dynamic stability index (DSI). Using DSI we study the mechanism for three diseases observed in the human ferritin protein. The T30I and R40G DAMs show a loss of dynamic stability at the C-terminus helix and nearby regulatory loop, agreeing with experimental results implicating the same regulatory loop as a cause in cataracts syndrome.

ContributorsGlembo, Tyler J (Author) / Ozkan, Sefika B (Thesis advisor) / Thorpe, Michael F (Committee member) / Ros, Robert (Committee member) / Kumar, Sudhir (Committee member) / Shumway, John (Committee member) / Arizona State University (Publisher)

Created2011

Multi-task learning via structured regularization: formulations, algorithms, and applications

Description

Multi-task learning (MTL) aims to improve the generalization performance (of the resulting classifiers) by learning multiple related tasks simultaneously. Specifically, MTL exploits the intrinsic task relatedness, based on which the informative domain knowledge from each task can be shared across multiple tasks and thus facilitate the individual task learning. It…

Multi-task learning (MTL) aims to improve the generalization performance (of the resulting classifiers) by learning multiple related tasks simultaneously. Specifically, MTL exploits the intrinsic task relatedness, based on which the informative domain knowledge from each task can be shared across multiple tasks and thus facilitate the individual task learning. It is particularly desirable to share the domain knowledge (among the tasks) when there are a number of related tasks but only limited training data is available for each task. Modeling the relationship of multiple tasks is critical to the generalization performance of the MTL algorithms. In this dissertation, I propose a series of MTL approaches which assume that multiple tasks are intrinsically related via a shared low-dimensional feature space. The proposed MTL approaches are developed to deal with different scenarios and settings; they are respectively formulated as mathematical optimization problems of minimizing the empirical loss regularized by different structures. For all proposed MTL formulations, I develop the associated optimization algorithms to find their globally optimal solution efficiently. I also conduct theoretical analysis for certain MTL approaches by deriving the globally optimal solution recovery condition and the performance bound. To demonstrate the practical performance, I apply the proposed MTL approaches on different real-world applications: (1) Automated annotation of the Drosophila gene expression pattern images; (2) Categorization of the Yahoo web pages. Our experimental results demonstrate the efficiency and effectiveness of the proposed algorithms.

ContributorsChen, Jianhui (Author) / Ye, Jieping (Thesis advisor) / Kumar, Sudhir (Committee member) / Liu, Huan (Committee member) / Xue, Guoliang (Committee member) / Arizona State University (Publisher)

Created2011

Characterizing retinotopic mapping using conformal geometry and Beltrami coefficient: a preliminary study

Description

Functional magnetic resonance imaging (fMRI) has been widely used to measure the retinotopic organization of early visual cortex in the human brain. Previous studies have identified multiple visual field maps (VFMs) based on statistical analysis of fMRI signals, but the resulting geometry has not been fully characterized with mathematical models.…

Functional magnetic resonance imaging (fMRI) has been widely used to measure the retinotopic organization of early visual cortex in the human brain. Previous studies have identified multiple visual field maps (VFMs) based on statistical analysis of fMRI signals, but the resulting geometry has not been fully characterized with mathematical models. This thesis explores using concepts from computational conformal geometry to create a custom software framework for examining and generating quantitative mathematical models for characterizing the geometry of early visual areas in the human brain. The software framework includes a graphical user interface built on top of a selected core conformal flattening algorithm and various software tools compiled specifically for processing and examining retinotopic data. Three conformal flattening algorithms were implemented and evaluated for speed and how well they preserve the conformal metric. All three algorithms performed well in preserving the conformal metric but the speed and stability of the algorithms varied. The software framework performed correctly on actual retinotopic data collected using the standard travelling-wave experiment. Preliminary analysis of the Beltrami coefficient for the early data set shows that selected regions of V1 that contain reasonably smooth eccentricity and polar angle gradients do show significant local conformality, warranting further investigation of this approach for analysis of early and higher visual cortex.

ContributorsTa, Duyan (Author) / Wang, Yalin (Thesis advisor) / Maciejewski, Ross (Committee member) / Wonka, Peter (Committee member) / Arizona State University (Publisher)

Created2013

Combining thickness information with surface tensor-based morphometry for the 3D statistical analysis of the corpus callosum

Description

In blindness research, the corpus callosum (CC) is the most frequently studied sub-cortical structure, due to its important involvement in visual processing. While most callosal analyses from brain structural magnetic resonance images (MRI) are limited to the 2D mid-sagittal slice, we propose a novel framework to capture a complete set…

In blindness research, the corpus callosum (CC) is the most frequently studied sub-cortical structure, due to its important involvement in visual processing. While most callosal analyses from brain structural magnetic resonance images (MRI) are limited to the 2D mid-sagittal slice, we propose a novel framework to capture a complete set of 3D morphological differences in the corpus callosum between two groups of subjects. The CCs are segmented from whole brain T1-weighted MRI and modeled as 3D tetrahedral meshes. The callosal surface is divided into superior and inferior patches on which we compute a volumetric harmonic field by solving the Laplace's equation with Dirichlet boundary conditions. We adopt a refined tetrahedral mesh to compute the Laplacian operator, so our computation can achieve sub-voxel accuracy. Thickness is estimated by tracing the streamlines in the harmonic field. We combine areal changes found using surface tensor-based morphometry and thickness information into a vector at each vertex to be used as a metric for the statistical analysis. Group differences are assessed on this combined measure through Hotelling's T2 test. The method is applied to statistically compare three groups consisting of: congenitally blind (CB), late blind (LB; onset > 8 years old) and sighted (SC) subjects. Our results reveal significant differences in several regions of the CC between both blind groups and the sighted groups; and to a lesser extent between the LB and CB groups. These results demonstrate the crucial role of visual deprivation during the developmental period in reshaping the structural architecture of the CC.

ContributorsXu, Liang (Author) / Wang, Yalin (Thesis advisor) / Maciejewski, Ross (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created2013

Structured sparse learning and its applications to biomedical and biological data

Description

Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, medical image processing, etc. Via the penalization of l-1 norm based regularization, the structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups…

Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, medical image processing, etc. Via the penalization of l-1 norm based regularization, the structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups or graphs. In this thesis, I first propose to solve a sparse learning model with a general group structure, where the predefined groups may overlap with each other. Then, I present three real world applications which can benefit from the group structured sparse learning technique. In the first application, I study the Alzheimer's Disease diagnosis problem using multi-modality neuroimaging data. In this dataset, not every subject has all data sources available, exhibiting an unique and challenging block-wise missing pattern. In the second application, I study the automatic annotation and retrieval of fruit-fly gene expression pattern images. Combined with the spatial information, sparse learning techniques can be used to construct effective representation of the expression images. In the third application, I present a new computational approach to annotate developmental stage for Drosophila embryos in the gene expression images. In addition, it provides a stage score that enables one to more finely annotate each embryo so that they are divided into early and late periods of development within standard stage demarcations. Stage scores help us to illuminate global gene activities and changes much better, and more refined stage annotations improve our ability to better interpret results when expression pattern matches are discovered between genes.

ContributorsYuan, Lei (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Committee member) / Xue, Guoliang (Committee member) / Kumar, Sudhir (Committee member) / Arizona State University (Publisher)

Created2013

Small molecule detection by surface plasmon resonance: improvements in sensitivity and kinetic measurement

Description

Surface plasmon resonance (SPR) has emerged as a popular technique for elucidating subtle signals from biological events in a label-free, high throughput environment. The efficacy of conventional SPR sensors, whose signals are mass-sensitive, diminishes rapidly with the size of the observed target molecules. The following work advances the current SPR…

Surface plasmon resonance (SPR) has emerged as a popular technique for elucidating subtle signals from biological events in a label-free, high throughput environment. The efficacy of conventional SPR sensors, whose signals are mass-sensitive, diminishes rapidly with the size of the observed target molecules. The following work advances the current SPR sensor paradigm for the purpose of small molecule detection. The detection limits of two orthogonal components of SPR measurement are targeted: speed and sensitivity. In the context of this report, speed refers to the dynamic range of measured kinetic rate constants, while sensitivity refers to the target molecule mass limitation of conventional SPR measurement. A simple device for high-speed microfluidic delivery of liquid samples to a sensor surface is presented to address the temporal limitations of conventional SPR measurement. The time scale of buffer/sample switching is on the order of milliseconds, thereby minimizing the opportunity for sample plug dispersion. The high rates of mass transport to and from the central microfluidic sensing region allow for SPR-based kinetic analysis of binding events with dissociation rate constants (kd) up to 130 s-1. The required sample volume is only 1 μL, allowing for minimal sample consumption during high-speed kinetic binding measurement. Charge-based detection of small molecules is demonstrated by plasmonic-based electrochemical impedance microscopy (P-EIM). The dependence of surface plasmon resonance (SPR) on surface charge density is used to detect small molecules (60-120 Da) printed on a dextran-modified sensor surface. The SPR response to an applied ac potential is a function of the surface charge density. This optical signal is comprised of a dc and an ac component, and is measured with high spatial resolution. The amplitude and phase of local surface impedance is provided by the ac component. The phase signal of the small molecules is a function of their charge status, which is manipulated by the pH of a solution. This technique is used to detect and distinguish small molecules based on their charge status, thereby circumventing the mass limitation (~100 Da) of conventional SPR measurement.

ContributorsMacGriff, Christopher Assiff (Author) / Tao, Nongjian (Thesis advisor) / Wang, Shaopeng (Committee member) / LaBaer, Joshua (Committee member) / Chae, Junseok (Committee member) / Arizona State University (Publisher)

Created2013

Multimodal data fusion as a predictior of missing information in social networks

Description

Over 2 billion people are using online social network services, such as Facebook, Twitter, Google+, LinkedIn, and Pinterest. Users update their status, post their photos, share their information, and chat with others in these social network sites every day; however, not everyone shares the same amount of information. This thesis…

Over 2 billion people are using online social network services, such as Facebook, Twitter, Google+, LinkedIn, and Pinterest. Users update their status, post their photos, share their information, and chat with others in these social network sites every day; however, not everyone shares the same amount of information. This thesis explores methods of linking publicly available data sources as a means of extrapolating missing information of Facebook. An application named "Visual Friends Income Map" has been created on Facebook to collect social network data and explore geodemographic properties to link publicly available data, such as the US census data. Multiple predictors are implemented to link data sets and extrapolate missing information from Facebook with accurate predictions. The location based predictor matches Facebook users' locations with census data at the city level for income and demographic predictions. Age and relationship based predictors are created to improve the accuracy of the proposed location based predictor utilizing social network link information. In the case where a user does not share any location information on their Facebook profile, a kernel density estimation location predictor is created. This predictor utilizes publicly available telephone record information of all people with the same surname of this user in the US to create a likelihood distribution of the user's location. This is combined with the user's IP level information in order to narrow the probability estimation down to a local regional constraint.

ContributorsMao, Jingxian (Author) / Maciejewski, Ross (Thesis advisor) / Farin, Gerald (Committee member) / Wang, Yalin (Committee member) / Arizona State University (Publisher)

Created2012

HIV evolution: biogeography and intra-individual dynamics

Description

The entire history of HIV-1 is hidden in its ten thousand bases, where information regarding its evolutionary traversal through the human population can only be unlocked with fine-scale sequence analysis. Measurable footprints of mutation and recombination have imparted upon us a wealth of knowledge, from multiple chimpanzee-to-human transmissions to patterns…

The entire history of HIV-1 is hidden in its ten thousand bases, where information regarding its evolutionary traversal through the human population can only be unlocked with fine-scale sequence analysis. Measurable footprints of mutation and recombination have imparted upon us a wealth of knowledge, from multiple chimpanzee-to-human transmissions to patterns of neutralizing antibody and drug resistance. Extracting maximum understanding from such diverse data can only be accomplished by analyzing the viral population from many angles. This body of work explores two primary aspects of HIV sequence evolution, point mutation and recombination, through cross-sectional (inter-individual) and longitudinal (intra-individual) investigations, respectively. Cross-sectional Analysis: The role of Haiti in the subtype B pandemic has been hotly debated for years; while there have been many studies, up to this point, no one has incorporated the well-known mechanism of retroviral recombination into their biological model. Prior to the use of recombination detection, multiple analyses produced trees where subtype B appears to have first entered Haiti, followed by a jump into the rest of the world. The results presented here contest the Haiti-first theory of the pandemic and instead suggest simultaneous entries of subtype B into Haiti and the rest of the world. Longitudinal Analysis: Potential N-linked glycosylation sites (PNGS) are the most evolutionarily dynamic component of one of the most evolutionarily dynamic proteins known to date. While the number of mutations associated with the increase or decrease of PNGS frequency over time is high, there are a set of relatively stable sites that persist within and between longitudinally sampled individuals. Here, I identify the most conserved stable PNGSs and suggest their potential roles in host-virus interplay. In addition, I have identified, for the first time, what may be a gp-120-based environmental preference for N-linked glycosylation sites.

ContributorsHepp, Crystal Marie, 1981- (Author) / Rosenberg, Michael S. (Thesis advisor) / Hedrick, Philip (Committee member) / Escalante, Ananias (Committee member) / Kumar, Sudhir (Committee member) / Arizona State University (Publisher)

Created2013

Towards a systems biology understanding of metabolic syndrome

Description

This dissertation investigates the condition of skeletal muscle insulin resistance using bioinformatics and computational biology approaches. Drawing from several studies and numerous data sources, I have attempted to uncover molecular mechanisms at multiple levels. From the detailed atomistic simulations of a single protein, to datamining approaches applied at the systems…

This dissertation investigates the condition of skeletal muscle insulin resistance using bioinformatics and computational biology approaches. Drawing from several studies and numerous data sources, I have attempted to uncover molecular mechanisms at multiple levels. From the detailed atomistic simulations of a single protein, to datamining approaches applied at the systems biology level, I provide new targets to explore for the research community. Furthermore I present a new online web resource that unifies various bioinformatics databases to enable discovery of relevant features in 3D protein structures.

ContributorsMielke, Clinton (Author) / Mandarino, Lawrence (Committee member) / LaBaer, Joshua (Committee member) / Magee, D. Mitchell (Committee member) / Dinu, Valentin (Committee member) / Willis, Wayne (Committee member) / Arizona State University (Publisher)

Created2013

A fast fluid simulator using smoothed-particle hydrodynamics

Description

This document presents a new implementation of the Smoothed Particles Hydrodynamics algorithm using DirectX 11 and DirectCompute. The main goal of this document is to present to the reader an alternative solution to the largely studied and researched problem of fluid simulation. Most other solutions have been implemented using the…

This document presents a new implementation of the Smoothed Particles Hydrodynamics algorithm using DirectX 11 and DirectCompute. The main goal of this document is to present to the reader an alternative solution to the largely studied and researched problem of fluid simulation. Most other solutions have been implemented using the NVIDIA CUDA framework; however, the proposed solution in this document uses the Microsoft general-purpose computing on graphics processing units API. The implementation allows for the simulation of a large number of particles in a real-time scenario. The solution presented here uses the Smoothed Particles Hydrodynamics algorithm to calculate the forces within the fluid; this algorithm provides a Lagrangian approach for discretizes the Navier-Stockes equations into a set of particles. Our solution uses the DirectCompute compute shaders to evaluate each particle using the multithreading and multi-core capabilities of the GPU increasing the overall performance. The solution then describes a method for extracting the fluid surface using the Marching Cubes method and the programmable interfaces exposed by the DirectX pipeline. Particularly, this document presents a method for using the Geometry Shader Stage to generate the triangle mesh as defined by the Marching Cubes method. The implementation results show the ability to simulate over 64K particles at a rate of 900 and 400 frames per second, not including the surface reconstruction steps and including the Marching Cubes steps respectively.

ContributorsFigueroa, Gustavo (Author) / Farin, Gerald (Thesis advisor) / Maciejewski, Ross (Committee member) / Wang, Yalin (Committee member) / Arizona State University (Publisher)

Created2012