This collection includes both ASU Theses and Dissertations, submitted by graduate students, and Barrett, The Honors College theses, submitted by undergraduate students.

Description
With the rapid development of mobile sensing technologies like GPS, RFID, sensors in smartphones, etc., capturing position data in the form of trajectories has become easy. Moving object trajectory analysis is a growing area of interest these days owing to its applications in various domains such as marketing, security, traffic monitoring and management, etc. To better understand movement behaviors from the raw mobility data, this doctoral work provides analytic models for analyzing trajectory data. As a first contribution, a model is developed to detect changes in trajectories with time. If the taxis moving in a city are viewed as sensors that provide real-time information on the traffic in the city, a change in these trajectories with time can reveal that the road network has changed. To detect changes, trajectories are modeled with a Hidden Markov Model (HMM). A modified training algorithm for parameter estimation in HMMs, called m-BaumWelch, is used to develop likelihood estimates under assumed changes and to detect changes in trajectory data with time. Data from vehicles are used to test the method for change detection. Secondly, sequential pattern mining is used to develop a model to detect changes in frequent patterns occurring in trajectory data. The aim is to answer two questions: Are the frequent patterns still frequent in the new data? If they are frequent, has the time interval distribution in the pattern changed? Two approaches are considered for change detection: a frequency-based approach and a distribution-based approach. The methods are illustrated with vehicle trajectory data. Finally, a model is developed for clustering and outlier detection in semantic trajectories. A challenge with clustering semantic trajectories is that both numeric and categorical attributes are present. Another problem to be addressed while clustering is that trajectories can be of different lengths and can also have missing values. A tree-based ensemble is used to address these problems. The approach is extended to outlier detection in semantic trajectories.
Contributors: Kondaveeti, Anirudh (Author) / Runger, George C. (Thesis advisor) / Mirchandani, Pitu (Committee member) / Pan, Rong (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)
Created: 2012
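The likelihood-based change detection idea in the abstract above can be illustrated with a minimal sketch. It is not the m-BaumWelch algorithm from the dissertation: it only uses the standard scaled forward algorithm to score a trajectory under two fixed discrete HMMs. The function name, the two toy models, and the symbol encoding (integers standing for road segments) are all hypothetical.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a symbol sequence under a discrete HMM.

    Uses the scaled forward algorithm; emission probabilities are assumed
    to be strictly positive so that every scaling factor is non-zero.
    """
    n_states = len(start)
    # Initial forward variables for the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        # Rescale to avoid underflow and accumulate the log of the scale.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical toy models: two hidden states, three observable road segments.
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
baseline_emit = [[0.80, 0.15, 0.05], [0.15, 0.80, 0.05]]  # segment 2 rarely used
changed_emit = [[0.10, 0.10, 0.80], [0.10, 0.80, 0.10]]   # segment 2 now common

new_trajectory = [2, 2, 1, 2, 2]
baseline_ll = forward_log_likelihood(new_trajectory, start, trans, baseline_emit)
changed_ll = forward_log_likelihood(new_trajectory, start, trans, changed_emit)

# A large positive log-likelihood ratio suggests the new data are better
# explained by the "changed" model than by the baseline model.
print("log-likelihood ratio:", changed_ll - baseline_ll)
```

In this sketch the decision threshold is left open; how to calibrate it is exactly the kind of question the dissertation's likelihood estimates under assumed changes are meant to address.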
Description
With the increase in computing power and availability of data, there has never been a greater need to understand data and make decisions from it. Traditional statistical techniques may not be adequate to handle the size of today's data or the complexities of the information hidden within the data. Thus knowledge discovery by machine learning techniques is necessary if we want to better understand information from data. In this dissertation, we explore the topics of asymmetric loss and asymmetric data in machine learning and propose new algorithms as solutions to some of the problems in these topics. We also study variable selection for matched data sets and propose a solution when there is non-linearity in the matched data. The research is divided into three parts. The first part addresses the problem of asymmetric loss. A proposed asymmetric support vector machine (aSVM) is used to predict specific classes with high accuracy. aSVM was shown to produce higher precision than a regular SVM. The second part addresses asymmetric data sets where variables are only predictive for a subset of the predictor classes. Asymmetric Random Forest (ARF) was proposed to detect these kinds of variables. The third part explores variable selection for matched data sets. Matched Random Forest (MRF) was proposed to find variables that are able to distinguish case and control without the restrictions that exist in linear models. MRF detects variables that are able to distinguish case and control even in the presence of interaction and qualitative variables.
Contributors: Koh, Derek (Author) / Runger, George C. (Thesis advisor) / Wu, Tong (Committee member) / Pan, Rong (Committee member) / Cesta, John (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Membrane proteins are a vital part of cellular structure. They are directly involved in many important cellular functions, such as uptake, signaling, respiration, and photosynthesis, among others. Despite their importance, however, fewer than 500 unique membrane protein structures have been determined to date. This is due to several difficulties with macromolecular crystallography, primarily the difficulty of growing large, well-ordered protein crystals. Since the first proof of concept for femtosecond nanocrystallography showed that diffraction patterns can be collected on extremely small crystals, thus negating the need to grow larger crystals, there have been many exciting advancements in the field. The technique has been shown to provide high spatial resolution, thus making it a viable method for structural biology. However, due to the ultrafast nature of the technique, which allows imaging without radiation damage, even more interesting experiments are possible, and the first temporal and spatial images of an undamaged structure could be acquired. This concept was denoted as time-resolved femtosecond nanocrystallography.

This dissertation presents the first time-resolved data set of Photosystem II where structural changes can actually be seen without radiation damage. In order to accomplish this, new crystallization techniques had to be developed so that enough crystals could be made for the liquid jet to deliver a fully hydrated stream of crystals to the high-powered X-ray source. These changes are still in the preliminary stages due to the slightly lower resolution data obtained, but they are still a promising show of the power of this new technique. With further optimization of crystal growth methods and quality, injection technique, and continued development of data analysis software, it is only a matter of time before the ability to make movies of molecules in motion from X-ray diffraction snapshots in time exists. The work presented here is the first step in that process.
Contributors: Kupitz, Christopher (Author) / Fromme, Petra (Thesis advisor) / Spence, John C. (Thesis advisor) / Redding, Kevin (Committee member) / Ros, Alexandra (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
The utilization of solar energy requires an efficient means of its storage as fuel. In bio-inspired artificial photosynthesis, light energy can be used to drive water oxidation, but catalysts that produce molecular oxygen from water are required. This dissertation demonstrates a novel complex utilizing earth-abundant Ni in combination with glycine as an efficient catalyst with a modest overpotential of 0.475 ± 0.005 V for a current density of 1 mA/cm² at pH 11. The production of molecular oxygen at a high potential was verified by measurement of the change in oxygen concentration, yielding a Faradaic efficiency of 60 ± 5%. This Ni species can achieve a current density of 4 mA/cm² that persists for at least 10 hours. Based upon the observed pH dependence of the current amplitude and oxidation/reduction peaks, the catalysis is an electron-proton coupled process. In addition, to investigate the binding of divalent metals to proteins, four peptides were designed and synthesized with carboxylate and histidine ligands. The binding of the metals was characterized by monitoring the metal-induced changes in circular dichroism spectra. Cyclic voltammetry demonstrated that bound copper underwent a Cu(I)/Cu(II) oxidation/reduction change at a potential of approximately 0.32 V in a quasi-reversible process. The relative binding affinity of Mn(II), Fe(II), Co(II), Ni(II) and Cu(II) to the peptides is correlated with the stability constants of the Irving-Williams series for divalent metal ions. A potential application of these complexes of transition metals with amino acids or peptides is in the development of artificial photosynthetic cells.
Contributors: Wang, Dong (Author) / Allen, James P. (Thesis advisor) / Ghirlanda, Giovanna (Committee member) / Redding, Kevin (Committee member) / Arizona State University (Publisher)
Created: 2014
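To relate the overpotential quoted in the abstract above to an electrode potential, a worked estimate can be sketched as follows. It assumes the usual Nernstian expression for the O2/H2O couple at 25 °C (1.229 V at pH 0, shifting by about 0.059 V per pH unit) and potentials referenced to NHE. The abstract does not state the temperature or the reference electrode, so the result is illustrative only.

```latex
\begin{align*}
E_{\mathrm{O_2/H_2O}}(\mathrm{pH}) &\approx 1.229\,\mathrm{V} - 0.0592\,\mathrm{V}\times\mathrm{pH} \\
E_{\mathrm{O_2/H_2O}}(11) &\approx 1.229\,\mathrm{V} - 0.651\,\mathrm{V} = 0.578\,\mathrm{V} \\
E_{\mathrm{applied}} &= E_{\mathrm{O_2/H_2O}}(11) + \eta \approx 0.578\,\mathrm{V} + 0.475\,\mathrm{V} = 1.053\,\mathrm{V}
\end{align*}
```

Under these assumptions, the 1 mA/cm² current density would correspond to roughly 1.05 V vs. NHE at pH 11.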
Description
Cyanovirin-N (CVN) is a cyanobacterial lectin with potent anti-HIV activity, mediated by binding to the N-linked oligosaccharide moiety of the envelope protein gp120. CVN offers a scaffold to develop multivalent carbohydrate-binding proteins with tunable specificities and affinities. I present here biophysical calculations completed on a monomeric-stabilized mutant of cyanovirin-N, P51G-m4-CVN, in which domain A binding activity is abolished by four mutations, with comparisons made to CVNmutDB, in which domain B binding activity is abolished. Using Monte Carlo calculations and docking simulations, mutations in CVNmutDB were considered individually, and the mutations E41A/G and T57A were found to have the greatest impact on the affinity towards dimannose. 15N-labeled proteins were titrated with Manα(1-2)Manα, while following chemical shift perturbations in NMR spectra. The mutants E41A/G and T57A had a larger Kd than P51G-m4-CVN, matching the trends predicted by the calculations. We also observed that the N42A mutation affects the local fold of the binding pocket, thus removing all binding to dimannose. Characterization of the mutant N53S showed similar binding affinity to P51G-m4-CVN. Using biophysical calculations allows us to study future iterations of models to explore affinities and specificities. In order to further elucidate the role of multivalency, I report here a designed covalent dimer of CVN, Nested cyanovirin-N (Nested CVN), which has four binding sites. Nested CVN was found to have binding affinity to gp120 and antiviral activity comparable to those of wt CVN. These results demonstrate the ability to create a multivalent, covalent dimer that performs comparably to wt CVN.

WW domains are small modules consisting of 32-40 amino acids that recognize proline-rich peptides and are found in many signaling pathways. We use WW domain sequences to explore protein folding by simulations using the Zipping and Assembly Method. We identified five crucial contacts that enabled us to predict the folding of WW domain sequences. We then designed a folded WW domain peptide from an unfolded WW domain sequence by introducing native contacts at those critical positions.
Contributors: Woodrum, Brian William (Author) / Ghirlanda, Giovanna (Thesis advisor) / Redding, Kevin (Committee member) / Wang, Xu (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
A vast amount of energy emanates from the sun, and at the distance of Earth, approximately 172,500 TW reaches the atmosphere. Of that, 80,600 TW reaches the surface with 15,600 TW falling on land. Photosynthesis converts 156 TW in the form of biomass, which represents all food/fuel for the biosphere, with about 20 TW of the total product used by humans. Additionally, our society uses approximately 20 more TW of energy from ancient photosynthetic products, i.e., fossil fuels. In order to mitigate climate problems, carbon dioxide must be removed from human energy usage, either by replacement or by recycling it as an energy carrier. Proposals have been made to process biomass into biofuels; this work demonstrates that current efficiencies of natural photosynthesis are inadequate for this purpose, that the replacement of fossil fuels with biofuels is ecologically irresponsible, and that new technologies are required to operate at sufficient efficiencies to utilize artificial solar-to-fuels systems. Herein a hybrid bioderived self-assembling hydrogen-evolving nanoparticle consisting of photosystem I (PSI) and platinum nanoclusters is demonstrated to operate with an overall efficiency of 6%, which exceeds that of land plants by more than an order of magnitude. The system was limited by the rate of electron donation to photooxidized PSI. Further work investigated the interactions of natural donor-acceptor pairs of cytochrome c6 and PSI for the thermophilic cyanobacterium Thermosynechococcus elongatus BP1 and the red alga Galdieria sulphuraria. The cyanobacterial system is typified by collisional control, while the algal system demonstrates a population of prebound PSI-cytochrome c6 complexes with faster electron transfer rates. Combining the stability of cyanobacterial PSI and the kinetics of the algal PSI:cytochrome would result in more efficient solar-to-fuel conversion. A second priority is the replacement of platinum with chemically abundant catalysts. In this work, protein scaffolds are employed using host-guest strategies to increase the stability of proton reduction catalysts and enhance the turnover number without the oxygen sensitivity of hydrogenases. Finally, the design of unnatural electron transfer proteins is explored and may provide a bioorthogonal method of introducing alternative electron transfer pathways in vitro or in vivo in the case of engineered photosynthetic organisms.
Contributors: Vaughn, Michael David (Author) / Moore, Thomas (Thesis advisor) / Fromme, Petra (Thesis advisor) / Ghirlanda, Giovanna (Committee member) / Redding, Kevin (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
No-confounding designs (NC) in 16 runs for 6, 7, and 8 factors are non-regular fractional factorial designs that have been suggested as attractive alternatives to the regular minimum aberration resolution IV designs because they do not completely confound any two-factor interactions with each other. These designs allow for potential estimation of main effects and a few two-factor interactions without the need for follow-up experimentation. The analysis of non-regular designs is an area of ongoing research, because standard variable selection techniques such as stepwise regression may not always be the best approach. The current work investigates the use of the Dantzig selector for analyzing no-confounding designs. Through a series of examples it shows that this technique is very effective for identifying the set of active factors in no-confounding designs when there are three or four active main effects and up to two active two-factor interactions.

To evaluate the performance of the Dantzig selector, a simulation study was conducted and the results, based on the percentage of type II errors, are analyzed. In addition, an alternative to the 6-factor NC design, called the Alternate No-confounding design in six factors, is introduced in this study. The performance of this Alternate NC design in 6 factors is then evaluated using the Dantzig selector as the analysis method. Lastly, a section is dedicated to comparing the performance of the NC-6 and Alternate NC-6 designs.
Contributors: Krishnamoorthy, Archana (Author) / Montgomery, Douglas C. (Thesis advisor) / Borror, Connie (Thesis advisor) / Pan, Rong (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
Technological advances have enabled the generation and collection of various data from complex systems, thus creating ample opportunity to integrate knowledge in many decision-making applications. This dissertation introduces holistic learning as the integration of a comprehensive set of relationships that are used towards the learning objective. The holistic view of the problem allows for richer learning from data and, thereby, improves decision making.

The first topic of this dissertation is the prediction of several target attributes using a common set of predictor attributes. In a holistic learning approach, the relationships between target attributes are embedded into the learning algorithm created in this dissertation. Specifically, a novel tree-based ensemble that leverages the relationships between target attributes towards constructing a diverse, yet strong, model is proposed. The method is justified through its connection to existing methods and experimental evaluations on synthetic and real data.

The second topic pertains to monitoring complex systems that are modeled as networks. Such systems present a rich set of attributes and relationships for which holistic learning is important. In social networks, for example, in addition to friendship ties, various attributes concerning the users' gender, age, topic of messages, time of messages, etc. are collected. A restricted form of monitoring fails to take the relationships of multiple attributes into account, whereas the holistic view embeds such relationships in the monitoring methods. The focus is on the difficult task of detecting a change that might only impact a small subset of the network and only occur in a sub-region of the high-dimensional space of the network attributes. One contribution is a monitoring algorithm based on a network statistical model. Another contribution is a transactional model that transforms the task into an expedient structure for machine learning, along with a generalizable algorithm to monitor the attributed network. A learning step in this algorithm adapts to changes that may only be local to sub-regions (with a broader potential for other learning tasks). Diagnostic tools to interpret the change are provided. This robust, generalizable, holistic monitoring method is demonstrated on synthetic and real networks.
Contributors: Azarnoush, Bahareh (Author) / Runger, George C. (Thesis advisor) / Bekki, Jennifer (Thesis advisor) / Pan, Rong (Committee member) / Saghafian, Soroush (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
In this era of fast computational machines and new optimization algorithms, there have been great advances in experimental designs. We focus our research on design issues in generalized linear models (GLMs) and functional magnetic resonance imaging (fMRI). The first part of our research tackles the challenging problem of constructing exact designs for GLMs that are robust against parameter, link, and model uncertainties, by improving an existing algorithm and providing a new one based on continuous particle swarm optimization (PSO) and spectral clustering. The proposed algorithm is sufficiently versatile to accommodate most popular design selection criteria, and we concentrate on providing robust designs for GLMs using the D- and A-optimality criteria. The second part of our research provides an algorithm that is a faster alternative to a recently proposed genetic algorithm (GA) for constructing optimal designs for fMRI studies. Our algorithm is built upon a discrete version of the PSO.
Contributors: Temkit, M'Hamed (Author) / Kao, Jason (Thesis advisor) / Reiser, Mark R. (Committee member) / Barber, Jarrett (Committee member) / Montgomery, Douglas C. (Committee member) / Pan, Rong (Committee member) / Arizona State University (Publisher)
Created: 2014
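For readers unfamiliar with the continuous PSO mentioned in the abstract above, the following is a minimal generic sketch. It is not the dissertation's algorithm: there is no spectral clustering, no GLM design criterion, and no discrete variant. The function name `pso_minimize`, its parameters, and the toy `sphere` objective (standing in for a design criterion to be minimized) are hypothetical.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [objective(p) for p in pos]    # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best so far
    history = [g_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move and clamp to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
        history.append(g_val)
    return g_pos, g_val, history


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_f, trace = pso_minimize(sphere, [(-5.0, 5.0)] * 2, n_iters=50)
print("best point:", best_x, "best value:", best_f)
```

In a design-construction setting, the position vector would encode the design points (and possibly weights) and the objective would be a criterion such as a negative log-determinant; those pieces are deliberately left out of this sketch.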
Description
The transmembrane subunit (gp41) of the envelope glycoprotein of HIV-1 associates noncovalently with the surface subunit (gp120) and together they play essential roles in viral mucosal transmission and infection of target cells. The membrane proximal region (MPR, residues 649-683) of gp41 is highly conserved and contains epitopes of broadly neutralizing antibodies. The transmembrane (TM) domain (residues 684-705) of gp41 not only anchors the envelope glycoprotein complex in the viral membrane but also dynamically affects the interactions of the MPR with the membrane. While high-resolution X-ray structures of some segments of the MPR were solved in the past, they represent the pre-fusion and post-fusion conformations, most of which could not react with the broadly neutralizing antibodies 2F5 and 4E10. Structural information on the TM domain of gp41 is scant and at low resolution.

This thesis describes the structural studies of MPR-TM (residues 649-705) of HIV-1 gp41 by X-ray crystallography. MPR-TM was fused with different fusion proteins to improve the membrane protein overexpression. The expression level of MPR-TM was improved by fusion to the C-terminus of the Mistic protein, yielding ∼1 mg of pure MPR-TM protein per liter of cell culture. The fusion partner Mistic was removed for final crystallization. The isolated MPR-TM protein was biophysically characterized and is a monodisperse candidate for crystallization. However, no diffraction-quality crystals were obtained even after extensive crystallization screens. A novel construct was designed to overexpress MPR-TM as a maltose binding protein (MBP) fusion. About 60 mg of MBP/MPR-TM recombinant protein was obtained from 1 liter of cell culture. Crystals of MBP/MPR-TM recombinant protein could not be obtained when MBP and MPR-TM were separated by a 42 amino acid (aa)-long linker but were obtained after changing the linker to three alanine residues. The crystals diffracted to 2.5 Å after crystallization optimization. Further analysis of the diffraction data indicated that the crystals are twinned. The final structure demonstrated that MBP crystallized as a dimer of trimers, but the electron density did not extend beyond the linker region. We determined by SDS-PAGE and MALDI-TOF MS that the crystals contained MBP only. The MPR-TM of gp41 might be cleaved during or after the process of crystallization. Comparison of the MBP trimer reported here with published trimeric MBP fusion structures indicated that MBP might form such a trimeric conformation under the effect of MPR-TM.
Contributors: Gong, Zhen (Author) / Fromme, Petra (Thesis advisor) / Mor, Tsafrir (Thesis advisor) / Ros, Alexandra (Committee member) / Redding, Kevin (Committee member) / Arizona State University (Publisher)
Created: 2014