Matching Items (12)

Remote Sensing and Modeling of Stressed Aquifer Systems and the Associated Hazards

Description

Aquifers host the largest accessible freshwater resource in the world. However, groundwater reserves are declining in many places. Often coincident with drought, high extraction rates and inadequate replenishment result in groundwater overdraft and permanent land subsidence. Land subsidence is the cause of aquifer storage capacity reduction, altered topographic gradients which can exacerbate floods, and differential displacement that can lead to earth fissures and infrastructure damage. Improving understanding of the sources and mechanisms driving aquifer deformation is important for resource management planning and hazard mitigation.

Poroelastic theory describes the coupling of differential stress, strain, and pore pressure, which are modulated by material properties. To model these relationships, displacement time series are estimated via satellite interferometry and hydraulic head levels from observation wells provide an in-situ dataset. In combination, the deconstruction and isolation of selected time-frequency components allow for estimating aquifer parameters, including the elastic and inelastic storage coefficients, compaction time constants, and vertical hydraulic conductivity. Together these parameters describe the storage response of an aquifer system to changes in hydraulic head and surface elevation. Understanding aquifer parameters is useful for the ongoing management of groundwater resources.
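The link between head change and surface displacement can be illustrated with a minimal sketch. It assumes a purely elastic, instantaneous response, so that the elastic storage coefficient is approximated by the slope of vertical displacement against hydraulic head change. That simplification, the function name, and the numbers are illustrative assumptions and are not taken from the dissertation.

```python
def least_squares_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Synthetic (hypothetical) seasonal signals, both in meters:
# head change at an observation well and vertical surface displacement.
head_change_m = [0.0, -2.0, -4.0, -2.0, 0.0, 2.0]
displacement_m = [0.0, -0.004, -0.008, -0.004, 0.0, 0.004]

# Under the elastic assumption, the slope is a dimensionless storage coefficient.
elastic_storage = least_squares_slope(head_change_m, displacement_m)
print(elastic_storage)
```

Inelastic compaction and delayed drainage would break this simple proportionality, which is why the abstract also lists inelastic coefficients and compaction time constants.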

Case studies in Phoenix and Tucson, Arizona, focus on land subsidence from groundwater withdrawal as well as distinct responses to artificial recharge efforts. In Christchurch, New Zealand, possible changes to aquifer properties due to earthquakes are investigated. In Houston, Texas, flood severity during Hurricane Harvey is linked to subsidence, which modifies base flood elevations and topographic gradients.

Date Created
  • 2018

Modeling relationships between cycles in psychology: potential limitations of sinusoidal and mass-spring models

Description

With improvements in technology, intensive longitudinal studies that permit the investigation of daily and weekly cycles in behavior have increased exponentially over the past few decades. Traditionally, when data have been collected on two variables over time, multivariate time series approaches that remove trends, cycles, and serial dependency have been used. These analyses permit the study of the relationship between random shocks (perturbations) in the presumed causal series and changes in the outcome series, but do not permit the study of the relationships between cycles. Liu and West (2016) proposed a multilevel approach that permitted the study of potential between-subject relationships between features of the cycles in two series (e.g., amplitude). However, I show that the application of the Liu and West approach is restricted to a small set of features and types of relationships between the series. Several authors (e.g., Boker & Graham, 1998) proposed a connected mass-spring model that appears to permit modeling of more general cyclic relationships. I show that the undamped connected mass-spring model is also limited and may be unidentified. To test the severity of the restrictions on the motion trajectories producible by the undamped connected mass-spring model, I mathematically derived their connection to the force equations of the undamped connected mass-spring system. The mathematical solution describes the domain of the trajectory pairs that are producible by the undamped connected mass-spring model. The set of producible trajectory pairs is highly restricted, and this restriction sets major limitations on the application of the connected mass-spring model to psychological data. I used a simulation to demonstrate that even if a pair of psychological time-varying variables behaved exactly like two masses in an undamped connected mass-spring system, the connected mass-spring model would not yield adequate parameter estimates. 
My simulation probed the performance of the connected mass-spring model as a function of several aspects of data quality including number of subjects, series length, sampling rate relative to the cycle, and measurement error in the data. The findings can be extended to damped and nonlinear connected mass-spring systems.
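To make the undamped connected mass-spring system concrete, here is a minimal simulation sketch. It assumes the textbook arrangement of two equal masses, each anchored by a spring of stiffness `k` and joined by a coupling spring of stiffness `kc`. This parameterization, the function names, and the velocity Verlet integrator are assumptions for illustration, not the specification used in the dissertation or by Boker and Graham.

```python
def simulate(x1, x2, v1=0.0, v2=0.0, m=1.0, k=1.0, kc=0.5, dt=0.01, steps=2000):
    """Integrate two coupled undamped oscillators with velocity Verlet."""

    def accel(p1, p2):
        # Force equations: anchor spring plus coupling spring on each mass.
        return (-k * p1 - kc * (p1 - p2)) / m, (-k * p2 - kc * (p2 - p1)) / m

    trajectory = [(x1, x2)]
    a1, a2 = accel(x1, x2)
    for _ in range(steps):
        x1 += v1 * dt + 0.5 * a1 * dt * dt
        x2 += v2 * dt + 0.5 * a2 * dt * dt
        n1, n2 = accel(x1, x2)
        v1 += 0.5 * (a1 + n1) * dt
        v2 += 0.5 * (a2 + n2) * dt
        a1, a2 = n1, n2
        trajectory.append((x1, x2))
    return trajectory, (v1, v2)


def energy(x1, x2, v1, v2, m=1.0, k=1.0, kc=0.5):
    """Total mechanical energy, which an undamped system conserves."""
    kinetic = 0.5 * m * (v1 * v1 + v2 * v2)
    potential = 0.5 * k * (x1 * x1 + x2 * x2) + 0.5 * kc * (x1 - x2) ** 2
    return kinetic + potential


# Displace only the first mass; the second is driven through the coupling spring.
trajectory, final_velocity = simulate(1.0, 0.0)
```

In this assumed system every trajectory pair is a superposition of an in-phase and an anti-phase mode, which gives some intuition for why the set of producible trajectory pairs is narrow.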

Date Created
  • 2019

Multi-variate time series similarity measures and their robustness against temporal asynchrony

Description

The amount of time series data generated is increasing due to the integration of sensor technologies with everyday applications, such as gesture recognition, energy optimization, health care, and video surveillance. The use of multiple sensors simultaneously for capturing different aspects of real-world attributes has also led to an increase in dimensionality from uni-variate to multi-variate time series. This has facilitated richer data representation but has also necessitated algorithms that determine the similarity between two multi-variate time series for search and analysis.

Various algorithms have been extended from the uni-variate to the multi-variate case, such as multi-variate versions of Euclidean distance, edit distance, and dynamic time warping. However, it has not been studied how these algorithms account for asynchrony in time series. Human gestures, for example, exhibit asynchrony in their patterns, as different subjects perform the same gesture with varying movements at different speeds. In this thesis, we propose several algorithms (some of which also leverage metadata describing the relationships among the variates). In particular, we present several techniques that leverage the contextual relationships among the variates when measuring multi-variate time series similarities. Based on the way correlation is leveraged, various weighting mechanisms have been proposed that determine the importance of a dimension for discriminating between the time series, as giving the same weight to each dimension can lead to misclassification. We next study the robustness of the considered techniques against different temporal asynchronies, including shifts and stretching.

Exhaustive experiments were carried out on datasets with multiple types and amounts of temporal asynchrony. It has been observed that the accuracy of algorithms that rely on data to discover variate relationships can be low in the presence of temporal asynchrony, whereas algorithms that rely on external metadata tend to be more robust against asynchronous distortions. Specifically, algorithms using external metadata have better classification accuracy and cluster separation than existing state-of-the-art work, such as EROS, PCA, and naive dynamic time warping.
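For readers unfamiliar with the baselines named above, the following is a minimal sketch of naive multi-variate dynamic time warping, contrasted with a lock-step Euclidean comparison. It uses equal weights on every variate, so it is only the unweighted baseline and not the metadata-driven weighting proposed in the thesis; the function names and sample data are illustrative assumptions.

```python
import math


def multivariate_dtw(a, b):
    """DTW distance between two series whose samples are equal-length tuples."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance across all variates at once.
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # repeat a sample of b
                cost[i][j - 1],      # repeat a sample of a
                cost[i - 1][j - 1],  # advance both series
            )
    return cost[n][m]


def lockstep_distance(a, b):
    """Sum of sample-wise distances over the overlapping prefix, with no warping."""
    return sum(math.dist(p, q) for p, q in zip(a, b))


gesture = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
# The same gesture performed at half speed (temporal stretching).
slow_gesture = [point for point in gesture for _ in range(2)]

print(multivariate_dtw(gesture, slow_gesture))   # warping absorbs the stretch
print(lockstep_distance(gesture, slow_gesture))  # lock-step comparison does not
```

The stretched copy is a simple stand-in for the kind of temporal asynchrony the abstract describes.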

Date Created
  • 2015

Modeling time series data for supervised learning

Description

Temporal data are increasingly prevalent and important in analytics. Time series (TS) data are chronological sequences of observations and an important class of temporal data. Fields such as medicine, finance, learning science and multimedia naturally generate TS data. Each series provides a high-dimensional data vector that challenges the learning of the relevant patterns. This dissertation proposes TS representations and methods for supervised TS analysis. The approaches combine new representations that handle translations and dilations of patterns with bag-of-features strategies and tree-based ensemble learning. This provides flexibility in handling time-warped patterns in a computationally efficient way. The ensemble learners provide a classification framework that can handle high-dimensional feature spaces, multiple classes and interaction between features. The proposed representations are useful for classification and interpretation of TS data of varying complexity. The first contribution handles the problem of time warping with a feature-based approach. An interval selection and local feature extraction strategy is proposed to learn a bag-of-features representation. This is distinctly different from common similarity-based time warping and allows additional features (such as pattern location) to be easily integrated into the models. The learners have the capability to account for the temporal information through the recursive partitioning method. The second contribution focuses on the comprehensibility of the models. A new representation is integrated with local feature importance measures from tree-based ensembles to diagnose and interpret time intervals that are important to the model. Multivariate time series (MTS) are especially challenging because the input consists of a collection of TS, and both features within TS and interactions between TS can be important to models. 
Another contribution uses a different representation to produce computationally efficient strategies that learn a symbolic representation for MTS. Relationships between the multiple TS, nominal and missing values are handled with tree-based learners. Applications such as speech recognition, medical diagnosis and gesture recognition are used to illustrate the methods. Experimental results show that the TS representations and methods provide better results than competitive methods on a comprehensive collection of benchmark datasets. Moreover, the proposed approaches naturally provide solutions to similarity analysis, predictive pattern discovery and feature selection.

Date Created
  • 2012

Mathematical Models of Androgen Resistance in Prostate Cancer Patients under Intermittent Androgen Suppression Therapy

Description

Predicting resistant prostate cancer is critical for lowering medical costs and improving the quality of life of advanced prostate cancer patients. I formulate, compare, and analyze two mathematical models that aim to forecast future levels of prostate-specific antigen (PSA). I accomplish these tasks by employing clinical data of locally advanced prostate cancer patients undergoing androgen deprivation therapy (ADT). I demonstrate that the inverse problem of parameter estimation might be too complicated and that simply relying on data fitting can give incorrect conclusions, since there is a large error in the estimated parameter values and parameters might be unidentifiable. I provide confidence intervals for the forecasts using data assimilation via an ensemble Kalman filter. Using the ensemble Kalman filter, I perform dual estimation of parameters and state variables to test the prediction accuracy of the models. Finally, I present a novel model with time delay and a delay-dependent parameter. I provide a geometric stability result to study the behavior of this model and show that the inclusion of time delay may improve the accuracy of predictions. Also, I demonstrate with clinical data that the inclusion of the delay-dependent parameter facilitates the identification and estimation of parameters.

Date Created
  • 2017

Seasonal and tilt angle dependence of soiling loss factor and development of artificial soil deposition chamber replicating natural dew cycle

Description

This is a two-part thesis. Part 1 presents the seasonal and tilt angle dependence of the soiling loss factor of photovoltaic (PV) modules over two years for Mesa, Arizona (a desert climatic condition). Part 2 presents the development of an indoor artificial soil deposition chamber replicating the natural dew cycle. Several environmental factors, including soiling, affect the performance of PV systems. Soiling on PV modules reduces the sunlight reaching the solar cell, thereby reducing the current and power output. Dust particles, air pollution particles, pollen, bird droppings and other industrial airborne particles are some of the sources that cause soiling. The dust particles vary from one location to another in terms of particle size, color, and chemical composition. The thickness and properties of the soil layer determine the optical path of light through the soil/glass interface. Soil accumulation on the glass surface is also influenced by environmental factors such as dew, wind speed and rainfall. Studies have shown that soil deposition is closely related to tilt angle and exposure period before a rain event. The first part of this thesis analyzes the reduction in irradiance transmitted to a solar cell through the air/soil/glass interface in comparison to a clean cell (air/glass interface). A time series representation is used to compare seasonal soiling loss factors for two consecutive years (2014-2016). The effects of tilt angle and rain events on these losses are extensively analyzed. Since soiling is a significant field issue, there is a growing need to address the problem, and several companies have come up with solutions such as anti-soiling coatings and automated cleaning systems. To test and validate the effectiveness of these anti-soiling technologies, various research institutes around the world are working on the design and development of artificial indoor soiling chambers to replicate the natural process in the field. 
The second part of this thesis work deals with the design and development of an indoor artificial soiling chamber that replicates natural soil deposition process in the field.

Date Created
  • 2017

The theory of narrative conflict

Description

Speculation regarding interstate conflict is of great concern to many, if not all, people. As such, forecasting interstate conflict has been of interest to experts, scholars, government officials, and concerned citizens. Presently, there are two approaches to the problem of conflict forecasting, with divergent results. The first tends to use a bird’s eye view with big data to forecast actions while missing the intimate details of the groups it is studying. The other opts for more grounded details of cultural meaning and interpretation, yet struggles in the realm of practical application for forecasting. While outlining issues with both approaches, an important question surfaced: are actions causing interpretations and/or are the interpretations driving actions? In response, the Theory of Narrative Conflict (TNC) is proposed to begin answering these questions. To properly address the complexity of forecasting and of culture, TNC draws from a number of different sources, including narrative theory, systems theory, nationalism, and the expression of these in strategic communication.

As a case study, this dissertation examines positions of both the U.S. and China in the South and East China Seas over five years. Methodologically, this dissertation demonstrates the benefit of content analysis to identify local narratives and both stabilizing and destabilizing events contained in thousands of news articles over a five-year period. Additionally, the use of time series and a Markov analysis both demonstrate usefulness in forecasting. Theoretically, TNC displays the usefulness of narrative theory to forecast both actions driven by narrative and common interpretations after events.
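As a rough illustration of how a Markov analysis of coded events might work, the sketch below estimates first-order transition probabilities from a sequence of event labels. The two-state coding ("S" for stabilizing, "D" for destabilizing), the sample sequence, and the function name are hypothetical; the dissertation's actual coding scheme and model order are not described in this abstract.

```python
from collections import Counter, defaultdict


def transition_probabilities(states):
    """Estimate first-order Markov transition probabilities from a label sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return {
        current: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for current, row in counts.items()
    }


# Hypothetical coded events in chronological order.
events = ["S", "S", "D", "S", "D", "D"]
probabilities = transition_probabilities(events)

# Estimated chance that a destabilizing event follows a stabilizing one.
print(probabilities["S"]["D"])
```

Given the latest observed state, the corresponding row of probabilities serves as a one-step forecast.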

Practically, this dissertation demonstrates that current efforts in the U.S. and China have not resulted in an increased understanding of the other country. Neither media giant demonstrates the capacity to be critical of their own national identity and preferred interpretation of world affairs. In short, the battle for the hearts and minds of foreign persons should be challenged.

Date Created
  • 2017

Defects and statistical degradation analysis of photovoltaic power plants

Description

As photovoltaic (PV) power plants age in the field, the PV modules degrade and develop visible and invisible defects. A defect and statistical degradation rate analysis of PV power plants is presented in a two-part thesis. The first part of the thesis deals with the defect analysis and the second part deals with the statistical degradation rate analysis. In the first part, a detailed analysis of the performance or financial risk related to each defect found in multiple PV power plants across various climatic regions of the USA is presented by assigning a risk priority number (RPN). The RPN for all the defects in each PV plant is determined based on two databases: a degradation rate database and a defect rate database. In this analysis it is determined that the RPN for each plant is dictated by the technology type (crystalline silicon or thin-film), climate and age. PV modules aged between 3 and 19 years in four different climates (hot-dry, hot-humid, cold-dry and temperate) are investigated in this study.

In the second part, a statistical degradation analysis is performed to determine whether the degradation rates are linear in crystalline silicon power plants exposed to a hot-dry climate. This linearity analysis is performed using data obtained through two methods: the current-voltage method and the metered kWh method. For the current-voltage method, the annual power degradation data of hundreds of individual modules in six crystalline silicon power plants of different ages are used. For the metered kWh method, a residual plot analysis using Winters’ statistical method is performed for two crystalline silicon plants of different ages. The metered kWh data typically consist of signal and noise components. Smoothers remove the noise component from the data by taking the average of the current and the previous observations. Once this is done, a residual plot analysis of the error component is performed to determine whether the noise was successfully separated from the data, by showing that the noise is random.
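The smoothing and residual check described above can be sketched as follows. This is a plain trailing moving average, chosen as the simplest smoother that averages the current and previous observations; it is not Winters' method itself, and the function names, window length, and energy values are illustrative assumptions.

```python
def trailing_mean(series, window):
    """Smooth by averaging each observation with up to window - 1 previous ones."""
    smoothed = []
    for i in range(len(series)):
        start = max(0, i - window + 1)
        chunk = series[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def residuals(series, smoothed):
    """Error component left after the smoothed signal is removed."""
    return [y - s for y, s in zip(series, smoothed)]


def lag1_autocorrelation(values):
    """Rough randomness check: values near zero suggest uncorrelated residuals."""
    mean = sum(values) / len(values)
    denominator = sum((v - mean) ** 2 for v in values)
    if denominator == 0:
        return 0.0
    numerator = sum(
        (values[i] - mean) * (values[i - 1] - mean) for i in range(1, len(values))
    )
    return numerator / denominator


# Synthetic (hypothetical) metered energy readings in kWh.
metered_kwh = [102.0, 98.0, 101.0, 97.0, 99.0, 95.0, 98.0, 94.0]
signal = trailing_mean(metered_kwh, window=2)
noise = residuals(metered_kwh, signal)
print(lag1_autocorrelation(noise))
```

A residual plot serves the same purpose visually: residuals that show no trend or pattern indicate that the smoother has separated signal from noise.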

Date Created
  • 2016

Reconstructing and controlling nonlinear complex systems

Description

The power of science lies in its ability to infer and predict the existence of objects from which no direct information can be obtained experimentally or observationally. A well-known example is to ascertain the existence of black holes of various masses in different parts of the universe from indirect evidence, such as X-ray emissions. In the field of complex networks, the problem of detecting hidden nodes can be stated as follows. Consider a network whose topology is completely unknown but whose nodes consist of two types: one accessible and another inaccessible from the outside world. The accessible nodes can be observed or monitored, and it is assumed that time series are available from each node in this group. The inaccessible nodes are shielded from the outside and are essentially "hidden." The question is, based solely on the available time series from the accessible nodes, can the existence and locations of the hidden nodes be inferred? A completely data-driven, compressive-sensing based method is developed to address this issue by utilizing complex weighted networks of nonlinear oscillators, evolutionary game and geospatial networks.

Both microbes and multicellular organisms actively regulate their cell fate determination to cope with changing environments or to ensure proper development. Here, synthetic biology approaches are used to engineer bistable gene networks to demonstrate that stochastic and permanent cell fate determination can be achieved through initializing gene regulatory networks (GRNs) at the boundary between dynamic attractors. This is experimentally realized by linking a synthetic GRN to a natural output of galactose metabolism regulation in yeast. Combining mathematical modeling and flow cytometry, the engineered systems are shown to be bistable, and inherent gene expression stochasticity is shown not to induce spontaneous state transitioning at steady state. By interfacing rationally designed synthetic GRNs with background gene regulation mechanisms, this work investigates intricate properties of networks that illuminate possible regulatory mechanisms for cell differentiation and development that can be initiated from points of instability.

Date Created
  • 2015

Machine Learning Models for High-dimensional Biomedical Data

Description

Recent technological advances enable the collection of various complex, heterogeneous and high-dimensional data in biomedical domains. The increasing availability of high-dimensional biomedical data creates a need for new machine learning models for effective data analysis and knowledge discovery. This dissertation introduces several unsupervised and supervised methods to help understand the data, discover patterns and improve decision making. All the proposed methods can generalize to other industrial fields.

The first topic of this dissertation focuses on the data clustering. Data clustering is often the first step for analyzing a dataset without the label information. Clustering high-dimensional data with mixed categorical and numeric attributes remains a challenging, yet important task. A clustering algorithm based on tree ensembles, CRAFTER, is proposed to tackle this task in a scalable manner.

The second part of this dissertation aims to develop data representation methods for genome sequencing data, a special type of high-dimensional data in the biomedical domain. The proposed data representation method, Bag-of-Segments, can summarize the key characteristics of the genome sequence into a small number of features with good interpretability.

The third part of this dissertation introduces an end-to-end deep neural network model, GCRNN, for time series classification with emphasis on both the accuracy and the interpretation. GCRNN contains a convolutional network component to extract high-level features, and a recurrent network component to enhance the modeling of the temporal characteristics. A feed-forward fully connected network with the sparse group lasso regularization is used to generate the final classification and provide good interpretability.

The last topic centers around the dimensionality reduction methods for time series data. A good dimensionality reduction method is important for the storage, decision making and pattern visualization for time series data. The CRNN autoencoder is proposed to not only achieve low reconstruction error, but also generate discriminative features. A variational version of this autoencoder has great potential for applications such as anomaly detection and process control.

Date Created
  • 2018