Batch mode active learning for multimedia pattern recognition

Description

The rapid advance of technology and the widespread availability of modern equipment have resulted in the generation of enormous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real world problems using computational learning frameworks. However, while gathering a large amount of data is cheap and easy, annotating it with class labels is expensive in terms of time, labor and human expertise. This has paved the way for research in the field of active learning. Such algorithms automatically select the salient and exemplar instances from large quantities of unlabeled data and are effective in reducing the human labeling effort needed to induce classification models. To utilize the possible presence of multiple labeling agents, there have been attempts towards a batch mode form of active learning, where a batch of data instances is selected simultaneously for manual annotation. This dissertation is aimed at the development of novel batch mode active learning algorithms to reduce manual effort in training classification models in real world multimedia pattern recognition applications.
Four major contributions are proposed in this work: $(i)$ a framework for dynamic batch mode active learning, where the batch size and the specific data instances to be queried are selected adaptively through a single formulation, based on the complexity of the data stream in question, $(ii)$ a batch mode active learning strategy for fuzzy label classification problems, where there is an inherent imprecision and vagueness in the class label definitions, $(iii)$ batch mode active learning algorithms based on convex relaxations of an NP-hard integer quadratic programming (IQP) problem, with guaranteed bounds on the solution quality and $(iv)$ an active matrix completion algorithm and its application to solve several variants of the active learning problem (transductive active learning, multi-label active learning, active feature acquisition and active learning for regression). These contributions are validated on the face recognition and facial expression recognition problems (which are commonly encountered in real world applications like robotics, security and assistive technology for the blind and the visually impaired) and also on collaborative filtering applications like movie recommendation.

Contributors: Chakraborty, Shayok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Balasubramanian, Vineeth N. (Committee member) / Li, Baoxin (Committee member) / Mittelmann, Hans (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created: 2013

Simultaneous variable and feature group selection in heterogeneous learning: optimization and applications

Description

Advances in data collection technologies have made it cost-effective to obtain heterogeneous data from multiple data sources. Very often, the data are of very high dimension and feature selection is preferred in order to reduce noise, save computational cost and learn interpretable models. Due to the multi-modal nature of heterogeneous data, it is interesting to design efficient machine learning models that are capable of performing variable selection and feature group (data source) selection simultaneously (a.k.a. bi-level selection). In this thesis, I carry out research along this direction with a particular focus on designing efficient optimization algorithms. I start with a unified bi-level learning model that contains several existing feature selection models as special cases. The proposed model is then extended to handle block-wise missing data, one of the major challenges in the diagnosis of Alzheimer's Disease (AD). Moreover, I propose a novel interpretable sparse group feature selection model that greatly facilitates parameter tuning and model selection. Last but not least, I show that by solving the sparse group hard thresholding problem directly, the sparse group feature selection model can be further improved in terms of both algorithmic complexity and efficiency. Promising results are demonstrated in an extensive evaluation on multiple real-world data sets.

Contributors: Xiang, Shuo (Author) / Ye, Jieping (Thesis advisor) / Mittelmann, Hans D (Committee member) / Davulcu, Hasan (Committee member) / He, Jingrui (Committee member) / Arizona State University (Publisher)

Created: 2014

Optimization Models for Iraq’s Water Allocation System

Description

In the recent past, Iraq was considered relatively rich in water resources compared to its surroundings. Currently, the magnitude of water resource shortages in Iraq represents an important factor in the stability of the country and in protecting sustained economic development. Practical, applicable, and sustainable river basin management for the Tigris and Euphrates Rivers in Iraq is essential. Applicable water resources allocation scenarios are important to minimize potential future water crises in connection with water quality and quantity. Allocating the available fresh water resources, together with reclaimed water, to different users in a sustainable manner is an urgent necessity for maintaining good water quantity and quality.

In this dissertation, predictive water allocation optimization models were developed which can be used to easily identify good alternatives for water management that can then be discussed, debated, adjusted, and simulated in greater detail. This study provides guidance for decision makers in Iraq for potential future conditions, where water supplies are reduced, and demonstrates how it is feasible to adopt an efficient water allocation strategy with flexibility in providing equitable water resource allocation considering alternative resources. Using reclaimed water will help in reducing the potential negative environmental impacts of treated and/or partially treated wastewater discharges while increasing the potential uses of reclaimed water for agriculture and other applications. Using reclaimed water for irrigation is a logical and efficient way to benefit farmers' livelihoods and the environment while supporting a diversity of crops, especially since most of Iraq’s wastewater treatment plants, whether built or under construction, are located in or adjacent to agricultural lands. Adopting an optimization modelling approach can assist decision makers, helping to ensure their decisions benefit the economy by incorporating global experience in controlling water allocations in Iraq, especially considering diminished water supplies.

Contributors: Ahmed, Ahmed Abdulrazzaq (Author) / Mays, Larry W. (Thesis advisor) / Fox, Peter (Thesis advisor) / Mascaro, Giuseppe (Committee member) / Muenich, Rebecca (Committee member) / Arizona State University (Publisher)

Created: 2019

Optimization model for the design of bioretention basins with dry wells

Description

Bioretention basins are a common stormwater best management practice (BMP) used to mitigate the hydrologic consequences of urbanization. Dry wells, also known as vadose-zone wells, have been used extensively in bioretention basins in Maricopa County, Arizona to decrease total drain time and recharge groundwater. A mixed integer nonlinear programming (MINLP) model has been developed for the minimum cost design of bioretention basins with dry wells.

The model developed simultaneously determines the peak stormwater inflow from watershed parameters and optimizes the size of the basin and the number and depth of dry wells based on infiltration, evapotranspiration (ET), and dry well characteristics and cost inputs. The modified rational method is used for the design storm hydrograph, and the Green-Ampt method is used for infiltration. ET rates are calculated using the Penman Monteith method or the Hargreaves-Samani method. The dry well flow rate is determined using an equation developed for reverse auger-hole flow.

The first phase of development of the model is to expand a nonlinear programming (NLP) model for the optimal design of infiltration basins for use with bioretention basins. Next, a single dry well is added to the NLP bioretention basin optimization model. Finally, the number of dry wells in the basin is modeled as an integer variable, creating an MINLP problem. The NLP models and MINLP model are solved using the General Algebraic Modeling System (GAMS). Two example applications demonstrate the efficiency and practicality of the model.
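As a rough illustration of the Green-Ampt infiltration relation named in the abstract above, the following sketch solves the standard implicit equation F = K t + psi * dtheta * ln(1 + F / (psi * dtheta)) by fixed-point iteration. It is not the dissertation's GAMS formulation; the function names and the soil parameter values are assumptions chosen only for demonstration.

```python
import math


def green_ampt_cumulative(t_hr, K, psi, dtheta, tol=1e-10, max_iter=200):
    """Cumulative infiltration F(t) under continuous ponding.

    K      : saturated hydraulic conductivity (length / hour)
    psi    : wetting-front suction head (length)
    dtheta : change in volumetric moisture content (dimensionless)
    Returns F in the same length unit as psi.
    """
    if t_hr <= 0:
        return 0.0
    s = psi * dtheta
    F = K * t_hr  # starting guess: infiltration at the saturated rate
    for _ in range(max_iter):
        # Fixed-point update of the implicit Green-Ampt equation
        F_new = K * t_hr + s * math.log(1.0 + F / s)
        if abs(F_new - F) < tol:
            return F_new
        F = F_new
    return F


def green_ampt_rate(F, K, psi, dtheta):
    """Infiltration rate f = K * (1 + psi * dtheta / F) for F > 0."""
    return K * (1.0 + psi * dtheta / F)


# Illustrative (assumed) parameters, in cm and hours
K, psi, dtheta = 1.09, 11.01, 0.3
F1 = green_ampt_cumulative(1.0, K, psi, dtheta)
f1 = green_ampt_rate(F1, K, psi, dtheta)
```

In an optimization setting, relations of this kind would typically appear as constraints linking basin geometry to drain time; the iteration here is only a convenient way to evaluate the relation for fixed inputs.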

Contributors: Lacy, Mason (Author) / Mays, Larry W. (Thesis advisor) / Fox, Peter (Committee member) / Wang, Zhihua (Committee member) / Arizona State University (Publisher)

Created: 2016

Structured sparse methods for imaging genetics

Description

Imaging genetics is an emerging and promising technique that investigates how genetic variations affect brain development, structure, and function. By exploiting disorder-related neuroimaging phenotypes, this class of studies provides a novel direction to reveal and understand the complex genetic mechanisms. Oftentimes, imaging genetics studies are challenging due to the relatively small number of subjects but extremely high dimensionality of both imaging data and genomic data. In this dissertation, I carry out research on imaging genetics with a particular focus on two tasks: building predictive models between neuroimaging data and genomic data, and identifying disorder-related genetic risk factors through image-based biomarkers. To this end, I consider a suite of structured sparse methods for imaging genetics that can produce interpretable models and are robust to overfitting. With carefully designed sparsity-inducing regularizers, different biological priors are incorporated into learning models. More specifically, in the Allen brain image-gene expression study, I adopt an advanced sparse coding approach for image feature extraction and employ a multi-task learning approach for multi-class annotation. Moreover, I propose a label structure-based two-stage learning framework, which utilizes the hierarchical structure among labels, for multi-label annotation. In the Alzheimer's Disease Neuroimaging Initiative (ADNI) imaging genetics study, I employ Lasso together with EDPP (enhanced dual polytope projections) screening rules to quickly identify Alzheimer's disease risk SNPs. I also adopt the tree-structured group Lasso with MLFre (multi-layer feature reduction) screening rules to incorporate linkage disequilibrium information into modeling. Moreover, I propose a novel absolute fused Lasso model for ADNI imaging genetics. This method utilizes SNP spatial structure and is robust to the choice of reference alleles of genotype coding. In addition, I propose a two-level structured sparse model that incorporates gene-level networks through a graph penalty into SNP-level model construction. Lastly, I explore a convolutional neural network approach for accurately predicting Alzheimer's disease related imaging phenotypes. Experimental results on real-world imaging genetics applications demonstrate the efficiency and effectiveness of the proposed structured sparse methods.
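To make the plain Lasso mentioned in the abstract above concrete, here is a minimal sketch of coordinate descent for the objective (1/2)||y - Xw||^2 + lam * ||w||_1, built on the soft-thresholding operator. It deliberately omits the EDPP and MLFre screening rules and the structured penalties the dissertation describes; the function names and the toy data are assumptions for illustration only.

```python
def soft_threshold(z, lam):
    """Proximal operator of lam * |.|: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/2)||y - Xw||^2 + lam * ||w||_1.

    X is a list of rows (n x p), y a list of n targets.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes feature j
            rho = 0.0
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            w[j] = soft_threshold(rho, lam) / col_sq[j]
    return w


# Toy design with orthonormal columns: the solution is soft_threshold(X^T y, lam)
X_toy = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
y_toy = [3.0, 0.5, 0.0]
w_toy = lasso_cd(X_toy, y_toy, lam=1.0)
```

On the toy data the second coefficient is driven exactly to zero, which is the sparsity property that makes Lasso-type models attractive when the number of SNPs far exceeds the number of subjects.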

Contributors: Yang, Tao (Author) / Ye, Jieping (Thesis advisor) / Xue, Guoliang (Thesis advisor) / He, Jingrui (Committee member) / Li, Baoxin (Committee member) / Li, Jing (Committee member) / Arizona State University (Publisher)

Created: 2017

Optimization model for design of vegetative filter strips for stormwater management and sediment control

Description

Vegetative filter strips (VFS) are an effective method of stormwater management, particularly for large urban parking lots. An optimization model for the design of vegetative filter strips that minimizes the amount of land required for stormwater management using the VFS is developed in this study. The resulting optimization model is based upon the kinematic wave equation for overland sheet flow along with equations defining the cumulative infiltration and infiltration rate.

In addition to the stormwater management function, vegetative filter strips (VFS) are effective mechanisms for control of sediment flow and soil erosion from agricultural and urban lands. Erosion is a major problem worldwide in areas subjected to high runoff or steep slopes. To achieve economy in the design of grass filter strips as a mechanism for sediment control and stormwater management, an optimization model is required that minimizes the land requirements for the VFS. The optimization model presented in this study is a non-linear programming model built on a system of equations that combines the equations defining the sheet flow on the paved and grassed areas with the equations defining the sediment transport over the vegetative filter strip. The model has been applied in a sensitivity analysis of parameters such as soil type and rainfall characteristics, performed to validate the model.

Contributors: Khatavkar, Puneet N (Author) / Mays, Larry W. (Thesis advisor) / Fox, Peter (Committee member) / Wang, Zhihua (Committee member) / Mascaro, Giuseppe (Committee member) / Arizona State University (Publisher)

Created: 2015

Optimization/simulation model for determining real-time optimal operation of river-reservoirs systems during flooding conditions

Description

A model is presented for real-time, river-reservoir operation systems. It epitomizes forward-thinking and efficient approaches to reservoir operations during flooding events. The optimization/simulation includes five major components. The components are a mix of hydrologic and hydraulic modeling, short-term rainfall forecasting, and optimization and reservoir operation models. The optimization/simulation model is designed for ultimate accessibility and efficiency. The optimization model uses the meta-heuristic approach, which has the capability to simultaneously search for multiple optimal solutions. The dynamics of the river are simulated by applying an unsteady flow-routing method. The rainfall-runoff simulation uses the National Weather Service NexRad gridded rainfall data, since it provides critical information regarding real storm events. The short-term rainfall-forecasting model utilizes a stochastic method. The reservoir-operation is simulated by a mass-balance approach. The optimization/simulation model offers more possible optimal solutions by using the Genetic Algorithm approach as opposed to traditional gradient methods that can only compute one optimal solution at a time. The optimization/simulation was developed for the 2010 flood event that occurred in the Cumberland River basin in Nashville, Tennessee. It revealed that the reservoir upstream of Nashville was more contained and that an optimal gate release schedule could have significantly decreased the floodwater levels in downtown Nashville. The model is for demonstrative purposes only but is perfectly suitable for real-world application.
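The mass-balance approach to reservoir simulation mentioned in the abstract above can be sketched in a few lines: storage at the next step equals current storage plus inflow minus release, with any volume above capacity counted as spill. This is a generic illustration, not the dissertation's model; the function name, the rule that releases are limited to available water, and the numbers are assumptions.

```python
def simulate_storage(s0, inflows, releases, dt, capacity):
    """Step a single reservoir forward by mass balance.

    s0        : initial storage (volume)
    inflows   : inflow rate per step (volume / time)
    releases  : requested gate release rate per step (volume / time)
    dt        : step length (time)
    capacity  : maximum storage (volume); excess is spilled
    Returns (storage trajectory including s0, spill volume per step).
    """
    storage, spills = [s0], []
    s = s0
    for q_in, q_out in zip(inflows, releases):
        available = s + q_in * dt
        out = min(q_out * dt, available)  # cannot release more than is stored
        s = available - out
        spill = max(0.0, s - capacity)    # uncontrolled overflow above capacity
        s -= spill
        storage.append(s)
        spills.append(spill)
    return storage, spills


# Illustrative (assumed) numbers in arbitrary consistent units
storage, spills = simulate_storage(
    s0=100.0, inflows=[50.0, 10.0], releases=[20.0, 20.0], dt=1.0, capacity=120.0
)
```

An optimizer such as a genetic algorithm would treat the release schedule as the decision variables and call a simulation of this kind, coupled with flow routing, to score each candidate schedule.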

Contributors: Che, Daniel C (Author) / Mays, Larry W. (Thesis advisor) / Fox, Peter (Committee member) / Wang, Zhihua (Committee member) / Lansey, Kevin (Committee member) / Wahlin, Brian (Committee member) / Arizona State University (Publisher)

Created: 2015

Optimization for resource-constrained wireless networks

Description

Nowadays, wireless communications and networks are widely used in our daily lives. One of the most important topics in networking research is using optimization tools to improve the utilization of network resources. In this dissertation, we concentrate on optimization for resource-constrained wireless networks, and study two fundamental resource-allocation problems: 1) distributed routing optimization and 2) anypath routing optimization. The study on the distributed routing optimization problem is composed of two main thrusts, targeted at understanding distributed routing and resource optimization for multihop wireless networks. The first thrust is dedicated to understanding the impact of full-duplex transmission on wireless network resource optimization. We propose two provably good distributed algorithms to optimize the resources in a full-duplex wireless network. We prove their optimality and also provide network status analysis using dual space information. The second thrust is dedicated to understanding the influence of network entity load constraints on network resource allocation and routing computation. We propose a provably good distributed algorithm to allocate wireless resources. In addition, we propose a new subgradient optimization framework, which can provide fine-grained convergence, optimality, and dual space information at each iteration. This framework can provide a useful theoretical foundation for many networking optimization problems. The study on the anypath routing optimization problem is also composed of two main thrusts. The first thrust is dedicated to understanding the computational complexity of multi-constrained anypath routing and designing approximate solutions. We prove that this problem is NP-hard when the number of constraints is larger than one. We present two polynomial time K-approximation algorithms. One is a centralized algorithm while the other is a distributed algorithm. For the second thrust, we study directional anypath routing and present a cross-layer design of MAC and routing. For the MAC layer, we present a directional anycast MAC. For the routing layer, we propose two polynomial time routing algorithms to compute directional anypaths based on two antenna models, and prove their optimality based on the packet delivery ratio metric.

Contributors: Fang, Xi (Author) / Xue, Guoliang (Thesis advisor) / Yau, Sik-Sang (Committee member) / Ye, Jieping (Committee member) / Zhang, Junshan (Committee member) / Arizona State University (Publisher)

Created: 2013