Matching Items (1,636)
Description

Image Understanding is a long-established discipline in computer vision that encompasses a body of advanced image processing techniques used to locate (“where”) and to characterize and recognize (“what”) objects, regions, and their attributes in an image. However, the notion of “understanding” (and the goal of artificially intelligent machines) goes beyond factual recall of the recognized components and includes reasoning and thinking beyond what can be seen (or perceived). Understanding is often evaluated by asking questions of increasing difficulty. Thus, the expected functionalities of an intelligent Image Understanding system can be expressed in terms of the functionalities required to answer questions about an image. Answering questions about images requires primarily three components: image understanding, question (natural language) understanding, and reasoning based on knowledge. Any question that asks beyond what can be directly seen requires modeling of commonsense (or background/ontological/factual) knowledge and reasoning.

Knowledge and reasoning have seen scarce use in image understanding applications. In this thesis, we demonstrate the utility of incorporating background knowledge and using explicit reasoning in image understanding applications. We first present a comprehensive survey of previous work that utilized background knowledge and reasoning in understanding images; this survey outlines the limited use of commonsense knowledge in high-level applications. We then present a set of vision- and reasoning-based methods for several applications and show that these approaches benefit, in terms of accuracy and interpretability, from the explicit use of knowledge and reasoning. We propose novel knowledge representations of images, knowledge acquisition methods, and a new implementation of an efficient probabilistic logical reasoning engine that can utilize publicly available commonsense knowledge to solve applications such as visual question answering and image puzzles. Additionally, we identify the need for new datasets that explicitly require external commonsense knowledge to solve. We propose the new task of Image Riddles, which requires a combination of vision and reasoning based on ontological knowledge, and we collect a sufficiently large dataset to serve as an ideal testbed for vision-and-reasoning research. Lastly, we propose end-to-end deep architectures that combine vision, knowledge, and reasoning modules and achieve large performance boosts over state-of-the-art methods.
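As a rough, hypothetical illustration of the kind of vision-plus-knowledge pipeline described above (not the thesis's actual engine), the sketch below ranks candidate answers to an image riddle by combining detector confidences with commonsense relatedness scores; all detections, relatedness values, and names are made-up placeholders.

```python
# Hypothetical sketch: rank candidate answers to an image riddle by combining
# per-image detector confidences with commonsense relatedness scores.
# The detections and relatedness values are made-up placeholders standing in
# for a vision module and a commonsense knowledge source.

detections = [                       # one dict of {concept: confidence} per image
    {"stripes": 0.9, "horse": 0.6},
    {"crosswalk": 0.8, "road": 0.7},
]

relatedness = {                      # (candidate answer, detected concept) -> score
    ("zebra", "stripes"): 0.9, ("zebra", "horse"): 0.8,
    ("zebra", "crosswalk"): 0.7, ("zebra", "road"): 0.3,
    ("tiger", "stripes"): 0.8, ("tiger", "horse"): 0.1,
    ("tiger", "crosswalk"): 0.0, ("tiger", "road"): 0.1,
}

def score(candidate, images):
    """Average over images of the best confidence-weighted relatedness."""
    per_image = [
        max(conf * relatedness.get((candidate, concept), 0.0)
            for concept, conf in dets.items())
        for dets in images
    ]
    return sum(per_image) / len(per_image)

candidates = ["zebra", "tiger"]
ranked = sorted(candidates, key=lambda c: score(c, detections), reverse=True)
print(ranked)   # ['zebra', 'tiger'] for these toy numbers
```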
Contributors: Aditya, Somak (Author) / Baral, Chitta (Thesis advisor) / Yang, Yezhou (Thesis advisor) / Aloimonos, Yiannis (Committee member) / Lee, Joohyung (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created: 2018
Description

Modern machine learning systems leverage data and features from multiple modalities to gain more predictive power. In most scenarios, the modalities are vastly different and the acquired data are heterogeneous in nature. Consequently, building highly effective fusion algorithms is central to achieving improved model robustness and inference performance. This dissertation focuses on representation learning approaches as the fusion strategy. Specifically, the objective is to learn a shared latent representation that jointly exploits the structural information encoded in all modalities, such that a straightforward learning model can be adopted to obtain the prediction.

We first consider sensor fusion, a typical multimodal fusion problem critical to building a pervasive computing platform, and describe a systematic fusion technique that supports both multiple sensors and multiple descriptors for activity recognition. Multiple Kernel Learning (MKL) algorithms, which learn the optimal combination of kernels, have been successfully applied to numerous fusion problems in computer vision and related fields. Utilizing the MKL formulation, we next describe an auto-context algorithm for learning image context via fusion with low-level descriptors. Furthermore, a principled fusion algorithm that uses deep learning to optimize kernel machines is developed. By bridging deep architectures with kernel optimization, this approach leverages the benefits of both paradigms and is applied to a wide variety of fusion problems.
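As a simple illustration of the MKL formulation mentioned above (a sketch only, not the fusion algorithms developed in the dissertation), a convex combination of base kernels can be passed to a standard SVM as a precomputed kernel; the weights here are hand-set placeholders rather than learned.

```python
# Minimal sketch of kernel fusion: a fixed convex combination of base kernels
# fed to an SVM. Real MKL would learn the weights `beta`; here they are
# hand-set placeholders illustrating the formulation K = sum_m beta_m * K_m.

from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel, linear_kernel
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

def combined_kernel(A, B, beta=(0.5, 0.3, 0.2)):
    """Weighted sum of base kernels (two RBF bandwidths + linear)."""
    kernels = [
        rbf_kernel(A, B, gamma=0.01),
        rbf_kernel(A, B, gamma=0.1),
        linear_kernel(A, B),
    ]
    return sum(b * K for b, K in zip(beta, kernels))

clf = SVC(kernel="precomputed")
clf.fit(combined_kernel(X_train, X_train), y_train)
print("test accuracy:", clf.score(combined_kernel(X_test, X_train), y_test))
```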

In many real-world applications, the modalities exhibit highly specific data structures, such as time sequences and graphs, and consequently special designs of the learning architecture are needed. To improve temporal modeling for multivariate sequences, two architectures centered around attention models are developed: a novel clinical time series analysis model for several critical problems in healthcare, and another model coupled with a triplet ranking loss as a metric learning framework to better solve speaker diarization. Compared to state-of-the-art recurrent networks, these attention-based multivariate analysis tools achieve improved performance at a lower computational complexity. Finally, to perform community detection on multilayer graphs, a fusion algorithm is described that derives node embeddings from word embedding techniques and exploits the complementary relational information contained in each layer of the graph.
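As a bare-bones illustration of attention-based temporal modeling (not the dissertation's architectures), the sketch below pools a multivariate sequence with a softmax attention over its time steps; the scoring vector is a random placeholder for learned parameters.

```python
# Bare-bones attention pooling over a multivariate time series (T x D).
# A scoring vector assigns a weight to every time step; the sequence is then
# summarized as the weighted average of its steps. The parameters here are
# random placeholders standing in for learned weights.

import numpy as np

rng = np.random.default_rng(0)
T, D = 50, 8                      # time steps, variables per step
x = rng.normal(size=(T, D))       # one multivariate sequence
w = rng.normal(size=D)            # attention scoring vector (would be learned)

scores = x @ w                    # unnormalized relevance of each time step
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()              # softmax over time steps
summary = alpha @ x               # attention-weighted summary, shape (D,)

print(summary.shape, alpha.sum())  # (8,) 1.0
```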
Contributors: Song, Huan (Author) / Spanias, Andreas (Thesis advisor) / Thiagarajan, Jayaraman (Committee member) / Berisha, Visar (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Arizona State University (Publisher)
Created: 2018
Description

The goal of fact checking is to determine whether a given claim holds. A promising approach for this task is to exploit reference information in the form of knowledge graphs (KGs), a structured and formal representation of knowledge with semantic descriptions of entities and relations. KGs are successfully used in multiple applications, but the information stored in a KG is inevitably incomplete. To address this incompleteness problem, this thesis proposes a new method built on top of recent results in logical rule discovery over KGs, called RuDik, and a probabilistic extension of answer set programs called LPMLN.

This thesis presents the integration of RuDik, which discovers logical rules over a given KG, with LPMLN, which performs probabilistic inference to validate a fact. While rules automatically discovered over a KG are intended for human selection and revision, they can be turned into LPMLN programs with a minor modification. Leveraging the probabilistic inference in LPMLN, it is possible to (i) derive new information that is not explicitly stored in the KG, with an associated probability, and (ii) provide supporting facts and rules as interpretable explanations for such decisions.
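To give a flavor of weighted-rule inference of the kind LPMLN supports, the toy sketch below enumerates possible worlds and weighs them by the soft rules they satisfy. It follows Markov-Logic-style semantics and ignores LPMLN's stable-model subtleties, and the facts, rule, and weight are invented for illustration; it is not the LPMLN solver used in the thesis.

```python
# Toy illustration of weighted-rule inference for claim validation.
# Possible worlds are enumerated and weighted by the soft rules they satisfy,
# in the spirit of LPMLN/Markov Logic. This is NOT the LPMLN system itself
# (stable-model semantics is ignored), and all facts and weights are made up.

from itertools import product
from math import exp

facts = {("bornIn", "alice", "paris"), ("cityOf", "paris", "france")}
query = ("nationality", "alice", "france")      # the claim to validate
unknown_atoms = [query]                         # atoms not fixed by the KG

# Soft rule (weight 2.0): bornIn(x, c) & cityOf(c, y) -> nationality(x, y)
def rule_satisfied(world):
    if ("bornIn", "alice", "paris") in world and \
       ("cityOf", "paris", "france") in world:
        return ("nationality", "alice", "france") in world
    return True                                  # body false => rule holds

def world_weight(world):
    return exp(2.0) if rule_satisfied(world) else exp(0.0)

total, with_query = 0.0, 0.0
for bits in product([False, True], repeat=len(unknown_atoms)):
    world = set(facts) | {a for a, b in zip(unknown_atoms, bits) if b}
    w = world_weight(world)
    total += w
    if query in world:
        with_query += w

print("P(claim) ~", with_query / total)   # exp(2)/(exp(2)+1) ~ 0.88
```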

This thesis also presents experiments and results showing that this approach can label claims with high precision. The evaluation of the system further sheds light on the roles played by the quality of the given rules and the quality of the KG.
Contributors: Pradhan, Anish (Author) / Lee, Joohyung (Thesis advisor) / Baral, Chitta (Committee member) / Papotti, Paolo (Committee member) / Arizona State University (Publisher)
Created: 2018
Description

Researchers and practitioners have widely studied road network traffic data in areas such as urban planning, traffic prediction, and spatial-temporal databases. For instance, researchers use such data to evaluate the impact of road network changes. Unfortunately, collecting large-scale, high-quality urban traffic data requires tremendous effort, because participating vehicles must install Global Positioning System (GPS) receivers and administrators must continuously monitor these devices. Several urban traffic simulators attempt to generate such data with different features, but they suffer from two critical issues: (1) Scalability: most offer only a single-machine solution, which is not adequate for producing large-scale data, and the simulators that can generate traffic in parallel do not balance the load well among the machines in a cluster. (2) Granularity: many simulators do not consider microscopic traffic behavior, including traffic lights, lane changing, and car following. This paper proposes GeoSparkSim, a scalable traffic simulator that extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation. The proposed system seamlessly integrates with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze, and visualize large-scale urban traffic data. To implement microscopic traffic models, GeoSparkSim employs a simulation-aware vehicle partitioning method that partitions vehicles among different machines such that each machine has a balanced workload. The experimental analysis shows that GeoSparkSim can simulate the movements of 200 thousand cars over an extensive road network (250 thousand road junctions and 300 thousand road segments).
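The simulation-aware vehicle partitioning idea can be illustrated with a simple sketch (not GeoSparkSim's actual code): bucket vehicles into spatial grid cells, then greedily assign cells to workers so that per-machine vehicle counts stay balanced.

```python
# Illustrative sketch of workload-balanced spatial partitioning (not the
# actual GeoSparkSim implementation): vehicles are bucketed into grid cells,
# and cells are greedily assigned to the currently least-loaded worker.

import heapq
import random
from collections import Counter

random.seed(0)
vehicles = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(10_000)]

def grid_cell(x, y, cell_size=1.0):
    return (int(x // cell_size), int(y // cell_size))

# Count vehicles per grid cell.
cell_counts = Counter(grid_cell(x, y) for x, y in vehicles)

def assign_cells(cell_counts, n_workers=4):
    """Greedy longest-processing-time assignment of cells to workers."""
    heap = [(0, w) for w in range(n_workers)]      # (load, worker id)
    heapq.heapify(heap)
    assignment = {}
    for cell, count in sorted(cell_counts.items(), key=lambda kv: -kv[1]):
        load, worker = heapq.heappop(heap)
        assignment[cell] = worker
        heapq.heappush(heap, (load + count, worker))
    return assignment, sorted(load for load, _ in heap)

assignment, loads = assign_cells(cell_counts)
print("vehicles per worker:", loads)   # roughly equal loads
```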
Contributors: Fu, Zishan (Author) / Sarwat, Mohamed (Thesis advisor) / Pedrielli, Giulia (Committee member) / Sefair, Jorge (Committee member) / Arizona State University (Publisher)
Created: 2019
Description

Lidar has demonstrated its utility in meteorological studies, wind resource assessment, and wind farm control. More recently, lidar has gained widespread attention for autonomous vehicles.

The first part of the dissertation begins with an application of a coherent Doppler lidar to wind gust characterization for wind farm control, focusing on wind gusts at scales from 100 m to 1000 m. A detection and tracking algorithm is proposed to extract gusts from a wind field and track their movement. The algorithm was applied to a three-hour, two-dimensional wind field retrieved from the measurements of a coherent Doppler lidar. The spanwise deviation of gusts from the streamline was shown to follow a Gaussian distribution, and the dependence of these deviations on gust size is discussed. A prediction model is introduced that estimates the impact of gusts in terms of arrival time and the probability of arrival locations. The prediction model was applied to a virtual wind turbine array, and estimates are given of which wind turbines would be impacted.
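A minimal version of the detection step (an illustrative sketch, not the algorithm developed in the dissertation) is to threshold the retrieved wind-speed field and label connected regions as gust candidates; the synthetic field and thresholds below are placeholders.

```python
# Minimal sketch of gust detection on a 2-D wind-speed field: threshold the
# field relative to its mean and label connected regions as gust candidates.
# The synthetic field below stands in for a lidar-retrieved wind field, and
# the threshold/size parameters are arbitrary illustrative choices.

import numpy as np
from scipy import ndimage

rng = np.random.default_rng(1)
wind_speed = 8.0 + rng.normal(0.0, 0.5, size=(200, 200))
wind_speed[60:80, 100:130] += 3.0          # implant a synthetic gust patch

threshold = wind_speed.mean() + 2.0 * wind_speed.std()
mask = wind_speed > threshold

labels, n_regions = ndimage.label(mask)                     # connected components
sizes = ndimage.sum(mask, labels, range(1, n_regions + 1))  # pixels per region
gust_ids = [i + 1 for i, s in enumerate(sizes) if s >= 50]  # drop speckle
centroids = ndimage.center_of_mass(wind_speed, labels, gust_ids)
print(len(gust_ids), "gust candidate(s); centroids:", centroids)
```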

The second part of this dissertation describes a Time-of-Flight lidar simulation. The simulation includes a laser source module, a propagation module, a receiver module, and a timing module. A two-dimensional pulse model is introduced in the laser source module, and the sampling rate for the pulse model is explored. The propagation module accounts for beam divergence, target characteristics, the atmosphere, and optics. The receiver module contains models of the noise and analog filters in a lidar receiver; the effect of the analog filters on signal behavior was investigated. The timing module includes a Time-to-Digital Converter (TDC) module and an Analog-to-Digital Converter (ADC) module. In the TDC module, several walk-error compensation methods for leading-edge detection and multiple timing algorithms were modeled and tested on simulated signals. In the ADC module, a benchmark (BM) timing algorithm is proposed, and a Neyman-Pearson (NP) detector was implemented in both the time domain and the frequency domain (a fast Fourier transform (FFT) approach). The FFT approach with frequency-domain zero-padding improves the timing resolution. The BM algorithm was tested on simulated signals, and the NP detector was evaluated on both simulated signals and measurements from a prototype lidar (Bhaskaran, 2018).
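The frequency-domain zero-padding step can be sketched as follows (an illustrative example, not the simulation code): zero-padding the spectrum of a sampled return pulse interpolates it in time, so the peak location, and hence the timing estimate, lands on a finer grid than the raw sample spacing. The pulse parameters are arbitrary.

```python
# Sketch of frequency-domain zero-padding for finer timing resolution:
# padding the spectrum of a sampled pulse interpolates the waveform in time,
# so its peak (a stand-in for the time-of-flight estimate) falls on a finer
# grid than the raw sample spacing. Pulse parameters are arbitrary.

import numpy as np

fs = 1e9                                   # 1 GSa/s sampling rate
n = 256
t = np.arange(n) / fs
true_delay = 100.3e-9                      # true arrival time (between samples)
pulse = np.exp(-0.5 * ((t - true_delay) / 5e-9) ** 2)   # Gaussian return pulse

coarse = t[np.argmax(pulse)]               # peak on the raw 1 ns grid

# Interpolate by a factor of 8: insert zeros between the positive- and
# negative-frequency halves of the spectrum, then inverse-transform.
factor = 8
X = np.fft.fft(pulse)
Xp = np.concatenate([X[: n // 2],
                     np.zeros((factor - 1) * n, dtype=complex),
                     X[n // 2:]])
fine_pulse = np.real(np.fft.ifft(Xp)) * factor
fine_t = np.arange(factor * n) / (factor * fs)
fine = fine_t[np.argmax(fine_pulse)]

print(f"true {true_delay*1e9:.2f} ns, coarse {coarse*1e9:.2f} ns, "
      f"interpolated {fine*1e9:.2f} ns")
```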
Contributors: Zhou, Kai (Author) / Calhoun, Ronald (Thesis advisor) / Chen, Kangping (Committee member) / Tang, Wenbo (Committee member) / Peet, Yulia (Committee member) / Krishnamurthy, Raghavendra (Committee member) / Arizona State University (Publisher)
Created: 2019
Description

Hydrogel polymers have been the subject of many studies due to their fascinating ability to alternate between being hydrophilic and hydrophobic upon the application of appropriate stimuli. In particular, thermo-responsive hydrogels such as N-Isopropylacrylamide (NIPAM), which possesses a unique lower critical solution temperature (LCST) of 32°C, have been leveraged for membrane-based processes, for example as a draw agent for forward osmosis (FO) desalination. The low LCST of NIPAM ensures that fresh water can be recovered at a modest energy cost compared to other thermally based desalination processes, which require water recovery at higher temperatures. This work experimentally studies key process parameters involved in desalination by FO using NIPAM and a copolymer of NIPAM and Sodium Acrylate (NIPAM-SA). It encompasses synthesis of the hydrogels, development of experiments to effectively characterize the synthesized products, and measurement of FO performance for the individual hydrogels. FO performance was measured using single layers of NIPAM and NIPAM-SA, respectively, and the permeation flux values obtained were compared to relevant published literature and found to be within a reasonable range. Furthermore, a conceptual design for future large-scale implementation of this technology is proposed. It is suggested that more effort should perhaps focus on physical processes that can increase the low permeation flux of hydrogel-driven FO desalination systems, rather than on the development of novel classes of hydrogels.
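For reference, permeation flux in such FO experiments is typically reported as permeate volume per unit membrane area per unit time (LMH); the short sketch below uses made-up numbers only to show the arithmetic, not values measured in this work.

```python
# Simple flux calculation of the kind used to report FO performance:
# flux (L m^-2 h^-1, "LMH") = permeate volume / (membrane area * time).
# The numbers below are illustrative placeholders, not measured values.

permeate_mass_g = 3.6          # mass of water drawn through the membrane
water_density_g_per_ml = 1.0
membrane_area_m2 = 0.0012      # active membrane area
duration_h = 2.0

permeate_volume_l = permeate_mass_g / water_density_g_per_ml / 1000.0
flux_lmh = permeate_volume_l / (membrane_area_m2 * duration_h)
print(f"water flux: {flux_lmh:.2f} LMH")   # 1.50 LMH for these numbers
```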
Contributors: Abdullahi, Adnan (Author) / Phelan, Patrick (Thesis advisor) / Wang, Robert (Committee member) / Dai, Lenore (Committee member) / Arizona State University (Publisher)
Created: 2019
Description

Construction work is mundane and tiring for workers without the assistance of machines. This challenge has significantly changed the direction of the construction industry by motivating the development of robots that can replace human workers. This thesis presents a computed torque controller designed to produce movements of a small-scale, 5 degree-of-freedom (DOF) robotic arm that are useful for construction operations, specifically bricklaying. A software framework for the robotic arm, with motion and path planning features and different control capabilities, has also been developed using the Robot Operating System (ROS).

First, a literature review of the bricklaying construction activity and the performance of existing robots is presented. After an overview of the required robot structure, a mathematical model of the 5-DOF robotic arm is developed. A model-based computed torque controller is then designed for the nonlinear robotic arm, taking into consideration its dynamic and kinematic properties. For sustainable growth of this technology, so that it is affordable to the masses, it is important that the robot's energy consumption be optimized; in this thesis, the trajectory of the robotic arm is optimized using sequential quadratic programming, and the results of the energy optimization procedure are analyzed for different possible trajectories.
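The computed torque law referred to above has the standard form tau = M(q)(qdd_des + Kd*edot + Kp*e) + C(q, qd)*qd + g(q); the sketch below shows that structure with placeholder dynamics terms standing in for the 5-DOF model derived in the thesis.

```python
# Structure of a computed torque controller: feedback-linearize the arm
# dynamics M(q) qdd + C(q, qd) qd + g(q) = tau with a PD law on the tracking
# error. The dynamics functions below are placeholders for the 5-DOF model.

import numpy as np

N_JOINTS = 5
Kp = np.diag([100.0] * N_JOINTS)      # proportional gains (illustrative)
Kd = np.diag([20.0] * N_JOINTS)       # derivative gains (illustrative)

def mass_matrix(q):                   # placeholder M(q)
    return np.eye(N_JOINTS)

def coriolis_vector(q, qd):           # placeholder C(q, qd) @ qd
    return np.zeros(N_JOINTS)

def gravity(q):                       # placeholder g(q)
    return np.zeros(N_JOINTS)

def computed_torque(q, qd, q_des, qd_des, qdd_des):
    """tau = M(q)(qdd_des + Kd*edot + Kp*e) + C(q,qd)qd + g(q)."""
    e = q_des - q
    edot = qd_des - qd
    v = qdd_des + Kd @ edot + Kp @ e  # commanded joint acceleration
    return mass_matrix(q) @ v + coriolis_vector(q, qd) + gravity(q)

# Example call: zero state, small step command on the first joint.
tau = computed_torque(np.zeros(5), np.zeros(5),
                      np.array([0.1, 0, 0, 0, 0]), np.zeros(5), np.zeros(5))
print(tau)
```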

A construction testbed setup is simulated in the ROS platform to validate the designed controllers and the optimized robot trajectories in different experimental scenarios. A commercially available 5-DOF robotic arm is modeled in the ROS simulators Gazebo and RViz. Path and motion planning is performed using the MoveIt-ROS interface and also implemented on a physical small-scale robotic arm. A MATLAB-ROS framework for executing different controllers on the physical robot is described. Finally, the results of the controller simulations and experiments are discussed in detail.
Contributors: Gandhi, Sushrut (Author) / Berman, Spring (Thesis advisor) / Marvi, Hamidreza (Committee member) / Yong, Sze Zheng (Committee member) / Arizona State University (Publisher)
Created: 2019
Description

The advancements in additive manufacturing have made it possible to bring to life designs that would otherwise exist only on paper. An excellent example of such designs are Triply Periodic Minimal Surface (TPMS) structures such as the Schwarz D, the Schwarz P, and the Gyroid. These structures are self-sustaining, i.e., they require minimal supports or no supports at all when 3D printed. They also exist in stable form in nature; butterfly wings, for example, are built from gyroid structures. The automotive and aerospace industries have a growing demand for strong and light structures, a demand that TPMS models can help meet. This research investigates some of the properties of these TPMS structures and how they perform in comparison to conventional models. The work concentrates on the mechanical, thermal, and fluid flow properties of the Schwarz D, Gyroid, and Spherical Gyroid TPMS models in particular; other TPMS models were not considered. A detailed finite element analysis of the mechanical and thermal properties was performed using ANSYS 19.2, and the flow properties were analyzed using ANSYS Fluent under different conditions.
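For reference, the TPMS geometries named above are commonly approximated by implicit level-set equations (for example, the gyroid by sin(x)cos(y) + sin(y)cos(z) + sin(z)cos(x) = 0). The sketch below samples two such fields on a voxel grid to estimate the relative density of a thin "sheet" solid; the thickness threshold is an arbitrary illustrative choice, not a value used in this work.

```python
# Sample the standard level-set approximations of two TPMS geometries and
# estimate the volume fraction of a "sheet" solid |F| < t on a voxel grid.
# The sheet thickness parameter t is an arbitrary illustrative choice.

import numpy as np

def gyroid(x, y, z):
    return np.sin(x) * np.cos(y) + np.sin(y) * np.cos(z) + np.sin(z) * np.cos(x)

def schwarz_p(x, y, z):
    return np.cos(x) + np.cos(y) + np.cos(z)

# One unit cell (period 2*pi) sampled on a voxel grid.
n = 128
s = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
X, Y, Z = np.meshgrid(s, s, s, indexing="ij")

t = 0.3                                 # sheet half-thickness in field units
for name, f in [("gyroid", gyroid), ("Schwarz P", schwarz_p)]:
    solid = np.abs(f(X, Y, Z)) < t
    print(f"{name}: relative density ~ {solid.mean():.3f}")
```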
Contributors: Raja, Faisal (Author) / Phelan, Patrick (Thesis advisor) / Bhate, Dhruv (Committee member) / Rykaczewski, Konrad (Committee member) / Arizona State University (Publisher)
Created: 2019
Description

There are many applications where the truth is unknown and must be estimated from different sources. The values of different properties can be obtained from various sources, which leads to disagreement among them, so an important task is to obtain the truth from these sometimes contradictory sources. As an extension of computing the truth, the reliability of the sources also needs to be computed. Earlier models that compute such precision values (Banerjee et al. 2005; Dong and Naumann 2009; Kasneci et al. 2011; Li et al. 2012; Marian and Wu 2011; Zhao and Han 2012; Zhao et al. 2012) treat multiple properties individually. One existing work models heterogeneous properties jointly: the Conflict Resolution on Heterogeneous Data (CRH) framework, which is based on single-objective optimization. Because this is a non-convex, single-objective optimization problem, only one locally optimal solution is found, and that optimum depends on the initial point. In this thesis, the single-objective optimization problem is converted into a multi-objective optimization problem, for which the Pareto-optimal points are computed. As an extension, the single-objective optimization problem is also solved from numerous initial points. These two approaches are used to search for solutions better than the one obtained by CRH with the median as the initial point for continuous variables and majority voting as the initial point for categorical variables. In the experiments, the solution obtained by CRH lies among the Pareto-optimal points of the multi-objective optimization and is, in these experiments, the optimal solution.
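To make the setup concrete, the following sketch runs a simplified CRH-style alternating scheme on toy data: truths are estimated as a weighted mean (continuous) or weighted vote (categorical) given source weights, and the weights are then re-derived from each source's aggregate loss. It is an illustrative simplification of the framework discussed above, with made-up data, not the implementation evaluated in this thesis.

```python
# Simplified CRH-style truth discovery on toy data: alternate between
# (a) estimating truths given source weights and (b) re-weighting sources
# by their aggregate loss. Data and iteration count are illustrative.

import numpy as np

# Claims by 3 sources about 1 continuous property and 1 categorical property.
cont_claims = np.array([[20.0, 21.0, 35.0]])          # property x sources
cat_claims = [["NYC", "NYC", "LA"]]                    # property x sources
n_sources = 3
weights = np.ones(n_sources) / n_sources

for _ in range(10):
    # (a) Truth estimates given current weights.
    cont_truth = (cont_claims * weights).sum(axis=1) / weights.sum()
    cat_truth = [max(set(row),
                     key=lambda v: sum(w for val, w in zip(row, weights)
                                       if val == v))
                 for row in cat_claims]
    # (b) Per-source loss: normalized squared error + 0/1 disagreement.
    loss = np.zeros(n_sources)
    for p, truth in enumerate(cont_truth):
        err = (cont_claims[p] - truth) ** 2
        loss += err / (err.sum() + 1e-12)
    for p, truth in enumerate(cat_truth):
        loss += np.array([0.0 if v == truth else 1.0 for v in cat_claims[p]])
    weights = -np.log(loss / loss.sum() + 1e-12)       # low loss -> high weight
    weights /= weights.sum()

print("weights:", np.round(weights, 3))
print("continuous truth:", np.round(cont_truth, 2), "categorical truth:", cat_truth)
```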
Contributors: Jain, Karan (Author) / Xue, Guoliang (Thesis advisor) / Sen, Arunabha (Committee member) / Sarwat, Mohamed (Committee member) / Arizona State University (Publisher)
Created: 2019
Description

The residential building sector accounts for more than 26% of global energy consumption and 17% of global CO2 emissions. Due to the low cost of electricity in Kuwait and the growth of its population, Kuwaiti electricity consumption has tripled during the past 30 years and is expected to increase by a further 20% by 2027. In this dissertation, a framework is developed to assess energy-saving techniques and help policy-makers make educated decisions. The Kuwaiti residential energy outlook is studied by modeling baseline energy consumption and the diffusion of energy conservation measures (ECMs) to identify their impacts on household energy consumption and CO2 emissions.

The energy resources and power generation in Kuwait were studied. The characteristics of residential buildings, along with energy codes of practice, were investigated, and four building archetypes were developed. Moreover, a baseline of end-use electricity consumption and demand was developed and projected to 2040. It was found that by 2040 energy consumption would double, with most of the usage coming from air conditioning (AC), while lighting consumption increases only negligibly due to a projected shift toward more efficient lighting. Peak demand loads are expected to increase at an average growth rate of 2.9% per year. The diffusion of different ECMs in the residential sector was then modeled through four diffusion scenarios to estimate ECM adoption rates. The impact of ECMs on the CO2 emissions and energy consumption of residential buildings in Kuwait was evaluated, and the cost of conserved energy (CCE) and annual energy savings for each measure were calculated. AC ECMs exhibited the highest cumulative savings, whereas lighting ECMs showed an immediate energy impact. None of the ECMs in the study were cost effective at the current high subsidy rate (95%); therefore, the impact of ECMs at different subsidy and rebate rates was studied. At a 75% subsidized utility price and a 40% rebate on appliances only, most ECMs become cost effective with high energy savings. Moreover, by imposing charges of $35/ton of CO2, most ECMs become cost effective.
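For reference, the cost of conserved energy is conventionally computed as the annualized ECM investment divided by the annual energy savings and compared against the electricity price the household actually pays; the sketch below uses made-up numbers purely to illustrate how subsidies shift that comparison.

```python
# Cost of conserved energy (CCE): annualized ECM investment divided by the
# annual energy savings, then compared against the electricity price the
# household actually pays. All numbers below are illustrative placeholders.

def capital_recovery_factor(discount_rate, lifetime_years):
    d, n = discount_rate, lifetime_years
    return d * (1 + d) ** n / ((1 + d) ** n - 1)

def cce(investment_usd, annual_savings_kwh, discount_rate=0.05, lifetime_years=15):
    return (investment_usd * capital_recovery_factor(discount_rate, lifetime_years)
            / annual_savings_kwh)

ecm_cost = 1200.0            # e.g., a high-efficiency AC upgrade (placeholder)
annual_savings = 3000.0      # kWh saved per year (placeholder)
tariff_full = 0.10           # unsubsidized electricity price, $/kWh (placeholder)

for subsidy in (0.95, 0.75, 0.0):
    price_paid = tariff_full * (1 - subsidy)
    value = cce(ecm_cost, annual_savings)
    verdict = "cost effective" if value < price_paid else "not cost effective"
    print(f"subsidy {subsidy:.0%}: CCE ${value:.3f}/kWh "
          f"vs price ${price_paid:.3f}/kWh -> {verdict}")
```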
Contributors: Alajmi, Turki (Author) / Phelan, Patrick E (Thesis advisor) / Kaloush, Kamil (Committee member) / Huang, Huei-Ping (Committee member) / Wang, Liping (Committee member) / Hajiah, Ali (Committee member) / Arizona State University (Publisher)
Created: 2019