Exploring Deep Learning for Video Understanding

Description

Video analysis and understanding have attracted growing attention in recent years. The research community has devoted considerable effort to, and made progress in, many related visual tasks, such as video action/event recognition, thumbnail frame or video index retrieval, and zero-shot learning. Finding good representative features of videos is an important objective for these tasks.

Given the success of deep neural networks in recent vision tasks, it is natural to consider deep learning methods for extracting a better global representation of images and videos. In general, a Convolutional Neural Network (CNN) is used to obtain spatial information, and a Recurrent Neural Network (RNN) is used to capture temporal information.

This dissertation offers a perspective on the challenging problems posed by different kinds of videos, which may require different solutions. Accordingly, it outlines several novel deep learning-based approaches to obtaining representative features for different visual tasks, such as zero-shot learning, video retrieval, and video event recognition. To better understand and obtain the spatial and temporal information in videos, Convolutional Neural Networks and Recurrent Neural Networks are jointly used in most of the approaches. Experiments are conducted to demonstrate the importance and effectiveness of good representative features for better understanding video clips in computer vision. The dissertation concludes with a discussion of possible future work on obtaining better representative features for more challenging video clips.

Contributors: Li, Yikang (Author) / Li, Baoxin (Thesis advisor) / Karam, Lina (Committee member) / LiKamWa, Robert (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)

Created: 2020

Domain Concretization from Examples: Addressing Missing Domain Knowledge via Robust Planning

Description

Most planning agents assume complete knowledge of the domain, which may not be the case in scenarios where certain domain knowledge is missing. This problem could be due to design flaws or arise from domain ramifications or qualifications. In such cases, planning algorithms could produce highly undesirable behaviors. Planning with incomplete domain knowledge is more challenging than partial observability in the sense that the planning agent is unaware of the existence of such knowledge, in contrast to it being just unobservable or partially observable. That is the difference between known unknowns and unknown unknowns.

In this thesis, I introduce and formulate this as the problem of Domain Concretization, which is the inverse of domain abstraction, a problem studied extensively in prior work. Furthermore, I present a solution that starts from the incomplete domain model provided to the agent by the designer and uses teacher traces from human users to determine the candidate model set under a minimalistic model assumption. A robust plan is then generated to maximize the probability of success under the set of candidate models. In addition to a standard search formulation in the model space, I propose a sample-based search method, along with an online version of it, to improve search time. The solution has been evaluated on various International Planning Competition domains in which incompleteness was introduced by deleting certain predicates from the complete domain model. The solution is also tested in a robot simulation domain to illustrate its effectiveness in handling incomplete domain knowledge. The results show that the plan generated by the algorithm increases the plan success rate without substantially increasing action cost.
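
The idea of choosing a plan for maximum probability of success under a set of candidate models can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the thesis's algorithm: the names `success_probability` and `most_robust_plan` are hypothetical, each candidate model is reduced to a set of actions that are executable in it, and each model carries a hypothetical weight standing in for its likelihood.

```python
from typing import Callable, Hashable, Sequence

Plan = Sequence[Hashable]


def success_probability(
    plan: Plan,
    models: Sequence[object],
    weights: Sequence[float],
    succeeds: Callable[[Plan, object], bool],
) -> float:
    """Weighted fraction of candidate models under which the plan succeeds."""
    total = sum(weights)
    good = sum(w for m, w in zip(models, weights) if succeeds(plan, m))
    return good / total


def most_robust_plan(
    plans: Sequence[Plan],
    models: Sequence[object],
    weights: Sequence[float],
    succeeds: Callable[[Plan, object], bool],
) -> Plan:
    """Return the candidate plan with the highest success probability."""
    return max(
        plans,
        key=lambda p: success_probability(p, models, weights, succeeds),
    )


# Toy setup (hypothetical): a model is the set of actions executable in it,
# and a plan succeeds in a model if every one of its actions is executable.
models = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
weights = [0.5, 0.3, 0.2]
plans = [("a", "b"), ("a", "c"), ("b", "c")]


def executable(plan: Plan, model: object) -> bool:
    return all(step in model for step in plan)


best = most_robust_plan(plans, models, weights, executable)
best_probability = success_probability(best, models, weights, executable)
```

This sketch enumerates plans and models exhaustively. The sample-based and online search methods described in the abstract exist precisely because such enumeration does not scale, and they are not reproduced here.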

Contributors: Sharma, Akshay (Author) / Zhang, Yu (Thesis advisor) / Fainekos, Georgios (Committee member) / Srivastava, Siddharth (Committee member) / Arizona State University (Publisher)

Created: 2020

Understanding Solar Cell Contacts Through Simulations

Description

The maximum theoretical efficiency of a terrestrial non-concentrated silicon solar cell is 29.4%, as obtained from detailed balance analysis. Over 90% of the current silicon photovoltaics market is based on solar cells with diffused junctions (Al-BSF, PERC, PERL, etc.), which are limited in performance by increased non-radiative recombination in the doped regions. This limitation can be overcome through the use of passivating contacts, which prevent recombination at the absorber interfaces while providing the selectivity to efficiently separate the charge carriers generated in the absorber. This thesis aims to develop an understanding, through simulations, of how the material properties of the contact affect device performance.

The partial specific contact resistance framework developed by Onno et al. aims to link material behavior to device performance specifically at open circuit. In this thesis, the framework is expanded to other operating points of a device, leading to a model for calculating the partial contact resistances at any current flow. The error in calculating these resistances has a negligible effect on device performance: the resulting error in the fill factor calculated from the resistances is below 0.1% when the fill factor of the cell is above 70%, i.e., for cells with good passivation and selectivity.

Further, silicon heterojunction (SHJ) and tunnel-oxide based solar cells are simulated in the 1D finite-difference modeling package AFORS-HET. The effects of material property changes on device performance are investigated using novel contact materials such as Al0.8Ga0.2As (hole contact for SHJ) and ITO (electron contact for tunnel-oxide cells). While changing the bandgap and electron affinity of the contact affects the height of the Schottky barrier and hence the contact resistivity, increasing the doping of the contact increases its selectivity. In the case of ITO, the contact needs to have a work function below 4.2 eV to be electron selective, which suggests that other low work function TCOs (such as AZO) will be more applicable as alternative dopant-free electron contacts. The AFORS-HET model also shows that buried doped regions arising from boron diffusion in the absorber can damage passivation and decrease the open circuit voltage of the device.

Contributors: Dasgupta, Sagnik (Author) / Holman, Zachary (Thesis advisor) / Onno, Arthur (Committee member) / Wang, Qing Hua (Committee member) / Arizona State University (Publisher)

Created: 2020

Nurturing Open Design: Challenges and Opportunities for HCI to Support Crowd-driven Hardware Design

Description

Open Design is a crowd-driven global ecosystem which tries to challenge and alter contemporary modes of capitalistic hardware production. It strives to build on the collective skills, expertise and efforts of people regardless of their educational, social or political backgrounds to develop and disseminate physical products, machines and systems. In contrast to capitalistic hardware production, Open Design practitioners publicly share design files, blueprints and knowhow through various channels including internet platforms and in-person workshops. These designs are typically replicated, modified, improved and reshared by individuals and groups who are broadly referred to as ‘makers’.

This dissertation aims to expand the current scope of Open Design within human-computer interaction (HCI) research through a long-term exploration of Open Design’s socio-technical processes. I examine Open Design from three perspectives: the functional—materials, tools, and platforms that enable crowd-driven open hardware production, the critical—materially-oriented engagements within open design as a site for sociotechnical discourse, and the speculative—crowd-driven critical envisioning of future hardware.

More specifically, this dissertation first explores the growing global scene of Open Design through a long-term ethnographic study of the open science hardware (OScH) movement, a genre of Open Design. This long-term study of OScH provides a focal point for HCI to deeply understand Open Design's growing global landscape. Second, it examines the application of Critical Making within Open Design through an OScH workshop with designers, engineers, artists and makers from local communities. This work foregrounds the role of HCI researchers as facilitators of collaborative critical engagements within Open Design. Third, this dissertation introduces the concept of crowd-driven Design Fiction through the development of a publicly accessible online Design Fiction platform named Dream Drones. Through a six-month development effort and a study with drone-related practitioners, it offers several pragmatic insights into the challenges and opportunities for crowd-driven Design Fiction. Through these explorations, I highlight the broader implications and novel research pathways for HCI to shape and be shaped by the global Open Design movement.

Contributors: Fernando, Kattak Kuttige Rex Piyum (Author) / Kuznetsov, Anastasia (Thesis advisor) / Turaga, Pavan (Committee member) / Middel, Ariane (Committee member) / Takamura, John (Committee member) / Arizona State University (Publisher)

Created: 2020

Image Restoration for Non-Traditional Camera Systems

Description

Cameras have become commonplace, with wide-ranging applications in phone photography, computer vision, and medical imaging. With a growing need to reduce size and cost while maintaining image quality, the need to look past traditional camera designs is becoming more apparent. Several non-traditional cameras have been shown to be promising options for size-constrained applications, and while they may offer several advantages, they are usually limited by image quality degradation due to optical limitations or the need to reconstruct a captured image. In this thesis, we take a look at three of these non-traditional cameras: a pinhole camera, a diffusion-mask lensless camera, and an under-display camera (UDC).

For each of these cases, I present a feasible image restoration pipeline to correct for their particular limitations. For the pinhole camera, I present an early pipeline to allow for practical pinhole photography by reducing noise levels caused by low-light imaging, enhancing exposure levels, and sharpening the blur caused by the pinhole. For lensless cameras, we explore a neural network architecture that performs joint image reconstruction and point spread function (PSF) estimation to robustly recover images captured with multiple PSFs from different cameras. Using adversarial learning, this approach achieves improved reconstruction results that do not require explicit knowledge of the PSF at test time and shows an added improvement in the reconstruction model’s ability to generalize to variations in the camera’s PSF. This allows lensless cameras to be used in a wider range of applications that require multiple cameras without the need to explicitly train a separate model for each new camera. For UDCs, we use a multi-stage approach to correct for low light transmission, blur, and haze. This pipeline uses a PyNET deep neural network architecture to perform the majority of the restoration, together with a traditional optimization approach whose output is fused in a learned manner in the second stage to improve high-frequency features. I show results from this novel fusion approach that are on par with the state of the art.

Contributors: Rego, Joshua D (Author) / Jayasuriya, Suren (Thesis advisor) / Blain Christen, Jennifer (Thesis advisor) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created: 2020

Deep Learning Approaches for Inferring Collective Macrostates from Individual Observations in Natural and Artificial Multi-Agent Systems Under Realistic Constraints

Description

A complex social system, whether artificial or natural, can possess macroscopic properties as a collective, which may change in real time as a result of local behavioral interactions among the agents in it. If a reliable indicator is available to abstract the macrolevel states, decision makers could use it to take proactive action, whenever needed, so that the entire system avoids unacceptable states or converges to desired ones. In realistic scenarios, however, there can be many challenges in learning a model of dynamic global states from interactions of agents, such as 1) high complexity of the system itself, 2) absence of holistic perception, 3) variability of group size, 4) biased observations of the state space, and 5) identification of salient behavioral cues. In this dissertation, I introduce useful applications of macrostate estimation in complex multi-agent systems and explore effective deep learning frameworks to address these inherent challenges. First, Remote Teammate Localization (ReTLo) is developed for multi-robot teams, in which an individual robot can use its local interactions with a nearby robot as an information channel to estimate the holistic view of the group. Within this problem, I show that (a) a model learned on a modular team can generalize to all others to gain global awareness of teams of variable sizes, and (b) active interactions are necessary to diversify training data and speed up the overall learning process. The complexity of the next focal system escalates to a colony of over 50 individual ants undergoing an 18-day social stabilization following a chaotic event. I use this natural platform to demonstrate that, in contrast to (b), (c) monotonic samples only from “before chaos” can be sufficient to model the panicked society, and (d) the model can also be used to discover salient behaviors to precisely predict macrostates.

Contributors: Choi, Taeyeong (Author) / Pavlic, Theodore (Thesis advisor) / Richa, Andrea (Committee member) / Ben Amor, Heni (Committee member) / Yang, Yezhou (Committee member) / Liebig, Juergen (Committee member) / Arizona State University (Publisher)

Created: 2020

Lateral Programmable Metallization Cells: Materials, Devices and Mechanisms

Description

Lateral programmable metallization cells (PMCs) utilize the properties of electrodeposits grown over a solid electrolyte channel. Such devices have an active anode and an inert cathode separated by a long electrodeposit channel in a coplanar arrangement. The ability to transport large amounts of metallic mass across the channel makes these devices attractive for various More-Than-Moore applications. Existing literature lacks a comprehensive study of electrodeposit growth kinetics in lateral PMCs. Moreover, the morphology of electrodeposit growth in larger, planar devices is not understood. Despite the variety of applications, lateral PMCs have not been embraced by the semiconductor industry because of the incompatible materials and high operating voltages such devices require. In this work, a numerical model based on the basic processes in PMCs (cation drift and redox reactions) is proposed, and the effect of various materials parameters on the electrodeposit growth kinetics is reported. The morphology of the electrodeposit growth and the kinetics of the electrodeposition process are also studied in devices based on the Ag-Ge30Se70 materials system. It was observed that the electrodeposition process mainly consists of two regimes of growth: a cation drift limited regime and a mixed regime. The electrodeposition starts in the cation drift limited regime at low electric fields and transitions into the mixed regime as the field increases. The onset of the mixed regime can be controlled by the applied voltage, which also affects the morphology of electrodeposit growth. The numerical model was then used to successfully predict the device kinetics and the onset of the mixed regime.

The problem of materials incompatibility with semiconductor manufacturing was addressed by proposing a novel device structure. A bilayer structure using semiconductor foundry friendly materials was suggested as a candidate for the solid electrolyte. The bilayer structure consists of a low resistivity oxide shunt layer on top of a high resistivity ion carrying oxide layer. Devices using Cu2O as the low resistivity shunt on top of Cu doped WO3 oxide were fabricated. The bilayer devices provided orders of magnitude improvement in device performance in terms of operating voltage and switching time. Electrical and materials characterization revealed the structure of the bilayers and the mechanism of electrodeposition in these devices.

Contributors: Chamele, Ninad (Author) / Kozicki, Michael (Thesis advisor) / Barnaby, Hugh (Committee member) / Newman, Nathan (Committee member) / Gonzalez-Velo, Yago (Committee member) / Arizona State University (Publisher)

Created: 2020

Passivation and Dissolution of Alloys

Description

The passivity of metals is a phenomenon of vast importance, as it prevents many materials in important applications from rapid deterioration by corrosion. Alloying with a sufficient quantity of passivating elements (Cr, Al, Si), typically in the range of 10% to 20%, is commonly employed to improve the corrosion resistance of elemental metals. However, the compositional criteria for enhanced corrosion resistance have been a long-standing unanswered question in alloy design. With the emerging interest in multi-principal element alloy design, a percolation model is developed herein for the initial stage of passive film formation, termed primary passivation. The successful validation of the assumptions and predictions of the model in three corrosion-resistant binary alloys, Fe-Cr, Ni-Cr, and Cu-Rh, supports the use of the model as a quantitative strategy for designing corrosion-resistant alloys. To date, this is the only model that can provide such criteria for alloy design.

The model relates alloy passivation to site percolation of the passivating elements in the alloy matrix. In the initial passivation stage, Fe (Ni in Ni-Cr or Cu in Cu-Rh) is selectively dissolved, destroying the passive network built up by Cr (or Rh) oxides and undercutting isolated incipient Cr (Rh) oxide nuclei. The only way to prevent undercutting and form a stable protective passive film is for the concentration of Cr (Rh) to be high enough to realize site percolation within the thickness of the passive film or the dissolution depth. This 2D-3D percolation cross-over transition explains the composition-dependent passivation of these alloys. The theoretical description of the transition and its assumptions is examined via experiments and kinetic Monte Carlo simulations. The initial passivation scenario of selective dissolution is validated by inductively coupled plasma mass spectrometry (ICP-MS). The electronic effect not considered in the kinetic Monte Carlo simulations is addressed by density functional theory (DFT). Additionally, the impact of the atomic configuration parameter on alloy passivation is experimentally measured and agrees well with the model predictions developed using Monte Carlo renormalization group (MC-RNG) methods.
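
For readers unfamiliar with site percolation, the basic notion can be shown with a toy sketch. It is an illustration under simplifying assumptions, not the dissertation's model: it uses a plain 2D square lattice with nearest-neighbor connectivity rather than an alloy lattice of finite film thickness, and the function names `random_lattice` and `percolates` are hypothetical. Each site is occupied (standing in for a passivating atom) with probability `p`, and the lattice percolates if occupied sites form a connected path from the top row to the bottom row.

```python
import random
from collections import deque


def random_lattice(n, p, seed=0):
    """Build an n x n lattice; each site is occupied with probability p."""
    rng = random.Random(seed)
    return [[rng.random() < p for _ in range(n)] for _ in range(n)]


def percolates(grid):
    """Return True if occupied sites connect the top row to the bottom row.

    Connectivity is nearest-neighbor (up, down, left, right); diagonal
    contacts do not count.
    """
    rows, cols = len(grid), len(grid[0])
    seen = {(0, c) for c in range(cols) if grid[0][c]}
    queue = deque(seen)
    while queue:
        r, c = queue.popleft()
        if r == rows - 1:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False
```

Averaging `percolates(random_lattice(n, p, seed))` over many seeds for a range of `p` values gives an estimate of the spanning probability as a function of occupancy, which is the kind of threshold behavior the percolation picture relies on.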

Contributors: Xie, Yusi (Author) / Sieradzki, Karl (Thesis advisor) / Chan, Candace (Committee member) / Wang, Qing (Committee member) / Buttry, Daniel (Committee member) / Arizona State University (Publisher)

Created: 2020

On Feature Saliency and Deep Neural Networks

Description

Technological advances have allowed for the assimilation of a variety of data, driving a shift away from the use of simpler and constrained patterns toward more complex and diverse patterns in the retrieval and analysis of such data. This shift has overwhelmed conventional techniques and has underscored the need for intelligent mechanisms that can model the complex patterns in the data. Although deep neural networks, including so-called attention networks, have shown some success at capturing complex patterns, they have significant shortcomings in distinguishing what is important in the data from what is noise. This dissertation observes that traditional neural networks rely primarily on gradient-based learning to model deep feature maps while ignoring key insights in the data that can be leveraged as complementary information to help learn an accurate model. In particular, this dissertation shows that localized multi-scale features (captured implicitly or explicitly) can be leveraged to help improve model performance, as these features capture salient informative points in the data.

This dissertation focuses on “working with the data, not just on data”, i.e., leveraging feature saliency through pre-training, in-training, and post-training analysis of the data. In particular, non-neural localized multi-scale features, in images and time series, are relatively cheap to extract and can provide a rough overview of the patterns in the data. Furthermore, localized features coupled with deep features can help learn a high-performing network. A pre-training analysis of the sizes, complexities, and distribution of these localized features can help intelligently allocate a user-provided kernel budget in the network as a single-shot hyper-parameter search. Additionally, these localized features can be used as a secondary input modality to the network for cross-attention. Retraining pre-trained networks can be a costly process, yet a post-training analysis of model inferences can allow for learning the importance of individual network parameters to the model inferences, thus facilitating retraining-free network sparsification with minimal impact on model performance. Furthermore, effective in-training analysis of the intermediate features in the network helps learn the importance of individual intermediate features (neural attention), and this analysis can be achieved by simulating local-extrema detection or by learning features simultaneously and understanding their co-occurrences. In summary, this dissertation argues and establishes that, if appropriately leveraged, localized features and their feature saliency can help learn highly accurate yet cheaper networks.

Contributors: Garg, Yash (Author) / Candan, K. Selcuk (Thesis advisor) / Davulcu, Hasan (Committee member) / Li, Baoxin (Committee member) / Sapino, Maria Luisa (Committee member) / Arizona State University (Publisher)

Created: 2020

Exploring the Impact of Augmented Reality on Collaborative Decision-Making in Small Teams

Description

While significant qualitative, user study-focused research has been done on augmented reality, relatively few studies have been conducted on multiple, co-located synchronously collaborating users in augmented reality. Recognizing the need for more collaborative user studies in augmented reality and the value such studies present, a user study is conducted of collaborative decision-making in augmented reality to investigate the following research question: “Does presenting data visualizations in augmented reality influence the collaborative decision-making behaviors of a team?” This user study evaluates how viewing data visualizations with augmented reality headsets impacts collaboration in small teams compared to viewing together on a single 2D desktop monitor as a baseline. Teams of two participants performed closed and open-ended evaluation tasks to collaboratively analyze data visualized in both augmented reality and on a desktop monitor. Multiple means of collecting and analyzing data were employed to develop a well-rounded context for results and conclusions, including software logging of participant interactions, qualitative analysis of video recordings of participant sessions, and pre- and post-study participant questionnaires. The results indicate that augmented reality doesn’t significantly change the quantity of team member communication but does impact the means and strategies participants use to collaborate.

Contributors: Kintscher, Michael (Author) / Bryan, Chris (Thesis advisor) / Amresh, Ashish (Thesis advisor) / Hansford, Dianne (Committee member) / Johnson, Erik (Committee member) / Arizona State University (Publisher)

Created: 2020
