Understanding legacy workflows through runtime trace analysis

Description

When scientific software is written to specify processes, it takes the form of a workflow, and is often written in an ad-hoc manner in a dynamic programming language. There is a proliferation of legacy workflows implemented by non-expert programmers due to the accessibility of dynamic languages. Unfortunately, ad-hoc workflows lack a structured description as provided by specialized management systems, making ad-hoc workflow maintenance and reuse difficult, and motivating the need for analysis methods. The analysis of ad-hoc workflows using compiler techniques does not address dynamic languages: a program has so few constraints that its behavior cannot be predicted. In contrast, workflow provenance tracking has had success using run-time techniques to record data. The aim of this work is to develop a new analysis method for extracting workflow structure at run-time, thus avoiding issues with dynamics.

The method captures the dataflow of an ad-hoc workflow through its execution and abstracts it with a process for simplifying repetition. An instrumentation system first processes the workflow to produce an instrumented version, capable of logging events, which is then executed on an input to produce a trace. The trace undergoes dataflow construction to produce a provenance graph. The dataflow is examined for equivalent regions, which are collected into a single unit. The workflow is thus characterized in terms of its treatment of an input. Unlike other methods, a run-time approach characterizes the workflow's actual behavior, including elements which static analysis cannot predict (for example, code dynamically evaluated based on input parameters). This also enables the characterization of dataflow through external tools.
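The pipeline described above (instrument, execute to get a trace, build a dataflow graph, collapse repetition) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the names `Value`, `source`, `traced`, `build_graph`, `collapse_repeats`, and the toy `workflow` are all hypothetical, instrumentation is approximated with a decorator, and "equivalent regions" are reduced to consecutive runs of the same step name.

```python
import itertools
from dataclasses import dataclass
from typing import Any

_ids = itertools.count()  # unique ids for every value seen at run time
TRACE = []                # logged events: (step name, input ids, output id)


@dataclass(frozen=True)
class Value:
    """A payload tagged with a unique id so dataflow can be followed."""
    payload: Any
    vid: int


def source(payload):
    """Wrap a raw workflow input as a tracked value."""
    return Value(payload, next(_ids))


def traced(func):
    """Hypothetical instrumentation: log each call as a trace event."""
    def wrapper(*args):
        out = Value(func(*(a.payload for a in args)), next(_ids))
        TRACE.append((func.__name__, tuple(a.vid for a in args), out.vid))
        return out
    return wrapper


def build_graph(trace):
    """Turn trace events into edges (input id, step name, output id)."""
    return [(i, step, out) for step, ins, out in trace for i in ins]


def collapse_repeats(trace):
    """Simplified stand-in for merging equivalent regions:
    collapse consecutive runs of the same step into (name, count)."""
    summary = []
    for name, _, _ in trace:
        if summary and summary[-1][0] == name:
            summary[-1] = (name, summary[-1][1] + 1)
        else:
            summary.append((name, 1))
    return summary


@traced
def clean(text):
    return text.strip()


@traced
def merge(left, right):
    return left + "," + right


def workflow(raw_items):
    """Toy ad-hoc workflow: clean every item, then fold them together."""
    cleaned = [clean(source(item)) for item in raw_items]
    result = cleaned[0]
    for item in cleaned[1:]:
        result = merge(result, item)
    return result
```

Calling `workflow([" a", "b ", " c "])` and then `collapse_repeats(TRACE)` summarizes the run as three `clean` steps followed by two `merge` steps, while `build_graph(TRACE)` gives the individual dataflow edges. Because the events come from an actual execution, the summary reflects what the workflow did on that input rather than what its source code might do.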

The contributions of this work are: a run-time method for recording a provenance graph from an ad-hoc Python workflow, and a method to analyze the structure of a workflow from provenance. Methods are implemented in Python and are demonstrated on real-world Python workflows. These contributions enable users to derive graph structure from workflows. Empowered by a graphical view, users can better understand a legacy workflow. This makes the wealth of legacy ad-hoc workflows accessible, enabling workflow reuse instead of investing time and resources into creating a new workflow.

Contributors: Acuña, Ruben (Author) / Bazzi, Rida (Thesis advisor) / Lacroix, Zoé (Thesis advisor) / Candan, Kasim (Committee member) / Arizona State University (Publisher)

Created: 2015

Analysis of heat dissipation in AlGaN/GaN HEMT with GaN micropits at GaN-SiC interface

Description

Gallium Nitride (GaN) based microelectronics technology is a fast growing and most exciting semiconductor technology in the fields of high power and high frequency electronics. Excellent electrical properties of GaN, such as high carrier concentration and high carrier mobility, make GaN-based high electron mobility transistors (HEMTs) a preferred choice for RF applications. However, a very high temperature in the active region of the GaN HEMT leads to a significant degradation of device performance by affecting carrier mobility and concentration. Thus, effective thermal management in GaN HEMTs is key for this technology to reach its full potential.

In this thesis, an electro-thermal model of an AlGaN/GaN HEMT on a SiC substrate is simulated using Silvaco (Atlas) TCAD tools. Output characteristics, current density and heat flow at the GaN-SiC interface are key areas of analysis in this work. The electrical characteristics show a sharp drop in drain current at higher drain voltages. The temperature profile across the device is observed. At the GaN-SiC interface, there is a sharp drop in temperature, indicating a thermal resistance at this interface. Adding to the existing heat in the device, this heat difference is reflected back into the device, further increasing the temperature in the active region. Structural changes such as GaN micropits were introduced at the GaN-SiC interface along the length of the device to make the heat flow smooth rather than discontinuous. Various combinations of micropit dimensions were tried to reduce the temperature and enhance device performance. These GaN micropits were effective: they reduced heat in the active region by spreading it out to the sides of the device rather than concentrating it right below the hot spot, and they allowed a smooth flow of heat at the GaN-SiC interface. There was an increased peak current density in the active region of the device, contributing to improved electrical characteristics. Finally, the importance of thermal management in these high temperature devices is discussed, along with future prospects and a conclusion of this thesis.

Contributors: Suri, Suraj (Author) / Zhao, Yuji (Thesis advisor) / Vasileska, Dragika (Committee member) / Yu, Hongbin (Committee member) / Arizona State University (Publisher)

Created: 2016

Heavy Metal: Mercury-Use Contamination and Sound in Artisanal and Small-Scale Gold Mining in Antioquia, Colombia

Description

Millions of people around the world daily engage in artisanal and small-scale gold mining (ASGM), a vital part of total global gold production. For Colombia, this mining accounts for most of the precious metal’s output. It has also made Colombia, per capita, the worst mercury-polluted country in the world. Though cleaner, safer, and more effective methods exist, miners still opt for mercury-use. Any success with interventions in technology, capacitation, or policy has been limited. This dissertation attends to mercury-use in ASGM in Antioquia, Colombia, via two gaps: a descriptive one (i.e., a failure to pay attention to, and to describe, actual practices in ASGM); and a theoretical one (i.e., explanations as to why some decisions, including but not limited to policy, succeed or fail). In addition to an ecology of practices, embodiment, and situated knowledges, phenomenological interviews with stakeholders illuminate critical lived experience, as well as whether or how it is possible to reduce mercury-use and contamination. Furthermore, a novel application of speculative sound supplements this work. Finally, key findings complement existing scholarship. The presence of gold drives mining, but an increase in mining comes at a cost. Miners know mercury is hazardous, but mining legally, or formally, has proven too onerous. So, mercury-use persists: it is profitable, and the effects on human health can seem delayed. The state is pivotal to change in mercury-use, but its approach has been punitive. Change will invariably require greater attention to the lived experiences of miners.

Contributors: Pimentel, Matthew (Author) / Fonow, Mary Margaret (Thesis advisor) / Parmentier, Mary Jane (Thesis advisor) / Coleman, Grisha (Committee member) / Arizona State University (Publisher)

Created: 2021

Discovering Partial-Value Associations and Applications

Description

Existing machine learning and data mining techniques have difficulty handling three characteristics of real-world data sets together in a computationally efficient way: (1) different data types with both categorical data and numeric data, (2) different variable relations in different value ranges of variables, and (3) unknown variable dependency. This dissertation developed a Partial-Value Association Discovery (PVAD) algorithm to overcome these drawbacks of existing techniques. It also enables the discovery of partial-value and full-value variable associations showing both effects of individual variables and interactive effects of multiple variables. The algorithm is compared with association rule mining and decision trees for validation purposes. The results show that the PVAD algorithm can overcome the shortcomings of existing methods.

The second part of this dissertation focuses on knee point detection on noisy data. This extended research topic arose during the investigation into categorization for numeric data, which corresponds to Step 1 of the PVAD algorithm. A new mathematical definition of knee point on discrete data is introduced. Due to the unavailability of ground truth data or benchmark data sets, functions used to generate synthetic data are carefully selected and defined. These functions are subsequently employed to create the data sets for this experiment. These synthetic data sets are useful for systematically evaluating and comparing the performance of existing methods. Additionally, a deep-learning model is devised for this problem. Experiments show that the proposed model surpasses existing methods in all synthetic data sets, regardless of whether the samples have single or multiple knee points.

The third part presents the application results of the PVAD algorithm to real-world data sets in various domains. These include energy consumption data of an Arizona State University (ASU) building, computer network data, and ASU engineering freshmen retention data. The PVAD algorithm is utilized to create an associative network for energy consumption modeling, analyze univariate and multivariate measures of network flow variables, and identify common and uncommon characteristics related to engineering student retention after their first year at the university. The findings indicate that the PVAD algorithm offers the advantage and capability to uncover variable relationships.
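For readers unfamiliar with knee point detection, the idea can be illustrated with a common baseline heuristic. The sketch below is not the dissertation's definition or its deep-learning model, neither of which is given in this abstract. It assumes a single knee and picks the point farthest from the straight line joining the first and last points after scaling both axes to [0, 1]. The function name `knee_index` is hypothetical.

```python
def knee_index(xs, ys):
    """Return the index of the point farthest from the first-to-last chord.

    A generic baseline heuristic (an assumption for illustration only):
    both axes are rescaled to [0, 1] so the result does not depend on units.
    """
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three (x, y) points")

    def rescale(values):
        lo, hi = min(values), max(values)
        span = hi - lo
        if span == 0:
            raise ValueError("values must not all be equal")
        return [(v - lo) / span for v in values]

    nx, ny = rescale(xs), rescale(ys)
    x0, y0 = nx[0], ny[0]
    dx, dy = nx[-1] - x0, ny[-1] - y0
    chord_length = (dx * dx + dy * dy) ** 0.5
    if chord_length == 0:
        raise ValueError("first and last points coincide")

    best_index, best_distance = 0, -1.0
    for i, (x, y) in enumerate(zip(nx, ny)):
        # Perpendicular distance from (x, y) to the chord.
        distance = abs(dy * (x - x0) - dx * (y - y0)) / chord_length
        if distance > best_distance:
            best_index, best_distance = i, distance
    return best_index
```

On a saturating curve such as `xs = list(range(11))` with `ys = [0, 4, 7, 8.5, 9, 9.3, 9.5, 9.7, 9.8, 9.9, 10]`, this heuristic selects index 3, where the curve starts to flatten. It is sensitive to noise and cannot report multiple knees, which are the cases the abstract says the proposed model handles.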

Contributors: Fok, Ting Yan (Author) / Ye, Nong (Thesis advisor) / Iquebal, Ashif (Committee member) / Ju, Feng (Committee member) / Collofello, James (Committee member) / Arizona State University (Publisher)

Created: 2023
