Efficient Java native interface for android based mobile devices

Description

Java is currently making its way into embedded systems and mobile devices such as Android devices. Programs written in Java are compiled into machine-independent binary class bytecode, which a Java Virtual Machine (JVM) executes. The Java platform additionally specifies the Java Native Interface (JNI). JNI allows Java code running within a JVM to interoperate with applications or libraries that are written in other languages and compiled to the host CPU ISA. JNI plays an important role in embedded systems because it provides a mechanism for interacting with platform-specific libraries. This thesis addresses the overhead that JNI incurs due to reflection and serialization when objects are accessed on Android-based mobile devices, and it provides techniques to reduce this overhead. It also provides an API that accesses an object through its reference by pinning the object's memory location. The Android emulator was used to evaluate the performance of these techniques, and we observed a 5-10% performance gain with the new Java Native Interface.

Contributors: Chandrian, Preetham (Author) / Lee, Yann-Hang (Thesis advisor) / Davulcu, Hasan (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created: 2011

Enhancing the usability of complex structured data by supporting keyword searches

Description

As pointed out in the keynote speech by H. V. Jagadish at SIGMOD'07, and as commonly agreed in the database community, the usability of structured data by casual users is as important as the functionality of data management systems. A major difficulty in using structured data is retrieving information from it easily, given a user's information needs. Learning and using a structured query language (e.g., SQL or XQuery) is overwhelmingly burdensome for most users: not only are these languages sophisticated, but users also need to know the data schema. Keyword search offers a convenient way to access structured data and can significantly enhance its usability. However, processing keyword search on structured data is challenging because of various types of ambiguity, such as structural ambiguity (keyword queries have no structure), keyword ambiguity (the keywords may not be accurate), and user preference ambiguity (the user may have implicit preferences that are not indicated in the query), as well as efficiency challenges due to the large search space. This dissertation performs an expansive study of keyword search processing techniques as a gateway for users to access structured data and retrieve desired information. The key issues addressed include: (1) Resolving structural ambiguities in keyword queries by generating meaningful query results, which involves identifying relevant keyword matches, identifying return information, and composing query results based on relevant matches and return information. (2) Resolving structural, keyword, and user preference ambiguities through result analysis, including snippet generation, result differentiation, result clustering, result summarization/query expansion, etc. (3) Resolving the efficiency challenge in processing keyword search on structured data by utilizing and efficiently maintaining materialized views. These works deliver significant technical contributions towards building a full-fledged search engine for structured data.
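The idea of composing a result from keyword matches that sit in different but connected records can be illustrated with a toy sketch. This is not the dissertation's algorithm: the tables, names, and the rule "a row plus the rows it references" are all hypothetical simplifications chosen only to show how an unstructured keyword query can be answered without the user knowing the schema.

```python
from collections import defaultdict

# Hypothetical toy database: each row has searchable text and a list of
# foreign-key references to other rows.
ROWS = {
    ("author", 1): {"text": "Ada Lovelace", "refs": []},
    ("author", 2): {"text": "Alan Turing", "refs": []},
    ("paper", 10): {"text": "keyword search on XML data",
                    "refs": [("author", 1), ("author", 2)]},
    ("paper", 11): {"text": "stream processing of events",
                    "refs": [("author", 1)]},
}

def build_index(rows):
    """Inverted index: lowercase token -> set of row keys containing it."""
    index = defaultdict(set)
    for key, row in rows.items():
        for token in row["text"].lower().split():
            index[token].add(key)
    return index

def keyword_search(rows, index, query):
    """Return groups (a row plus the rows it references) that together
    contain every keyword of the query."""
    keywords = query.lower().split()
    results = []
    for key, row in sorted(rows.items()):
        group = [key] + list(row["refs"])
        covered = all(
            any(member in index.get(kw, set()) for member in group)
            for kw in keywords
        )
        if covered:
            results.append(group)
    return results

INDEX = build_index(ROWS)
print(keyword_search(ROWS, INDEX, "turing xml"))
```

The query "turing xml" matches no single row, but the paper row together with its referenced author rows covers both keywords, which is the kind of structural ambiguity a keyword search engine has to resolve.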

Contributors: Liu, Ziyang (Author) / Chen, Yi (Thesis advisor) / Candan, Kasim S (Committee member) / Davulcu, Hasan (Committee member) / Jagadish, H V (Committee member) / Arizona State University (Publisher)

Created: 2011

Association based prioritization of genes

Description

Genes have widely different pertinences to the etiology and pathology of diseases. Thus, they can be ranked according to their disease significance on a genomic scale, which is the subject of gene prioritization. Given a set of genes known to be related to a disease, it is reasonable to use them as a basis to determine the significance of other candidate genes, which are then ranked by the association they exhibit with the known genes. Experimental and computational data of various kinds differ in their reliability and relevance to the disease under study. This work presents a gene prioritization method based on integrated biological networks that incorporates and models the various levels of relevance and reliability of diverse sources. The method is shown to achieve significantly higher performance than two well-known gene prioritization algorithms. Essentially no bias in performance was seen when it was applied to diseases of diverse etiology, e.g., monogenic, polygenic, and cancer. The method was highly stable and robust against significant levels of noise in the data. Biological networks are often sparse, which can impede the operation of association-based gene prioritization algorithms such as the one presented here from a computational perspective. As a potential approach to overcome this limitation, we explore the value that transcription factor binding sites can have in elucidating suitable targets. Transcription factors are needed for the expression of most genes, especially in higher organisms, and hence genes can be associated via their genetic regulatory properties. While each transcription factor recognizes specific DNA sequence patterns, such patterns are unknown for many transcription factors. Even those that are known are inconsistently reported in the literature, implying a potentially high level of inaccuracy. We developed computational methods for the prediction and improvement of transcription factor binding patterns. Tests performed on the improvement method using synthetic patterns under various conditions showed that the method is very robust and that the patterns produced invariably converge to nearly identical series of patterns. Preliminary tests were conducted to incorporate knowledge from transcription factor binding sites into our network-based model for prioritization, with encouraging results. To validate these approaches in a disease-specific context, we built a schizophrenia-specific network based on the inferred associations and performed a comprehensive prioritization of human genes with respect to the disease. These results are expected to be validated empirically, but computational validation using known targets is very positive.
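A minimal sketch of association-based ranking over several evidence networks follows. It is an illustration under stated assumptions, not the method of the thesis: the gene labels, edge confidences, per-source reliability weights, and the simple weighted-sum score are all hypothetical.

```python
def prioritize(networks, source_weights, seeds, candidates):
    """Rank candidates by their weighted association with the seed genes.

    networks: source name -> {frozenset({gene_a, gene_b}): edge confidence}
    source_weights: source name -> assumed reliability of that source
    """
    scores = {}
    for gene in candidates:
        total = 0.0
        for source, edges in networks.items():
            weight = source_weights[source]
            for seed in seeds:
                # A missing edge contributes nothing to the score.
                total += weight * edges.get(frozenset((gene, seed)), 0.0)
        scores[gene] = total
    # Highest score first; ties broken alphabetically for determinism.
    ranking = sorted(candidates, key=lambda g: (-scores[g], g))
    return ranking, scores

# Hypothetical data: two evidence sources of differing reliability.
NETWORKS = {
    "ppi": {frozenset(("A", "S1")): 0.9, frozenset(("B", "S1")): 0.4},
    "coexpr": {frozenset(("B", "S2")): 0.8, frozenset(("C", "S2")): 0.3},
}
WEIGHTS = {"ppi": 1.0, "coexpr": 0.5}

ranking, scores = prioritize(NETWORKS, WEIGHTS, ["S1", "S2"], ["A", "B", "C"])
print(ranking, scores)
```

Down-weighting the less reliable source changes the ranking: candidate B has more raw evidence than A, yet A ranks first because its single edge comes from the source assumed to be more trustworthy.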

Contributors: Lee, Jang (Author) / Gonzalez, Graciela (Thesis advisor) / Ye, Jieping (Committee member) / Davulcu, Hasan (Committee member) / Gallitano-Mendel, Amelia (Committee member) / Arizona State University (Publisher)

Created: 2011

A low-energy, low-cost field deployable sampler for microbial DNA profiling

Description

Filtration for microfluidic sample-collection devices is desirable for sample selection, concentration, preprocessing, and downstream manipulation, but microfabricating the required sub-micrometer filtration structure is an elaborate process. This thesis presents a simple method to fabricate polydimethylsiloxane (PDMS) devices with an integrated membrane filter that will sample, lyse, and extract the DNA from microorganisms in aqueous environments. An off-the-shelf membrane filter disc was embedded in a PDMS layer and sequentially bonded with other PDMS channel layers. No leakage was observed during filtration. The device was validated by concentrating a large amount of the cyanobacterium Synechocystis in simulated sample water, with consistent performance across devices. After sufficient biomass had accumulated on the filter, a sequential electrochemical lysing process was performed by applying 5 V DC across the filter. The device was further evaluated by delivering several samples with differing concentrations of Synechocystis and then quantifying the DNA using real-time PCR. Lastly, an environmental sample was run through the device and the amount of photosynthetic microorganisms present in the water was determined. The major breakthroughs in this design are low energy demand, cheap materials, simple design, straightforward fabrication, and robust performance, which together enable wide utility of similar chip-based devices for field-deployable operations in environmental micro-biotechnology.

Contributors: Lecluse, Aurelie (Author) / Meldrum, Deirdre (Thesis advisor) / Chao, Joseph (Thesis advisor) / Westerhoff, Paul (Committee member) / Arizona State University (Publisher)

Created: 2011

Arsenic Sorption by Iron Impregnated Biochar

Description

Much of Nepal lacks access to clean drinking water, and many water sources are contaminated with arsenic at concentrations above both World Health Organization and local Nepalese guidelines. While many water treatment technologies exist, it is necessary to identify those that are easily implementable in developing areas. One simple treatment that has gained popularity is biochar—a porous, carbon-based substance produced through pyrolysis of biomass in an oxygen-free environment. Arizona State University’s Engineering Projects in Community Service (EPICS) has partnered with communities in Nepal in an attempt to increase biochar production in the area, as it has several valuable applications including water treatment. Biochar’s arsenic adsorption capability will be investigated in this project with the goal of using the biochar that Nepalese communities produce to remove water contaminants. It has been found in scientific literature that biochar is effective in removing heavy metal contaminants from water with the addition of iron through surface activation. Thus, the specific goal of this research was to compare the arsenic adsorption disparity between raw biochar and iron-impregnated biochar. It was hypothesized that after numerous bed volumes pass through a water treatment column, iron from the source water will accumulate on the surface of raw biochar, mimicking the intentionally iron-impregnated biochar and further increasing contaminant uptake. It is thus an additional goal of this project to compare biochar loaded with iron through an iron-spiked water column and biochar impregnated with iron through surface oxidation. For this investigation, the biochar was crushed and sieved to a size between 90 and 100 micrometers. Two samples were prepared: raw biochar and oxidized biochar. The oxidized biochar was impregnated with iron through surface oxidation with potassium permanganate and iron loading. 
Then, X-ray fluorescence was used to compare the composition of the oxidized biochar with its raw counterpart, indicating approximately 0.5% iron in the raw and 1% iron in the oxidized biochar. The biochar samples were then added to batches of arsenic-spiked water at iron to arsenic concentration ratios of 20 mg/L:1 mg/L and 50 mg/L:1 mg/L to determine adsorption efficiency. Inductively coupled plasma mass spectrometry (ICP-MS) analysis indicated an 86% removal of arsenic using a 50:1 ratio of iron to arsenic (1.25 g biochar required in 40 mL solution), and 75% removal with a 20:1 ratio (0.5 g biochar required in 40 mL solution). Additional samples were then inserted into a column process apparatus for further adsorption analysis. Again, ICP-MS analysis was performed and the results showed that while both raw and treated biochars were capable of adsorbing arsenic, they were exhausted after less than 70 bed volumes (234 mL), with raw biochar lasting 60 bed volumes (201 mL) and oxidized about 70 bed volumes (234 mL). Further research should be conducted to investigate more affordable and less laboratory-intensive processes to prepare biochar for water treatment.
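The column figures reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the function and variable names are hypothetical, and the "extra capacity" figure is a derived illustration rather than a result reported by the author.

```python
def bed_volume_ml(total_volume_ml, bed_volumes):
    """Volume of one bed volume implied by a throughput figure."""
    return total_volume_ml / bed_volumes

# Throughput at exhaustion, as stated in the abstract.
raw_bv = bed_volume_ml(201, 60)        # raw biochar
oxidized_bv = bed_volume_ml(234, 70)   # oxidized (iron-impregnated) biochar

# Both figures should imply roughly the same column bed volume.
print(f"bed volume from raw run:      {raw_bv:.2f} mL")
print(f"bed volume from oxidized run: {oxidized_bv:.2f} mL")

# Derived: additional water treated by the oxidized biochar before exhaustion.
extra_fraction = (234 - 201) / 201
print(f"extra throughput: {extra_fraction:.1%}")

# The biochar doses scale with the target iron-to-arsenic ratios.
dose_ratio = 1.25 / 0.5      # g biochar at 50:1 versus 20:1
target_ratio = 50 / 20       # iron-to-arsenic ratios
print(dose_ratio == target_ratio)
```

Both runs imply a bed volume of about 3.3 to 3.4 mL, so the two throughput figures are mutually consistent, and the 2.5-fold difference in biochar dose matches the 2.5-fold difference in the iron-to-arsenic ratio.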

Contributors: Laird, Ashlyn (Author) / Schoepf, Jared (Thesis director) / Westerhoff, Paul (Committee member) / Chemical Engineering Program (Contributor) / School of International Letters and Cultures (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

Design of a Cable Driven Drone for Perching

Description

The majority of drones are extremely simple: their functions include flight and, sometimes, recording video and audio. While drone technology has continued to improve these functions, particularly flight, additional functions have not been added to mainstream drones. Although these basic functions serve as a good framework for drone designs, it is now time to extend beyond this framework. With this Honors Thesis project, we introduce a new function intended to eventually become common to drones. This feature is a grasping mechanism that is capable of perching on branches and carrying loads within the weight limit. The concept stems from the natural behavior of many kinds of insects. It paves the way for drones to further imitate the natural design of flying creatures. Additionally, it serves to advocate for dynamic, or morphing, drone frames to become more common practice in drone designs.

Contributors: Macias, Jose Carlos (Co-author) / Goldenberg, Edward Bradley (Co-author) / Downey, Matthew (Co-author) / Zhang, Wenlong (Thesis director) / Aukes, Daniel (Committee member) / Human Systems Engineering (Contributor) / Engineering Programs (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

Machine Learning of Real and Pseudo Physics: Modeling Dynamical Systems

Description

The research presented in this Honors Thesis develops machine learning models that predict future states of a system with unknown dynamics, based on observations of the system. Two case studies are presented: (1) a non-conservative pendulum and (2) a differential game dictating a two-car uncontrolled intersection scenario. In the paper we investigate how learning architectures can be adapted to problem-specific geometry. The results show that these problem-specific models are valuable for accurately learning and predicting the dynamics of physical systems. In order to properly model the physics of a real pendulum, modifications were made to a prior architecture that was sufficient for modeling an ideal pendulum. The necessary modifications to the previous network [13] were problem specific and not transferable to all other non-conservative physics scenarios. The modified architecture successfully models real pendulum dynamics. This case study provides a basis for future research in augmenting the symplectic gradient of a Hamiltonian energy function to provide a generalized, non-conservative physics model. A problem-specific architecture was also used to create an accurate model for the two-car intersection case. The Costate Network proved to be an improvement over the previously used Value Network [17]. Note that this comparison should be read with caution because of slight implementation differences. The development of the Costate Network provides a basis for using characteristics to decompose functions and create a simplified learning problem. This paper creates new opportunities to develop physics models, and the sample cases should be used as a guide for modeling other real and pseudo physics. Although the models studied in this paper are not generalizable, these cases provide direction for future research.
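To make the distinction between an ideal and a non-conservative pendulum concrete, the sketch below simulates both and compares their mechanical energy. It models only the underlying physics, not the learned networks described in the thesis; the linear damping term, the parameter values, and the semi-implicit Euler integrator are assumptions made for illustration.

```python
import math

def simulate(theta0, omega0, damping, g=9.81, length=1.0, dt=0.001, steps=10000):
    """Integrate theta'' = -(g/L) sin(theta) - damping * theta'
    with semi-implicit (symplectic) Euler. Returns a list of (theta, omega)."""
    theta, omega = theta0, omega0
    trajectory = [(theta, omega)]
    for _ in range(steps):
        omega += dt * (-(g / length) * math.sin(theta) - damping * omega)
        theta += dt * omega  # uses the updated omega (semi-implicit step)
        trajectory.append((theta, omega))
    return trajectory

def energy(theta, omega, g=9.81, length=1.0):
    """Mechanical energy per unit mass: kinetic plus potential."""
    return 0.5 * (length * omega) ** 2 + g * length * (1.0 - math.cos(theta))

ideal = simulate(1.0, 0.0, damping=0.0)
damped = simulate(1.0, 0.0, damping=0.5)

e0 = energy(*ideal[0])
print(f"initial energy:        {e0:.3f}")
print(f"ideal, after 10 s:     {energy(*ideal[-1]):.3f}")
print(f"damped, after 10 s:    {energy(*damped[-1]):.3f}")
```

In the ideal case the energy stays close to its initial value, which is the property a Hamiltonian-based model encodes by construction. With damping the energy decays, so a model that enforces energy conservation cannot fit the observed trajectory without modification.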

Contributors: Merry, Tanner (Author) / Ren, Yi (Thesis director) / Zhang, Wenlong (Committee member) / Mechanical and Aerospace Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

CPR complex pattern ranking for evaluating top-k pattern queries over event streams

Description

Most existing approaches to complex event processing over streaming data rely on the assumption that matches to the queries are rare and that the goal of the system is to identify these few matches within the incoming deluge of data. In many applications, however, such as stock market analysis and monitoring of users' credit card purchase patterns, the matches to user queries are in fact plentiful, and the system has to sift efficiently through these many matches to locate only the few most preferable ones. In this work, we propose a complex pattern ranking (CPR) framework for specifying top-k pattern queries over streaming data, present new algorithms to support top-k pattern queries in data streaming environments, and verify the effectiveness and efficiency of the proposed algorithms. The developed algorithms identify top-k matching results that satisfy both the patterns and additional criteria. To support real-time processing of the data streams, instead of computing top-k results from scratch for each time window, we maintain top-k results dynamically as new events arrive and old ones expire. We also develop new top-k join execution strategies that adapt to changing conditions (e.g., sorted and random access costs, join rates) without assuming that data statistics are available a priori. Experiments show significant improvements over existing approaches.
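The idea of maintaining top-k results as events arrive and expire, rather than recomputing them per window, can be sketched as follows. This is a deliberately simplified stand-in, not the CPR algorithms: it ranks single scored events instead of multi-event pattern matches, and the class name, the time-based window, and the sorted-list bookkeeping are all assumptions for illustration.

```python
import bisect
from collections import deque

class SlidingTopK:
    """Keep the k highest-scoring events within a time window."""

    def __init__(self, k, window):
        self.k = k
        self.window = window
        self.events = deque()      # (timestamp, score) in arrival order
        self.by_score = []         # ascending list of (score, timestamp)

    def add(self, timestamp, score):
        # Expire events that have fallen out of the window.
        while self.events and self.events[0][0] <= timestamp - self.window:
            old_ts, old_score = self.events.popleft()
            idx = bisect.bisect_left(self.by_score, (old_score, old_ts))
            self.by_score.pop(idx)
        # Insert the new event while keeping the score order.
        self.events.append((timestamp, score))
        bisect.insort(self.by_score, (score, timestamp))

    def top_k(self):
        """Return up to k (timestamp, score) pairs, best score first."""
        return [(ts, s) for s, ts in reversed(self.by_score[-self.k:])]

tracker = SlidingTopK(k=2, window=3)
for ts, score in [(1, 5), (2, 9), (3, 7)]:
    tracker.add(ts, score)
before = tracker.top_k()
tracker.add(5, 1)          # events at t=1 and t=2 expire
after = tracker.top_k()
print(before, after)
```

Each arrival touches only the expired events and the new one, so the ranking is updated incrementally. The list operations here are linear in the window size; a balanced tree or heap-based structure would be the natural replacement at scale.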

Contributors: Wang, Xinxin (Author) / Candan, K. Selcuk (Thesis advisor) / Chen, Yi (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created: 2011

Characterization of carbonaceous aerosol over the north Atlantic Ocean

Description

Atmospheric particulate matter has a substantial impact on global climate due to its ability to absorb/scatter solar radiation and act as cloud condensation nuclei (CCN). Yet, little is known about marine aerosol, in particular, the carbonaceous fraction. In the present work, particulate matter was collected, using High Volume (HiVol) samplers, onto quartz fiber substrates during a series of research cruises on the Atlantic Ocean. Samples were collected on board the R/V Endeavor on West–East (March–April, 2006) and East–West (June–July, 2006) transects in the North Atlantic, as well as on the R/V Polarstern during a North–South (October–November, 2005) transect along the western coast of Europe and Africa. The aerosol total carbon (TC) concentrations for the West–East (Narragansett, RI, USA to Nice, France) and East–West (Heraklion, Crete, Greece to Narragansett, RI, USA) transects were generally low over the open ocean (0.36±0.14 μg C/m3) and increased as the ship approached coastal areas (2.18±1.37 μg C/m3), due to increased terrestrial/anthropogenic aerosol inputs. The TC for the North–South transect samples decreased in the southern hemisphere with the exception of samples collected near the 15th parallel where calculations indicate the air mass back trajectories originated from the continent. Seasonal variation in organic carbon (OC) was seen in the northern hemisphere open ocean samples with average values of 0.45 μg/m3 and 0.26 μg/m3 for spring and summer, respectively. These low summer time values are consistent with SeaWiFS satellite images that show decreasing chlorophyll a concentration (a proxy for phytoplankton biomass) in the summer. There is also a statistically significant (p<0.05) decline in surface water fluorescence in the summer. 
Moreover, examination of water–soluble organic carbon (WSOC) shows that the summer aerosol samples appear to have a higher fraction of the lower molecular weight material, indicating that the samples may be more oxidized (aged). The seasonal variation in aerosol content seen during the two 2006 cruises is evidence that a primary biological marine source is a significant contributor to the carbonaceous particulate in the marine atmosphere and is consistent with previous studies of clean marine air masses.

Contributors: Hill, Hansina Rae (Author) / Herckes, Pierre (Thesis advisor) / Westerhoff, Paul (Committee member) / Hartnett, Hilairy (Committee member) / Arizona State University (Publisher)

Created: 2011

Materialized views over heterogeneous structured data sources in a distributed event stream processing environment

Description

Data-driven applications are becoming increasingly complex with support for processing events and data streams in a loosely-coupled distributed environment, providing integrated access to heterogeneous data sources such as relational databases and XML documents. This dissertation explores the use of materialized views over structured heterogeneous data sources to support multiple query optimization in a distributed event stream processing framework that supports such applications involving various query expressions for detecting events, monitoring conditions, handling data streams, and querying data. Materialized views store the results of the computed view so that subsequent access to the view retrieves the materialized results, avoiding the cost of recomputing the entire view from base data sources. Using a service-based metadata repository that provides metadata level access to the various language components in the system, a heuristics-based algorithm detects the common subexpressions from the queries represented in a mixed multigraph model over relational and structured XML data sources. These common subexpressions can be relational, XML or a hybrid join over the heterogeneous data sources. This research examines the challenges in the definition and materialization of views when the heterogeneous data sources are retained in their native format, instead of converting the data to a common model. LINQ serves as the materialized view definition language for creating the view definitions. An algorithm is introduced that uses LINQ to create a data structure for the persistence of these hybrid views. Any changes to base data sources used to materialize views are captured and mapped to a delta structure. The deltas are then streamed within the framework for use in the incremental update of the materialized view. 
Algorithms are presented that use the magic sets query optimization approach to both efficiently materialize the views and to propagate the relevant changes to the views for incremental maintenance. Using representative scenarios over structured heterogeneous data sources, an evaluation of the framework demonstrates an improvement in performance. Thus, defining the LINQ-based materialized views over heterogeneous structured data sources using the detected common subexpressions and incrementally maintaining the views by using magic sets enhances the efficiency of the distributed event stream processing environment.
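The core idea of incremental view maintenance, applying a stream of deltas to stored results instead of recomputing the view from the base data, can be sketched in a few lines. The dissertation uses LINQ and magic sets over relational and XML sources; the Python below is only a stand-in with a hypothetical two-source join (a customer lookup and an order list) and a hypothetical delta format.

```python
def apply_delta(view, customers, delta):
    """Apply one insert/delete delta on the orders source to the view.

    view: customer name -> (order count, total amount)
    delta: {"op": "insert" | "delete", "row": {"customer_id", "amount"}}
    """
    row = delta["row"]
    name = customers.get(row["customer_id"])
    if name is None:
        return  # the order does not join with any customer
    sign = 1 if delta["op"] == "insert" else -1
    count, total = view.get(name, (0, 0))
    count += sign
    total += sign * row["amount"]
    if count == 0:
        view.pop(name, None)   # no remaining orders for this customer
    else:
        view[name] = (count, total)

def materialize(customers, orders):
    """Compute the view from scratch over both base sources."""
    view = {}
    for order in orders:
        apply_delta(view, customers, {"op": "insert", "row": order})
    return view

CUSTOMERS = {1: "acme", 2: "globex"}
orders = [
    {"customer_id": 1, "amount": 10},
    {"customer_id": 1, "amount": 5},
    {"customer_id": 2, "amount": 7},
]
view = materialize(CUSTOMERS, orders)

# Stream two deltas and update the view in place.
deltas = [
    {"op": "insert", "row": {"customer_id": 2, "amount": 3}},
    {"op": "delete", "row": {"customer_id": 1, "amount": 10}},
]
for delta in deltas:
    apply_delta(view, CUSTOMERS, delta)
print(view)
```

After the two deltas, the incrementally maintained view equals what a full recomputation over the updated orders would produce, while touching only the affected entries.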

Contributors: Chaudhari, Mahesh Balkrishna (Author) / Dietrich, Suzanne W (Thesis advisor) / Urban, Susan D (Committee member) / Davulcu, Hasan (Committee member) / Chen, Yi (Committee member) / Arizona State University (Publisher)

Created: 2011