Search Content

Efficient Java native interface for android based mobile devices

Description

Currently Java is making its way into the embedded systems and mobile devices like androids. The programs written in Java are compiled into machine independent binary class byte codes. A Java Virtual Machine (JVM) executes these classes. The Java platform additionally specifies the Java Native Interface (JNI). JNI allows Java…

Currently Java is making its way into the embedded systems and mobile devices like androids. The programs written in Java are compiled into machine independent binary class byte codes. A Java Virtual Machine (JVM) executes these classes. The Java platform additionally specifies the Java Native Interface (JNI). JNI allows Java code that runs within a JVM to interoperate with applications or libraries that are written in other languages and compiled to the host CPU ISA. JNI plays an important role in embedded system as it provides a mechanism to interact with libraries specific to the platform. This thesis addresses the overhead incurred in the JNI due to reflection and serialization when objects are accessed on android based mobile devices. It provides techniques to reduce this overhead. It also provides an API to access objects through its reference through pinning its memory location. The Android emulator was used to evaluate the performance of these techniques and we observed that there was 5 - 10 % performance gain in the new Java Native Interface.

ContributorsChandrian, Preetham (Author) / Lee, Yann-Hang (Thesis advisor) / Davulcu, Hasan (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created2011

Enhancing the usability of complex structured data by supporting keyword searches

Description

As pointed out in the keynote speech by H. V. Jagadish in SIGMOD'07, and also commonly agreed in the database community, the usability of structured data by casual users is as important as the data management systems' functionalities. A major hardness of using structured data is the problem of easily…

As pointed out in the keynote speech by H. V. Jagadish in SIGMOD'07, and also commonly agreed in the database community, the usability of structured data by casual users is as important as the data management systems' functionalities. A major hardness of using structured data is the problem of easily retrieving information from them given a user's information needs. Learning and using a structured query language (e.g., SQL and XQuery) is overwhelmingly burdensome for most users, as not only are these languages sophisticated, but the users need to know the data schema. Keyword search provides us with opportunities to conveniently access structured data and potentially significantly enhances the usability of structured data. However, processing keyword search on structured data is challenging due to various types of ambiguities such as structural ambiguity (keyword queries have no structure), keyword ambiguity (the keywords may not be accurate), user preference ambiguity (the user may have implicit preferences that are not indicated in the query), as well as the efficiency challenges due to large search space. This dissertation performs an expansive study on keyword search processing techniques as a gateway for users to access structured data and retrieve desired information. The key issues addressed include: (1) Resolving structural ambiguities in keyword queries by generating meaningful query results, which involves identifying relevant keyword matches, identifying return information, composing query results based on relevant matches and return information. (2) Resolving structural, keyword and user preference ambiguities through result analysis, including snippet generation, result differentiation, result clustering, result summarization/query expansion, etc. (3) Resolving the efficiency challenge in processing keyword search on structured data by utilizing and efficiently maintaining materialized views. These works deliver significant technical contributions towards building a full-fledged search engine for structured data.

ContributorsLiu, Ziyang (Author) / Chen, Yi (Thesis advisor) / Candan, Kasim S (Committee member) / Davulcu, Hasan (Committee member) / Jagadish, H V (Committee member) / Arizona State University (Publisher)

Created2011

Association based prioritization of genes

Description

Genes have widely different pertinences to the etiology and pathology of diseases. Thus, they can be ranked according to their disease-significance on a genomic scale, which is the subject of gene prioritization. Given a set of genes known to be related to a disease, it is reasonable to use them…

Genes have widely different pertinences to the etiology and pathology of diseases. Thus, they can be ranked according to their disease-significance on a genomic scale, which is the subject of gene prioritization. Given a set of genes known to be related to a disease, it is reasonable to use them as a basis to determine the significance of other candidate genes, which will then be ranked based on the association they exhibit with respect to the given set of known genes. Experimental and computational data of various kinds have different reliability and relevance to a disease under study. This work presents a gene prioritization method based on integrated biological networks that incorporates and models the various levels of relevance and reliability of diverse sources. The method is shown to achieve significantly higher performance as compared to two well-known gene prioritization algorithms. Essentially, no bias in the performance was seen as it was applied to diseases of diverse ethnology, e.g., monogenic, polygenic and cancer. The method was highly stable and robust against significant levels of noise in the data. Biological networks are often sparse, which can impede the operation of associationbased gene prioritization algorithms such as the one presented here from a computational perspective. As a potential approach to overcome this limitation, we explore the value that transcription factor binding sites can have in elucidating suitable targets. Transcription factors are needed for the expression of most genes, especially in higher organisms and hence genes can be associated via their genetic regulatory properties. While each transcription factor recognizes specific DNA sequence patterns, such patterns are mostly unknown for many transcription factors. Even those that are known are inconsistently reported in the literature, implying a potentially high level of inaccuracy. We developed computational methods for prediction and improvement of transcription factor binding patterns. Tests performed on the improvement method by employing synthetic patterns under various conditions showed that the method is very robust and the patterns produced invariably converge to nearly identical series of patterns. Preliminary tests were conducted to incorporate knowledge from transcription factor binding sites into our networkbased model for prioritization, with encouraging results. Genes have widely different pertinences to the etiology and pathology of diseases. Thus, they can be ranked according to their disease-significance on a genomic scale, which is the subject of gene prioritization. Given a set of genes known to be related to a disease, it is reasonable to use them as a basis to determine the significance of other candidate genes, which will then be ranked based on the association they exhibit with respect to the given set of known genes. Experimental and computational data of various kinds have different reliability and relevance to a disease under study. This work presents a gene prioritization method based on integrated biological networks that incorporates and models the various levels of relevance and reliability of diverse sources. The method is shown to achieve significantly higher performance as compared to two well-known gene prioritization algorithms. Essentially, no bias in the performance was seen as it was applied to diseases of diverse ethnology, e.g., monogenic, polygenic and cancer. The method was highly stable and robust against significant levels of noise in the data. Biological networks are often sparse, which can impede the operation of associationbased gene prioritization algorithms such as the one presented here from a computational perspective. As a potential approach to overcome this limitation, we explore the value that transcription factor binding sites can have in elucidating suitable targets. Transcription factors are needed for the expression of most genes, especially in higher organisms and hence genes can be associated via their genetic regulatory properties. While each transcription factor recognizes specific DNA sequence patterns, such patterns are mostly unknown for many transcription factors. Even those that are known are inconsistently reported in the literature, implying a potentially high level of inaccuracy. We developed computational methods for prediction and improvement of transcription factor binding patterns. Tests performed on the improvement method by employing synthetic patterns under various conditions showed that the method is very robust and the patterns produced invariably converge to nearly identical series of patterns. Preliminary tests were conducted to incorporate knowledge from transcription factor binding sites into our networkbased model for prioritization, with encouraging results. To validate these approaches in a disease-specific context, we built a schizophreniaspecific network based on the inferred associations and performed a comprehensive prioritization of human genes with respect to the disease. These results are expected to be validated empirically, but computational validation using known targets are very positive.

ContributorsLee, Jang (Author) / Gonzalez, Graciela (Thesis advisor) / Ye, Jieping (Committee member) / Davulcu, Hasan (Committee member) / Gallitano-Mendel, Amelia (Committee member) / Arizona State University (Publisher)

Created2011

Optimal location and sizing of dynamic VArs for fast voltage collapse

Description

Recent changes in the energy markets structure combined with the conti-nuous load growth have caused power systems to be operated under more stressed conditions. In addition, the nature of power systems has also grown more complex and dynamic because of the increasing use of long inter-area tie-lines and the high…

Recent changes in the energy markets structure combined with the conti-nuous load growth have caused power systems to be operated under more stressed conditions. In addition, the nature of power systems has also grown more complex and dynamic because of the increasing use of long inter-area tie-lines and the high motor loads especially those comprised mainly of residential single phase A/C motors. Therefore, delayed voltage recovery, fast voltage collapse and short term voltage stability issues in general have obtained significant importance in relia-bility studies. Shunt VAr injection has been used as a countermeasure for voltage instability. However, the dynamic and fast nature of short term voltage instability requires fast and sufficient VAr injection, and therefore dynamic VAr devices such as Static VAr Compensators (SVCs) and STATic COMpensators (STAT-COMs) are used. The location and size of such devices are optimized in order to improve their efficiency and reduce initial costs. In this work time domain dy-namic analysis was used to evaluate trajectory voltage sensitivities for each time step. Linear programming was then performed to determine the optimal amount of required VAr injection at each bus, using voltage sensitivities as weighting factors. Optimal VAr injection values from different operating conditions were weighted and averaged in order to obtain a final setting of the VAr requirement. Some buses under consideration were either assigned very small VAr injection values, or not assigned any value at all. Therefore, the approach used in this work was found to be useful in not only determining the optimal size of SVCs, but also their location.

ContributorsSalloum, Ahmed (Author) / Vittal, Vijay (Thesis advisor) / Heydt, Gerald (Committee member) / Ayyanar, Raja (Committee member) / Arizona State University (Publisher)

Created2011

Design of data acquisition system and fault current limiter for an ultra fast protection system

Description

This research work describes the design of a fault current limiter (FCL) using digital logic and a microcontroller based data acquisition system for an ultra fast pilot protection system. These systems have been designed according to the requirements of the Future Renewable Electric Energy Delivery and Management (FREEDM) system (or…

This research work describes the design of a fault current limiter (FCL) using digital logic and a microcontroller based data acquisition system for an ultra fast pilot protection system. These systems have been designed according to the requirements of the Future Renewable Electric Energy Delivery and Management (FREEDM) system (or loop), a 1 MW green energy hub. The FREEDM loop merges advanced power electronics technology with information tech-nology to form an efficient power grid that can be integrated with the existing power system. With the addition of loads to the FREEDM system, the level of fault current rises because of increased energy flow to supply the loads, and this requires the design of a limiter which can limit this current to a level which the existing switchgear can interrupt. The FCL limits the fault current to around three times the rated current. Fast switching Insulated-gate bipolar transistor (IGBT) with its gate control logic implements a switching strategy which enables this operation. A complete simulation of the system was built on Simulink and it was verified that the FCL limits the fault current to 1000 A compared to more than 3000 A fault current in the non-existence of a FCL. This setting is made user-defined. In FREEDM system, there is a need to interrupt a fault faster or make intelligent deci-sions relating to fault events, to ensure maximum availability of power to the loads connected to the system. This necessitates fast acquisition of data which is performed by the designed data acquisition system. The microcontroller acquires the data from a current transformer (CT). Mea-surements are made at different points in the FREEDM system and merged together, to input it to the intelligent protection algorithm that has been developed by another student on the project. The algorithm will generate a tripping signal in the event of a fault. The developed hardware and the programmed software to accomplish data acquisition and transmission are presented here. The designed FCL ensures that the existing switchgear equipments need not be replaced thus aiding future power system expansion. The developed data acquisition system enables fast fault sensing in protection schemes improving its reliability.

ContributorsThirumalai, Arvind (Author) / Karady, George G. (Thesis advisor) / Vittal, Vijay (Committee member) / Hedman, Kory (Committee member) / Arizona State University (Publisher)

Created2011

CPR complex pattern ranking for evaluating top-k pattern queries over event streams

Description

Most existing approaches to complex event processing over streaming data rely on the assumption that the matches to the queries are rare and that the goal of the system is to identify these few matches within the incoming deluge of data. In many applications, such as stock market analysis and…

Most existing approaches to complex event processing over streaming data rely on the assumption that the matches to the queries are rare and that the goal of the system is to identify these few matches within the incoming deluge of data. In many applications, such as stock market analysis and user credit card purchase pattern monitoring, however the matches to the user queries are in fact plentiful and the system has to efficiently sift through these many matches to locate only the few most preferable matches. In this work, we propose a complex pattern ranking (CPR) framework for specifying top-k pattern queries over streaming data, present new algorithms to support top-k pattern queries in data streaming environments, and verify the effectiveness and efficiency of the proposed algorithms. The developed algorithms identify top-k matching results satisfying both patterns as well as additional criteria. To support real-time processing of the data streams, instead of computing top-k results from scratch for each time window, we maintain top-k results dynamically as new events come and old ones expire. We also develop new top-k join execution strategies that are able to adapt to the changing situations (e.g., sorted and random access costs, join rates) without having to assume a priori presence of data statistics. Experiments show significant improvements over existing approaches.

ContributorsWang, Xinxin (Author) / Candan, K. Selcuk (Thesis advisor) / Chen, Yi (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2011

Materialized views over heterogeneous structured data sources in a distributed event stream processing environment

Description

Data-driven applications are becoming increasingly complex with support for processing events and data streams in a loosely-coupled distributed environment, providing integrated access to heterogeneous data sources such as relational databases and XML documents. This dissertation explores the use of materialized views over structured heterogeneous data sources to support multiple query…

Data-driven applications are becoming increasingly complex with support for processing events and data streams in a loosely-coupled distributed environment, providing integrated access to heterogeneous data sources such as relational databases and XML documents. This dissertation explores the use of materialized views over structured heterogeneous data sources to support multiple query optimization in a distributed event stream processing framework that supports such applications involving various query expressions for detecting events, monitoring conditions, handling data streams, and querying data. Materialized views store the results of the computed view so that subsequent access to the view retrieves the materialized results, avoiding the cost of recomputing the entire view from base data sources. Using a service-based metadata repository that provides metadata level access to the various language components in the system, a heuristics-based algorithm detects the common subexpressions from the queries represented in a mixed multigraph model over relational and structured XML data sources. These common subexpressions can be relational, XML or a hybrid join over the heterogeneous data sources. This research examines the challenges in the definition and materialization of views when the heterogeneous data sources are retained in their native format, instead of converting the data to a common model. LINQ serves as the materialized view definition language for creating the view definitions. An algorithm is introduced that uses LINQ to create a data structure for the persistence of these hybrid views. Any changes to base data sources used to materialize views are captured and mapped to a delta structure. The deltas are then streamed within the framework for use in the incremental update of the materialized view. Algorithms are presented that use the magic sets query optimization approach to both efficiently materialize the views and to propagate the relevant changes to the views for incremental maintenance. Using representative scenarios over structured heterogeneous data sources, an evaluation of the framework demonstrates an improvement in performance. Thus, defining the LINQ-based materialized views over heterogeneous structured data sources using the detected common subexpressions and incrementally maintaining the views by using magic sets enhances the efficiency of the distributed event stream processing environment.

ContributorsChaudhari, Mahesh Balkrishna (Author) / Dietrich, Suzanne W (Thesis advisor) / Urban, Susan D (Committee member) / Davulcu, Hasan (Committee member) / Chen, Yi (Committee member) / Arizona State University (Publisher)

Created2011

Network topology optimization with alternating current optimal power flow

Description

The electric transmission grid is conventionally treated as a fixed asset and is operated around a single topology. Though several instances of switching transmission lines for corrective mechaism, congestion management, and minimization of losses can be found in literature, the idea of co-optimizing transmission with generation dispatch has not been…

The electric transmission grid is conventionally treated as a fixed asset and is operated around a single topology. Though several instances of switching transmission lines for corrective mechaism, congestion management, and minimization of losses can be found in literature, the idea of co-optimizing transmission with generation dispatch has not been widely investigated. Network topology optimization exploits the redundancies that are an integral part of the network to allow for improvement in dispatch efficiency. Although, the concept of a dispatchable network initially appears counterintuitive questioning the wisdom of switching transmission lines on a more regu-lar basis, results obtained in the previous research on transmission switching with a Direct Current Optimal Power Flow (DCOPF) show significant cost reductions. This thesis on network topology optimization with ACOPF emphasizes the need for additional research in this area. It examines the performance of network topology optimization in an Alternating Current (AC) setting and its impact on various parameters like active power loss and voltages that are ignored in the DC setting. An ACOPF model, with binary variables representing the status of transmission lines incorporated into the formulation, is written in AMPL, a mathematical programming language and this optimization problem is solved using the solver KNITRO. ACOPF is a non-convex, nonlinear optimization problem, making it a very hard problem to solve. The introduction of bi-nary variables makes ACOPF a mixed integer nonlinear programming problem, further increasing the complexity of the optimization problem. An iterative method of opening each transmission line individually before choosing the best solution has been proposed as a purely investigative approach to studying the impact of transmission switching with ACOPF. Economic savings of up to 6% achieved using this approach indicate the potential of this concept. In addition, a heuristic has been proposed to improve the computational efficiency of network topology optimization. This research also makes a comparative analysis between transmission switching in a DC setting and switching in an AC setting. Results presented in this thesis indicate significant economic savings achieved by controlled topology optimization, thereby reconfirming the need for further examination of this idea.

ContributorsPotluri, Tejaswi (Author) / Hedman, Kory (Thesis advisor) / Vittal, Vijay (Committee member) / Heydt, Gerald (Committee member) / Arizona State University (Publisher)

Created2011

Client-driven dynamic database updates

Description

This thesis addresses the problem of online schema updates where the goal is to be able to update relational database schemas without reducing the database system's availability. Unlike some other work in this area, this thesis presents an approach which is completely client-driven and does not require specialized database management…

This thesis addresses the problem of online schema updates where the goal is to be able to update relational database schemas without reducing the database system's availability. Unlike some other work in this area, this thesis presents an approach which is completely client-driven and does not require specialized database management systems (DBMS). Also, unlike other client-driven work, this approach provides support for a richer set of schema updates including vertical split (normalization), horizontal split, vertical and horizontal merge (union), difference and intersection. The update process automatically generates a runtime update client from a mapping between the old the new schemas. The solution has been validated by testing it on a relatively small database of around 300,000 records per table and less than 1 Gb, but with limited memory buffer size of 24 Mb. This thesis presents the study of the overhead of the update process as a function of the transaction rates and the batch size used to copy data from the old to the new schema. It shows that the overhead introduced is minimal for medium size applications and that the update can be achieved with no more than one minute of downtime.

ContributorsTyagi, Preetika (Author) / Bazzi, Rida (Thesis advisor) / Candan, Kasim S (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2011

Large scale analytical insights of email communication patterns

Description

This thesis research attempts to observe, measure and visualize the communication patterns among developers of an open source community and analyze how this can be inferred in terms of progress of that open source project. Here I attempted to analyze the Ubuntu open source project's email data (9 subproject log…

This thesis research attempts to observe, measure and visualize the communication patterns among developers of an open source community and analyze how this can be inferred in terms of progress of that open source project. Here I attempted to analyze the Ubuntu open source project's email data (9 subproject log archives over a period of five years) and focused on drawing more precise metrics from different perspectives of the communication data. Also, I attempted to overcome the scalability issue by using Apache Pig libraries, which run on a MapReduce framework based Hadoop Cluster. I described four metrics based on which I observed and analyzed the data and also presented the results which show the required patterns and anomalies to better understand and infer the communication. Also described the usage experience with Pig Latin (scripting language of Apache Pig Libraries) for this research and how they brought the feature of scalability, simplicity, and visibility in this data intensive research work. These approaches are useful in project monitoring, to augment human observation and reporting, in social network analysis, to track individual contributions.

ContributorsMotamarri, Lakshminarayana (Author) / Santanam, Raghu (Thesis advisor) / Ye, Jieping (Thesis advisor) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2011

Filtering by