Matching Items (135)

The classification of domain concepts in object-oriented systems

Description

The complexity of the systems that software engineers build has continuously grown since the inception of the field. What has not changed is the engineers' mental capacity to operate on about seven distinct pieces of information at a time. The widespread use of UML has led to more abstract software design activities; however, the same cannot be said for reverse engineering activities. The introduction of abstraction to reverse engineering will allow the engineer to move farther away from the details of the system, increasing his ability to see the role that domain-level concepts play in the system. In this thesis, we present a technique that facilitates filtering of classes from existing systems at the source level based on their relationship to concepts in the domain via a classification method using machine learning. We showed that concepts can be identified using a machine learning classifier based on source-level metrics. We developed an Eclipse plugin to assist with the process of manually classifying Java source code, and collecting metrics and classifications into a standard file format. We developed an Eclipse plugin to act as a concept identifier that visually indicates a class as a domain concept or not. We minimized the size of training sets to ensure a useful approach in practice. This allowed us to determine that a training set of 7.5 to 10% is nearly as effective as a training set representing 50% of the system. We showed that random selection is the most consistent and effective means of selecting a training set. We found that KNN is the most consistent performer among the learning algorithms tested. We determined the optimal feature set for this classification problem. We discussed two possible structures besides a one-to-one mapping of domain knowledge to implementation. We showed that classes representing more than one concept are simply concepts at differing levels of abstraction. We also discussed composite concepts representing a domain concept implemented by more than one class. We showed that these composite concepts are difficult to detect because the problem is NP-complete.

Date Created
2013

Study of an epidemic multiple behavior diffusion model in a resource constrained social network

Description

In contemporary society, sustainability and public well-being have been pressing challenges. Some of the important questions are: How can sustainable practices, such as reducing carbon emission, be encouraged? How can a healthy lifestyle be maintained? Even though individuals are interested, they are unable to adopt these behaviors due to resource constraints. Developing a framework to enable cooperative behavior adoption and to sustain it for a long period of time is a major challenge. As a part of developing this framework, I am focusing on methods to understand behavior diffusion over time. Facilitating behavior diffusion with resource constraints in a large population is qualitatively different from promoting cooperation in small groups. Previous work in social sciences has derived conditions for sustainable cooperative behavior in small homogeneous groups. However, how groups of individuals with resource constraints cooperate over extended periods of time is not well understood, and is the focus of my thesis. I develop models to analyze behavior diffusion over time through the lens of epidemic models with the condition that individuals have resource constraints. I introduce an epidemic model, SVRS (Susceptible-Volatile-Recovered-Susceptible), to accommodate multiple behavior adoption. I investigate the longitudinal effects of behavior diffusion by varying different properties of an individual such as resources, threshold and cost of behavior adoption. I also consider how behavior adoption of an individual varies with her knowledge of global adoption. I evaluate my models on several synthetic topologies, such as complete regular graph, preferential attachment and small-world, and make some interesting observations. Periodic injection of early adopters can help in boosting the spread of behaviors and sustain it for a longer period of time. Also, behavior propagation for the classical epidemic model SIRS (Susceptible-Infected-Recovered-Susceptible) does not continue for an infinite period of time as per conventional wisdom. One interesting future direction is to investigate how behavior adoption is affected when the number of individuals in a network changes. The effects on behavior adoption when the availability of behavior changes with time can also be examined.

Date Created
2013

Decentralized information search

Description

Our research focuses on finding answers through decentralized search for complex, imprecise queries (such as "Which is the best hair salon nearby?") in situations where there is a spatiotemporal constraint (say, an answer needs to be found within 15 minutes) associated with the query. In general, human networks are good at answering imprecise queries. We try to use the social network of a person to answer his query. Our research aims at designing a framework that exploits the user's social network in order to maximize the answers for a given query. Exploiting a user's social network has several challenges. The major challenge is that the user's immediate social circle may not possess the answer for the given query, and hence the framework designed needs to carry out the query diffusion process across the network. The next challenge involves finding the right set of seeds in the user's social circle to pass the query to. One other challenge is to incentivize people in the social network to respond to the query and thereby maximize the quality and quantity of replies. Our proposed framework is a mobile application where an individual can either respond to the query or forward it to his friends. We simulated the query diffusion process in three types of graphs: Small World, Random and Preferential Attachment. Given a type of network and a particular query, we carried out the query diffusion by selecting seeds based on attributes of the seed. The main attributes are Topic relevance, Replying or Forwarding probability and Time to Respond. We found that there is a considerable increase in the number of replies attained, even without saturating the user's network, if we adopt an optimal seed selection process. We found the output of the optimal algorithm to be satisfactory, as the number of replies received at the interrogator's end was close to three times the number of neighbors an interrogator has. We addressed the challenge of incentivizing people to respond by associating a particular number of points with each query asked, and awarding the same to people involved in answering the query. Thus, we aim to design a mobile application based on our proposed framework so that it helps in maximizing the replies for the interrogator's query by diffusing the query across his/her social network.

Date Created
2013

A cloud based continuous delivery software developing system on Vlab platform

Description

Continuous Delivery, as one of the youngest and most popular members of the agile model family, has recently become a popular concept and method in the software development industry. Unlike the traditional software development method, in which requirements and solutions must be fixed before software development starts, it promotes adaptive planning, evolutionary development and delivery, and encourages rapid and flexible response to change. However, several problems prevent Continuous Delivery from being introduced into the education world. Taking these barriers into consideration, we propose a new Cloud based Continuous Delivery Software Developing System. This system is designed to fully utilize the whole life cycle of software development according to Continuous Delivery concepts in a virtualized environment on the Vlab platform.

Date Created
2013

Utility of considering multiple alternative rectifications in data cleaning

Description

Most data cleaning systems aim to go from a given deterministic dirty database to another deterministic but clean database. Such an enterprise presupposes that it is in fact possible for the cleaning process to uniquely recover the clean versions of each dirty data tuple. This is not possible in many cases, where the most a cleaning system can do is to generate a (hopefully small) set of clean candidates for each dirty tuple. When the cleaning system is required to output a deterministic database, it is forced to pick one clean candidate (say the "most likely" candidate) per tuple. Such an approach can lead to loss of information. For example, consider a situation where there are three equally likely clean candidates of a dirty tuple. An appealing alternative that avoids such an information loss is to abandon the requirement that the output database be deterministic. In other words, even though the input (dirty) database is deterministic, I allow the reconstructed database to be probabilistic. Although such an approach does avoid the information loss, it also brings forth several challenges. First, how many alternatives should be kept per tuple in the reconstructed database? Maintaining too many alternatives increases the size of the reconstructed database, and hence the query processing time. Second, while processing queries on the probabilistic database may well increase recall, how would they affect the precision of the query processing? In this thesis, I investigate these questions. My investigation is done in the context of a data cleaning system called BayesWipe that has the capability of producing multiple clean candidates for each dirty tuple, along with the probability that they are the correct cleaned version. I represent these alternatives as tuples in a tuple-disjoint probabilistic database, and use the Mystiq system to process queries on it. This probabilistic reconstruction (called BayesWipe–PDB) is compared to a deterministic reconstruction (called BayesWipe–DET), where the most likely clean candidate for each tuple is chosen, and the rest of the alternatives discarded.

Date Created
2013

An intelligent co-reference resolver for Winograd schema sentences containing resolved semantic entities

Description

There has been a lot of research in the field of artificial intelligence about thinking machines. Alan Turing proposed a test to observe a machine's intelligent behaviour with respect to natural language conversation. The Winograd schema challenge has been suggested as an alternative to the Turing test. It needs inferencing capabilities, reasoning abilities and background knowledge to get the answer right. It involves a coreference resolution task in which a machine is given a sentence describing a situation that involves two entities, one pronoun and some more information about the situation, and the machine has to come up with the right resolution of the pronoun to one of the entities. The complexity of the task is increased by the fact that the Winograd sentences are not constrained to one domain or a specific sentence structure, and they also contain a lot of human proper names. This makes it difficult to associate the entities with one particular word in the sentence in order to derive the answer. I have developed a pronoun resolver system for confined-domain Winograd sentences. I have developed a classifier or filter which takes input sentences and decides to accept or reject them based on particular criteria. Once a sentence is accepted, I run parsers on it to obtain a detailed analysis. Furthermore, I have developed four answering modules which use world knowledge and inferencing mechanisms to try to resolve the pronoun. The four techniques I use are: the ConceptNet knowledge base, search engine pattern counts, narrative event chains and sentiment analysis. I have developed a particular aggregation mechanism for the answers from these modules to arrive at a final answer. I have used a caching technique for the association relations that I obtain for different modules, so as to boost the performance. I run my system on the standard ‘nyu dataset’ of Winograd sentences and questions. This dataset is then restricted, by my classifier, to 90 sentences. I evaluate my system on this 90-sentence dataset. When I compare my results against the state-of-the-art system on the same dataset, I get nearly 4.5% improvement in the restricted domain.

Date Created
2013

We built this town: raising activity awareness through the workplace using gamification

Description

The wide adoption and continued advancement of information and communications technologies (ICT) have made it easier than ever for individuals and groups to stay connected over long distances. These advances have contributed greatly to changing the dynamics of the modern-day workplace, to the point where it is now commonplace to see large, distributed multidisciplinary teams working together on a daily basis. However, in this environment, motivating, understanding, and valuing the diverse contributions of individual workers in collaborative enterprises becomes challenging. To address these issues, this thesis presents the goals, design, and implementation of Taskville, a distributed workplace game played by teams on large, public displays. Taskville uses a city building metaphor to represent the completion of individual and group tasks within an organization. Promising results from two usability studies and two longitudinal studies at a multidisciplinary school demonstrate that Taskville supports personal reflection and improves team awareness through an engaging workplace activity.

Date Created
2013

Somatic ABC's: a theoretical framework for designing, developing and evaluating the building blocks of touch-based information delivery

Description

Situations of sensory overload are steadily becoming more frequent as the ubiquity of technology approaches reality--particularly with the advent of socio-communicative smartphone applications, and pervasive, high-speed wireless networks. Although the ease of accessing information has improved our communication effectiveness and efficiency, our visual and auditory modalities--those modalities that today's computerized devices and displays largely engage--have become overloaded, creating possibilities for distractions, delays and high cognitive load, which in turn can lead to a loss of situational awareness, increasing chances for life-threatening situations such as texting while driving. Surprisingly, alternative modalities for information delivery have seen little exploration. Touch, in particular, is a promising candidate given that the skin is our largest sensory organ, with impressive spatial and temporal acuity. Although some approaches have been proposed for touch-based information delivery, they are not without limitations, including high learning curves, limited applicability and/or limited expression. This is largely due to the lack of a versatile, comprehensive design theory--specifically, a theory that addresses the design of touch-based building blocks for expandable, efficient, rich and robust touch languages that are easy to learn and use. Moreover, beyond design, there is a lack of implementation and evaluation theories for such languages. To overcome these limitations, a unified, theoretical framework inspired by natural, spoken language is proposed, called Somatic ABC's, for Articulating (designing), Building (developing) and Confirming (evaluating) touch-based languages. To evaluate the usefulness of Somatic ABC's, its design, implementation and evaluation theories were applied to create communication languages for two distinct application areas: audio-described movies and motor learning. These applications were chosen as they presented opportunities for complementing communication by offloading information, typically conveyed visually and/or aurally, to the skin. For both studies, it was found that Somatic ABC's aided the design, development and evaluation of rich somatic languages with distinct and natural communication units.

Date Created
2012

Materialized views over heterogeneous structured data sources in a distributed event stream processing environment

Description

Data-driven applications are becoming increasingly complex with support for processing events and data streams in a loosely-coupled distributed environment, providing integrated access to heterogeneous data sources such as relational databases and XML documents. This dissertation explores the use of materialized views over structured heterogeneous data sources to support multiple query optimization in a distributed event stream processing framework that supports such applications involving various query expressions for detecting events, monitoring conditions, handling data streams, and querying data. Materialized views store the results of the computed view so that subsequent access to the view retrieves the materialized results, avoiding the cost of recomputing the entire view from base data sources. Using a service-based metadata repository that provides metadata level access to the various language components in the system, a heuristics-based algorithm detects the common subexpressions from the queries represented in a mixed multigraph model over relational and structured XML data sources. These common subexpressions can be relational, XML or a hybrid join over the heterogeneous data sources. This research examines the challenges in the definition and materialization of views when the heterogeneous data sources are retained in their native format, instead of converting the data to a common model. LINQ serves as the materialized view definition language for creating the view definitions. An algorithm is introduced that uses LINQ to create a data structure for the persistence of these hybrid views. Any changes to base data sources used to materialize views are captured and mapped to a delta structure. The deltas are then streamed within the framework for use in the incremental update of the materialized view. Algorithms are presented that use the magic sets query optimization approach to both efficiently materialize the views and to propagate the relevant changes to the views for incremental maintenance. Using representative scenarios over structured heterogeneous data sources, an evaluation of the framework demonstrates an improvement in performance. Thus, defining the LINQ-based materialized views over heterogeneous structured data sources using the detected common subexpressions and incrementally maintaining the views by using magic sets enhances the efficiency of the distributed event stream processing environment.

Date Created
2011

Collaboration of mobile and pervasive devices for embedded networked systems

Description

Embedded Networked Systems (ENS) consist of various devices, which are embedded into physical objects (e.g., home appliances, vehicles, buildings, people). With rapid advances in processing and networking technologies, these devices can be fully connected and pervasive in the environment. The devices can interact with the physical world, collaborate to share resources, and provide context-aware services. This dissertation focuses on collaboration in ENS to provide smart services. However, there are several challenges because the system must be scalable to a huge number of devices; robust against noise, loss and failure; and secure despite communicating with strangers. To address these challenges, first, the dissertation focuses on designing a mobile gateway called Mobile Edge Computing Device (MECD) for Ubiquitous Sensor Networks (USN), a type of ENS. In order to reduce communication overhead with the server, an MECD is designed to provide local and distributed management of a network and data associated with a moving object (e.g., a person, car, pet). Furthermore, it supports collaboration with neighboring MECDs. The MECD was developed and tested for monitoring containers during shipment from Singapore to Taiwan, where reachability of the remote server was a problem because of variance in connectivity (caused by high temperature variance) and high interference. The unreachability problem is addressed by using a mesh networking approach for collaboration of MECDs in sending data to a server. A hierarchical architecture is proposed in this regard to provide multi-level collaboration using dynamic mesh networks of MECDs at one layer. The mesh network is evaluated for an intelligent container scenario and results show complete connectivity with the server for a temperature range from 25°C to 65°C. Finally, the authentication of mobile and pervasive devices in ENS for secure collaboration is investigated. This is a challenging problem because mutually unknown devices must be verified without knowledge of each other's identity. A self-organizing region-based authentication technique is proposed that uses environmental sound to autonomously verify whether two devices are within the same region. The experimental results show sound could accurately authenticate devices within a small region.

Date Created
2010