Matching Items (986)

Robust margin based classifiers for small sample data

Description

In many classification problems, data samples cannot be collected easily, for example in drug trials, biological experiments, and studies of cancer patients. In many such situations the data set is small and contains many outliers. When classifying such data, for example cancer versus normal patients, the consequences of misclassification are probably more important than for any other data type, because the data point could be a cancer patient, or the classification decision could help determine which gene might be overexpressed and perhaps a cause of cancer. Misclassifications are typically more frequent in the presence of outlier data points. The aim of this thesis is to develop a maximum-margin classifier that addresses the lack of robustness of discriminant-based classifiers (like the Support Vector Machine (SVM)) to noise and outliers. The underlying notion is to adopt and develop a natural loss function that is more robust to outliers and more representative of the true loss function of the data. It is demonstrated experimentally that SVMs are indeed susceptible to outliers and that the new classifier developed, here coined Robust-SVM (RSVM), is superior to all studied classifiers on the synthetic datasets. It is superior to the SVM on both the synthetic data and the experimental data from biomedical studies, and is competitive with a classifier derived along similar lines when real-life data examples are considered.
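The abstract does not say which loss function RSVM uses. As a rough illustration of why a bounded loss is less sensitive to outliers than the SVM hinge loss, the sketch below compares the hinge loss with a clipped ("ramp") variant. The ramp loss, the `cap` parameter, and the sample margins are assumptions chosen for illustration, not the thesis's formulation.

```python
def hinge_loss(margin: float) -> float:
    """Standard SVM hinge loss: grows without bound as the margin becomes more negative."""
    return max(0.0, 1.0 - margin)


def ramp_loss(margin: float, cap: float = 2.0) -> float:
    """Hinge loss clipped at `cap` (an assumed value), so one outlier adds a bounded penalty."""
    return min(cap, hinge_loss(margin))


# margin = y * f(x) for each sample; the last point is a badly misclassified outlier.
margins = [2.0, 1.0, 0.5, -0.5, -10.0]

hinge_total = sum(hinge_loss(m) for m in margins)
ramp_total = sum(ramp_loss(m) for m in margins)

print(hinge_total)  # the outlier alone contributes 11.0 of this total
print(ramp_total)   # the outlier's contribution is capped at 2.0
```

With the hinge loss, the single outlier dominates the objective and can pull the decision boundary toward itself; with the clipped loss its influence is limited, at the price of a non-convex objective.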

Date Created
2011

Characterization of carbonaceous aerosol over the north Atlantic Ocean

Description

Atmospheric particulate matter has a substantial impact on global climate due to its ability to absorb/scatter solar radiation and act as cloud condensation nuclei (CCN). Yet, little is known about marine aerosol, in particular, the carbonaceous fraction. In the present work, particulate matter was collected, using High Volume (HiVol) samplers, onto quartz fiber substrates during a series of research cruises on the Atlantic Ocean. Samples were collected on board the R/V Endeavor on West–East (March–April, 2006) and East–West (June–July, 2006) transects in the North Atlantic, as well as on the R/V Polarstern during a North–South (October–November, 2005) transect along the western coast of Europe and Africa. The aerosol total carbon (TC) concentrations for the West–East (Narragansett, RI, USA to Nice, France) and East–West (Heraklion, Crete, Greece to Narragansett, RI, USA) transects were generally low over the open ocean (0.36±0.14 μg C/m3) and increased as the ship approached coastal areas (2.18±1.37 μg C/m3), due to increased terrestrial/anthropogenic aerosol inputs. The TC for the North–South transect samples decreased in the southern hemisphere with the exception of samples collected near the 15th parallel where calculations indicate the air mass back trajectories originated from the continent. Seasonal variation in organic carbon (OC) was seen in the northern hemisphere open ocean samples with average values of 0.45 μg/m3 and 0.26 μg/m3 for spring and summer, respectively. These low summer time values are consistent with SeaWiFS satellite images that show decreasing chlorophyll a concentration (a proxy for phytoplankton biomass) in the summer. There is also a statistically significant (p<0.05) decline in surface water fluorescence in the summer. 
Moreover, examination of water–soluble organic carbon (WSOC) shows that the summer aerosol samples appear to have a higher fraction of the lower molecular weight material, indicating that the samples may be more oxidized (aged). The seasonal variation in aerosol content seen during the two 2006 cruises is evidence that a primary biological marine source is a significant contributor to the carbonaceous particulate in the marine atmosphere and is consistent with previous studies of clean marine air masses.

Date Created
2011

Detecting sybil nodes in static and dynamic networks

Description

Peer-to-peer systems are known to be vulnerable to the Sybil attack. The lack of a central authority allows a malicious user to create many fake identities (called Sybil nodes) that pretend to be independent honest nodes. The goal of the malicious user is to influence the system in his or her favor. To detect the Sybil nodes and prevent the attack, a reputation system is used for the nodes, built by observing each node's interactions with its peers. The construction makes every node part of a distributed authority that keeps records on the reputation and behavior of the nodes. Records of interactions between nodes are broadcast by the interacting nodes, and honest reporting proves to be a Nash equilibrium for correct (non-Sybil) nodes. This research argues that, in realistic communication schedule scenarios, simple graph-theoretic queries such as the computation of Strongly Connected Components and Densest Subgraphs help expose the nodes most likely to be Sybil, which are then proved to be Sybil or not through a direct test executed by some peers.
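One of the graph queries named above, Strongly Connected Components, can be sketched as follows. This is a generic implementation of Kosaraju's algorithm, not the thesis's code; the node names and the way interaction records are turned into directed edges are hypothetical.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Return the SCCs of a directed graph given as (src, dst) pairs (Kosaraju's algorithm)."""
    graph, reverse, nodes = defaultdict(list), defaultdict(list), set()
    for src, dst in edges:
        graph[src].append(dst)
        reverse[dst].append(src)
        nodes.update((src, dst))

    # First pass: record nodes in order of DFS completion (iterative DFS).
    visited, order = set(), []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All neighbours explored: the node is finished.
                order.append(node)
                stack.pop()

    # Second pass: sweep the reversed graph in decreasing finish order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


# Hypothetical interaction records: s1, s2, s3 only vouch for each other in a cycle.
edges = [("s1", "s2"), ("s2", "s3"), ("s3", "s1"), ("h1", "h2"), ("h2", "s1")]
components = strongly_connected_components(edges)
print(components)
```

In this toy graph the three mutually reporting nodes form one component, which would make them candidates for the direct test mentioned in the abstract; a component by itself is not proof that its members are Sybil.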

Date Created
2010

Stereo based visual odometry

Description

The exponential rise in unmanned aerial vehicles has created a need for accurate pose estimation under extreme conditions. Visual Odometry (VO) is the estimation of the position and orientation of a vehicle based on analysis of a sequence of images captured from a camera mounted on it. VO offers a cheap and relatively accurate alternative to conventional odometry techniques like wheel odometry, inertial measurement systems, and the global positioning system (GPS). This thesis implements and analyzes the performance of a two-camera VO, called Stereo based visual odometry (SVO), in the presence of various deterrent factors like shadows, extremely bright outdoor scenes, and wet conditions. To allow the implementation of VO on any generic vehicle, a discussion on porting the VO algorithm to Android handsets is also presented. The SVO is implemented in three steps. In the first step, a dense disparity map for a scene is computed. To achieve this, the sum of absolute differences technique is used for stereo matching on rectified and pre-filtered stereo frames. Epipolar geometry is used to simplify the matching problem. The second step involves feature detection and temporal matching. Feature detection is carried out by the Harris corner detector. These features are matched between two consecutive frames using the Lucas-Kanade feature tracker. In the third step, the 3D coordinates of this matched set of features are computed from the disparity map obtained in the first step and are mapped into each other by a translation and a rotation. The rotation and translation are computed using least squares minimization with the aid of Singular Value Decomposition. Random Sample Consensus (RANSAC) is used for outlier detection. The accuracy of the algorithm is quantified based on the final position error, which is the difference between the final position computed by the SVO algorithm and the final ground truth position as obtained from the GPS.
The SVO showed an error of around 1% under normal conditions for a path length of 60 m and around 3% in bright conditions for a path length of 130 m. The algorithm suffered in the presence of shadows and vibrations, with errors of around 15% over path lengths of 20 m and 100 m, respectively.
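The first step described above, stereo matching with the sum of absolute differences (SAD) on rectified frames, can be sketched on a single scanline. This is a minimal illustration and not the thesis's implementation: the window size, the search range, and the intensity values are assumptions, and a real dense matcher would work on 2D windows over full images.

```python
def sad(left, right, x_left, x_right, half_window):
    """Sum of absolute differences between two windows on the same scanline."""
    return sum(
        abs(left[x_left + k] - right[x_right + k])
        for k in range(-half_window, half_window + 1)
    )


def scanline_disparity(left, right, max_disparity, half_window=1):
    """Per-pixel disparity along one rectified scanline; None where the window does not fit."""
    width = len(left)
    disparities = [None] * width
    for x in range(half_window, width - half_window):
        best_cost, best_d = None, None
        for d in range(max_disparity + 1):
            if x - d - half_window < 0:
                break  # candidate window would leave the right image
            cost = sad(left, right, x, x - d, half_window)
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities


# Hypothetical intensities: the bright feature sits 2 pixels further right in the left image.
right_row = [10, 10, 80, 200, 80, 10, 10, 10, 10, 10]
left_row = [10, 10, 10, 10, 80, 200, 80, 10, 10, 10]

disparities = scanline_disparity(left_row, right_row, max_disparity=3)
print(disparities[4:7])  # disparities around the bright feature
```

Because the frames are rectified, the epipolar constraint reduces the search to the same row, which is why a one-dimensional search over `d` is enough here. Pixels in flat regions have ambiguous matches, which is one reason pre-filtering and outlier rejection matter in practice.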

Date Created
2010

Materialized views over heterogeneous structured data sources in a distributed event stream processing environment

Description

Data-driven applications are becoming increasingly complex with support for processing events and data streams in a loosely-coupled distributed environment, providing integrated access to heterogeneous data sources such as relational databases and XML documents. This dissertation explores the use of materialized views over structured heterogeneous data sources to support multiple query optimization in a distributed event stream processing framework that supports such applications involving various query expressions for detecting events, monitoring conditions, handling data streams, and querying data. Materialized views store the results of the computed view so that subsequent access to the view retrieves the materialized results, avoiding the cost of recomputing the entire view from base data sources. Using a service-based metadata repository that provides metadata level access to the various language components in the system, a heuristics-based algorithm detects the common subexpressions from the queries represented in a mixed multigraph model over relational and structured XML data sources. These common subexpressions can be relational, XML or a hybrid join over the heterogeneous data sources. This research examines the challenges in the definition and materialization of views when the heterogeneous data sources are retained in their native format, instead of converting the data to a common model. LINQ serves as the materialized view definition language for creating the view definitions. An algorithm is introduced that uses LINQ to create a data structure for the persistence of these hybrid views. Any changes to base data sources used to materialize views are captured and mapped to a delta structure. The deltas are then streamed within the framework for use in the incremental update of the materialized view. 
Algorithms are presented that use the magic sets query optimization approach to both efficiently materialize the views and to propagate the relevant changes to the views for incremental maintenance. Using representative scenarios over structured heterogeneous data sources, an evaluation of the framework demonstrates an improvement in performance. Thus, defining the LINQ-based materialized views over heterogeneous structured data sources using the detected common subexpressions and incrementally maintaining the views by using magic sets enhances the efficiency of the distributed event stream processing environment.
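The idea of capturing base-data changes as deltas and applying them to a materialized view, instead of recomputing the view, can be sketched as below. This is a simplified Python illustration only: the dissertation uses LINQ and magic sets over relational and XML sources, none of which is reproduced here, and the view (orders per customer), the row layout, and the delta format are hypothetical.

```python
from collections import Counter


def materialize(rows):
    """Compute the view from scratch: number of orders per customer."""
    return Counter(row["customer"] for row in rows)


def apply_delta(view, delta):
    """Incrementally update the view from a delta of inserted and deleted base rows."""
    for row in delta.get("inserted", []):
        view[row["customer"]] += 1
    for row in delta.get("deleted", []):
        view[row["customer"]] -= 1
        if view[row["customer"]] == 0:
            del view[row["customer"]]  # drop groups that became empty
    return view


base = [
    {"id": 1, "customer": "ann"},
    {"id": 2, "customer": "bob"},
    {"id": 3, "customer": "ann"},
]
view = materialize(base)

# A hypothetical delta: one row inserted, one row deleted.
delta = {
    "inserted": [{"id": 4, "customer": "cy"}],
    "deleted": [{"id": 2, "customer": "bob"}],
}
view = apply_delta(view, delta)

# Recomputing from the updated base data gives the same result as the incremental update.
updated_base = [r for r in base if r["id"] != 2] + delta["inserted"]
print(view == materialize(updated_base))
```

The incremental path touches only the rows in the delta, while recomputation scans all base data; that difference is the cost the abstract says materialized views avoid.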

Date Created
2011

Synthesis and evaluation of a new class of cancer chemotherapeutics based on purine-like extended amidines

Description

A potential new class of cancer chemotherapeutic agents has been synthesized by varying the 2 position of a benzimidazole-based extended amidine. Compounds 6-amino-2-chloromethyl-4-imino-1-(2-methansulfonoxyethyl)-5-methyl-1H-benzimidazole-7-one (1A) and 6-amino-2-hydroxypropyl-4-imino-1-(2-methansulfonoxyethyl)-5-methyl-1H-benzimidazole-7-one (1B) were assayed at the National Cancer Institute's (NCI) Developmental Therapeutics Program (DTP) and found to be cytotoxic at sub-micromolar concentrations, with between a 100-fold and a 1000-fold increase in specificity towards lung, colon, CNS, and melanoma cell lines. These ATP mimics have been found to correlate with sequestosome 1 (SQSTM1), a protein implicated in drug resistance and cell survival in various cancer cell lines. Using the DTP COMPARE algorithm, compounds 1A and 1B were shown to correlate with each other at 77% but failed to correlate with other benzimidazole-based extended amidines previously synthesized in this laboratory, suggesting that they operate through a different biological mechanism.

Date Created
2011

Adaptive decentralized routing and detection of overlapping communities

Description

This dissertation studies routing in small-world networks such as grids plus long-range edges and real networks. Kleinberg showed that geography-based greedy routing in a grid-based network takes an expected number of steps polylogarithmic in the network size, thus justifying empirical efficiency observed beginning with Milgram. A counterpart for the grid-based model is provided; it creates all edges deterministically and shows an asymptotically matching upper bound on the route length. The main goal is to improve greedy routing through a decentralized machine learning process. Two considered methods are based on weighted majority and an algorithm of de Farias and Megiddo, both learning from feedback using ensembles of experts. Tests are run on both artificial and real networks, with decentralized spectral graph embedding supplying geometric information for real networks where it is not intrinsically available. An important measure analyzed in this work is overpayment, the difference between the cost of the method and that of the shortest path. Adaptive routing overtakes greedy after about a hundred or fewer searches per node, consistently across different network sizes and types. Learning stabilizes, typically at overpayment of a third to a half of that by greedy. The problem is made more difficult by eliminating the knowledge of neighbors' locations or by introducing uncooperative nodes. Even under these conditions, the learned routes are usually better than the greedy routes. The second part of the dissertation is related to the community structure of unannotated networks. A modularity-based algorithm of Newman is extended to work with overlapping communities (including considerably overlapping communities), where each node locally makes decisions to which potential communities it belongs. 
To measure the quality of a cover of overlapping communities, a notion of a node's contribution to modularity is introduced, and subsequently the notion of modularity is extended from partitions to covers. The final part considers the problem of network anonymization, mostly by means of edge deletion. The point of interest is utility preservation. It is shown that concentrating on the preservation of routing abilities might damage the preservation of community structure, and vice versa.
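The overpayment measure from the routing part of this abstract (the cost of a route minus the cost of the shortest path) can be illustrated with plain greedy routing on a small grid with one long-range edge. This sketch does not implement the learned, adaptive routing studied in the dissertation; the grid size, the position of the shortcut, and the use of hop count as cost are assumptions.

```python
from collections import deque


def manhattan(a, b):
    """Grid (lattice) distance between two nodes given as (row, col)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def build_grid(n, long_range):
    """n x n lattice with undirected long-range shortcut edges added."""
    adj = {(r, c): set() for r in range(n) for c in range(n)}
    for (r, c) in list(adj):
        for dr, dc in ((1, 0), (0, 1)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].add(nb)
                adj[nb].add((r, c))
    for a, b in long_range:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def greedy_route_length(adj, source, target):
    """Hops taken when each node forwards to the neighbour closest to the target."""
    current, steps = source, 0
    while current != target:
        # A lattice neighbour is always strictly closer, so this loop terminates.
        current = min(sorted(adj[current]), key=lambda nb: manhattan(nb, target))
        steps += 1
    return steps


def shortest_path_length(adj, source, target):
    """Hop count of a shortest path, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    raise ValueError("target unreachable")


# Hypothetical 5 x 5 grid with a single shortcut from one corner to the opposite corner.
adj = build_grid(5, long_range=[((0, 0), (4, 4))])
source, target = (1, 1), (4, 4)

greedy = greedy_route_length(adj, source, target)
shortest = shortest_path_length(adj, source, target)
overpayment = greedy - shortest
print(greedy, shortest, overpayment)
```

Greedy routing never steps away from the target, so it misses the shortcut that requires first moving back toward the corner. Feedback-driven methods such as those in the dissertation aim to reduce this kind of gap.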

Date Created
2011

Analyzing the dynamics of communication in online social networks

Description

This thesis deals with the analysis of interpersonal communication dynamics in online social networks and social media. Our central hypothesis is that communication dynamics between individuals manifest themselves via three key aspects: the information that is the content of communication; the social engagement, i.e., the sociological framework that emerges from the communication process; and the channel, i.e., the media via which communication takes place. Communication dynamics have been of interest to researchers from many domains over the past several decades. Today, however, we are faced with modern capabilities encompassing a host of social media websites. These sites feature varied interactional affordances, ranging from blogging, micro-blogging, and sharing media elements to a rich set of social actions such as tagging, voting, and commenting. Consequently, these communication tools have begun to redefine the ways in which we exchange information, our modes of social engagement, and the mechanisms by which media characteristics affect our interactional behavior. The outcomes of this research are manifold. We present our contributions in three parts, corresponding to the three key organizing ideas. First, we have observed that user context is key to characterizing communication between a pair of individuals. Interestingly, however, the probability of future communication seems to be more sensitive to context than the delay is, which appears to be rather habitual. Further, we observe that diffusion of social actions in a network can be indicative of future information cascades, which might be attributed to social influence or homophily depending on the nature of the social action.
Second, we have observed that different modes of social engagement lead to the evolution of groups that have considerable predictive capability in characterizing external-world temporal occurrences, such as stock market dynamics and collective political sentiments. Finally, characterization of communication on rich media sites has shown that conversations deemed "interesting" appear to have a consequential impact on the properties of the social network they are associated with, in terms of the degree of participation of individuals in future conversations, thematic diffusion, and emergent cohesiveness in activity among the concerned participants in the network. Based on all these outcomes, we believe that this research can make a significant contribution to a better understanding of how we communicate online and how it is redefining our collective sociological behavior.

Date Created
2011

Smooth surfaces for video game development

Description

The video game graphics pipeline has traditionally rendered the scene using a polygonal approach. Advances in modern graphics hardware now allow the rendering of parametric methods. This thesis explores various smooth surface rendering methods that can be integrated into the video game graphics engine. Moving over to parametric or smooth surfaces from the polygonal domain has its share of issues and there is an inherent need to address various rendering bottlenecks that could hamper such a move. The game engine needs to choose an appropriate method based on in-game characteristics of the objects; character and animated objects need more sophisticated methods whereas static objects could use simpler techniques. Scaling the polygon count over various hardware platforms becomes an important factor. Much control is needed over the tessellation levels, either imposed by the hardware limitations or by the application, to be able to adaptively render the mesh without significant loss in performance. This thesis explores several methods that would help game engine developers in making correct design choices by optimally balancing the trade-offs while rendering the scene using smooth surfaces. It proposes a novel technique for adaptive tessellation of triangular meshes that vastly improves speed and tessellation count. It develops an approximate method for rendering Loop subdivision surfaces on tessellation enabled hardware. A taxonomy and evaluation of the methods is provided and a unified rendering system that provides automatic level of detail by switching between the methods is proposed.

Date Created
2011

Timing and structural control of gold mineralization, Santa Gertrudis, Sonora, Mexico

Description

The Santa Gertrudis Mining District of Sonora, Mexico contains more than a dozen purported Carlin-like, sedimentary-hosted, disseminated-gold deposits. A series of near-surface, mostly oxidized gold deposits were open-pit mined from the calcareous and clastic units of the Cretaceous Bisbee Group. Gold occurs as finely disseminated, sub-micron coatings on sulfides, associated with argillization and silicification of calcareous, carbonaceous, and siliciclastic sedimentary rocks in structural settings. Gold occurs with elevated levels of As, Hg, Sb, Pb, and Zn. Downhole drill data within distal disseminated gold zones reveal a 5:1 ratio of Ag:Au and strong correlations of Au to Pb and Zn. This study explores the timing and structural control of mineralization utilizing field mapping, geochemical studies, drilling, core logging, and structural analysis. Most field evidence indicates that mineralization is related to a single pulse of moderately differentiated, Eocene intrusives described as Mo-Cu-Au skarn with structurally controlled distal disseminated As-Ag-Au.

Date Created
2011