Simultaneous segmentation and feature extraction approaches for silicon-pores sensor data are considered. Aggregating data into a matrix and performing low rank and sparse matrix decompositions with additional smoothness constraints are proposed to solve this problem. Comparison of several variants of the approaches and results for signal de-noising and translocation/trapping event extraction are presented. Algorithms to improve transform-domain features for ion-channel time-series signals based on matrix completion are presented. The improved features achieve better performance in classification tasks and in reducing the false alarm rates when applied to analyte detection.
Developing representations for multimedia is an important and challenging problem with applications ranging from scene recognition, multimedia retrieval and personal life-logging systems to field robot navigation. In this dissertation, we present a new framework for feature extraction for challenging natural environment sounds. The proposed features outperform traditional spectral features on challenging environmental sound datasets. Several algorithms are proposed that perform supervised tasks such as recognition and tag annotation. Ensemble methods are proposed to improve the tag annotation process.
To facilitate the use of large datasets, fast implementations are developed for sparse coding, the key component in our algorithms. Several strategies are proposed to speed up the Orthogonal Matching Pursuit algorithm using CUDA kernels on a GPU. Implementations are also developed for a large-scale image retrieval system. Image-based "exact search" and "visually similar search" are performed using the sparse codes of image patches. Results demonstrate a large speed-up over CPU implementations along with good retrieval performance.
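As a point of reference for the sparse coding step, a minimal serial sketch of Orthogonal Matching Pursuit is given below. It is not the GPU implementation described above: it assumes a small dense dictionary stored as a list of atoms, solves the least-squares step through the normal equations, and the names `omp`, `solve` and `dot` are illustrative only.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // assumption: dictionary stored as a list of atoms

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Solve the small dense system G c = b by Gaussian elimination with partial pivoting.
static Vec solve(Mat G, Vec b) {
    const std::size_t n = b.size();
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(G[i][k]) > std::fabs(G[p][k])) p = i;
        std::swap(G[k], G[p]);
        std::swap(b[k], b[p]);
        for (std::size_t i = k + 1; i < n; ++i) {
            const double f = G[i][k] / G[k][k];
            for (std::size_t j = k; j < n; ++j) G[i][j] -= f * G[k][j];
            b[i] -= f * b[k];
        }
    }
    Vec c(n);
    for (std::size_t k = n; k-- > 0;) {
        double s = b[k];
        for (std::size_t j = k + 1; j < n; ++j) s -= G[k][j] * c[j];
        c[k] = s / G[k][k];
    }
    return c;
}

// Returns a coefficient vector with at most `sparsity` nonzero entries.
Vec omp(const Mat& atoms, const Vec& y, std::size_t sparsity, double tol = 1e-10) {
    Vec x(atoms.size(), 0.0), residual = y;
    std::vector<std::size_t> support;
    while (support.size() < sparsity && std::sqrt(dot(residual, residual)) > tol) {
        // Step 1: pick the atom most correlated with the current residual.
        std::size_t best = 0;
        double bestCorr = -1.0;
        for (std::size_t j = 0; j < atoms.size(); ++j) {
            const double c = std::fabs(dot(atoms[j], residual));
            if (c > bestCorr) { bestCorr = c; best = j; }
        }
        bool alreadyChosen = false;
        for (std::size_t s : support) alreadyChosen = alreadyChosen || (s == best);
        if (alreadyChosen) break;  // no further progress is possible
        support.push_back(best);

        // Step 2: least-squares fit of y on the selected atoms (normal equations).
        const std::size_t m = support.size();
        Mat G(m, Vec(m));
        Vec b(m);
        for (std::size_t i = 0; i < m; ++i) {
            b[i] = dot(atoms[support[i]], y);
            for (std::size_t j = 0; j < m; ++j)
                G[i][j] = dot(atoms[support[i]], atoms[support[j]]);
        }
        const Vec c = solve(G, b);

        // Step 3: recompute the residual, which is orthogonal to the chosen atoms.
        residual = y;
        for (std::size_t i = 0; i < m; ++i) {
            x[support[i]] = c[i];
            for (std::size_t t = 0; t < y.size(); ++t)
                residual[t] -= c[i] * atoms[support[i]][t];
        }
    }
    return x;
}
```

The correlation search in step 1 is a set of independent dot products, which is the kind of work that maps naturally onto parallel hardware; how the work is actually divided among CUDA kernels is not specified here.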
Addressing the important Last Level Cache (LLC) management problem in CMPs, I observe that LLC management decisions made in isolation, as in prior proposals, often lead to sub-optimal system performance. I demonstrate that in order to maximize system performance, it is essential to manage the LLCs while being cognizant of their interaction with the system main memory. I propose ReMAP, which reduces the net memory access cost by evicting cache lines that either have no reuse or have a low memory access cost. ReMAP improves the performance of the CMP system by as much as 13%, and by an average of 6.5%.
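The eviction rule stated above can be illustrated with a small victim-selection sketch. This is only an illustration of the idea, not ReMAP itself: how reuse is predicted and how memory access cost is estimated are not described here, so the fields `predictedReuse` and `refetchCost` are hypothetical placeholders.

```cpp
#include <cstddef>
#include <vector>

struct CacheLine {
    bool predictedReuse;   // hypothetical: does a predictor expect another hit?
    unsigned refetchCost;  // hypothetical: estimated cost of fetching the line again
};

// Pick a victim within one non-empty cache set: prefer a line with no expected
// reuse; otherwise evict the line that is cheapest to bring back from memory.
std::size_t selectVictim(const std::vector<CacheLine>& set) {
    std::size_t victim = 0;
    for (std::size_t i = 0; i < set.size(); ++i) {
        if (!set[i].predictedReuse) return i;
        if (set[i].refetchCost < set[victim].refetchCost) victim = i;
    }
    return victim;
}
```

With three lines that are all expected to be reused and have costs 200, 50 and 120, the sketch picks the second line; if any line has no expected reuse, that line is chosen regardless of cost.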
Rather than the LLC, the L1 data cache has a pronounced impact on GPGPU performance by acting as the bandwidth filter for the rest of the memory subsystem. Prior work has shown that the severely constrained data cache capacity in GPGPUs leads to sub-optimal performance. In this thesis, I propose two novel techniques that address the GPGPU data cache capacity problem. I propose ID-Cache that performs effective cache bypassing and cache line size selection to improve cache capacity utilization. Next, I propose LATTE-CC that considers the GPU’s latency tolerance feature and adaptively compresses the data stored in the data cache, thereby increasing its effective capacity. ID-Cache and LATTE-CC are shown to achieve 71% and 19.2% speedup, respectively, over a wide variety of GPGPU applications.
Complementing the aforementioned microarchitecture techniques, I identify the need for system architecture innovations to sustain performance scalability of GPGPUs in the face of slowing Moore’s Law. I propose a novel GPU architecture called the Multi-Chip-Module GPU (MCM-GPU) that integrates multiple GPU modules to form a single logical GPU. With intelligent memory subsystem optimizations tailored for MCM-GPUs, it can achieve within 7% of the performance of a similar but hypothetical monolithic die GPU. Taking a step further, I present an in-depth study of the energy-efficiency characteristics of future MCM-GPUs. I demonstrate that the inherent non-uniform memory access side-effects form the key energy-efficiency bottleneck in the future.
In summary, this thesis offers key insights into the performance and energy-efficiency bottlenecks in CMPs and GPGPUs, which can guide future architects towards developing high-performance and energy-efficient general-purpose processors.
proach. There are several reasons for this. The first reason is to establish
a basis for GPU programming. Programs that utilize GPU hardware are written
with CUDA or OpenCL, which support only C and C++. FORTRAN has a feature
that lets its programs call C/C++ functions. FORTRAN sends the relevant data to
C/C++, which in turn sends that data to OpenCL. Although this approach works,
it makes the code messy and bulky and, in the end, more difficult to maintain.
Moreover, there is a slight performance decrease from the additional data copy.
This is the motivation to write the code entirely in C++, making it more uniform,
efficient and clean. The second reason is the object-oriented nature of C++. The
“abstraction”, “inheritance” and “run-time polymorphism” features of C++ provide
some form of classes and objects, the ability to build new abstractions, and some
form of run-time binding, respectively. In recent years, some popular codes that
were initially written in FORTRAN have been rewritten in C++. One such software
package is LAMMPS.
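The three C++ features named above can be seen together in a short sketch. The class names `Shape`, `Circle` and `HalfPlane` and the function `countContaining` are hypothetical and are not taken from the code described here; they only illustrate an abstract interface, derived classes, and a call resolved at run time.

```cpp
#include <cmath>
#include <memory>
#include <vector>

// Abstraction: an interface for anything that can report a signed distance.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double signedDistance(double x, double y) const = 0;
};

// Inheritance: concrete shapes derive from the common interface.
class Circle : public Shape {
public:
    Circle(double cx, double cy, double r) : cx_(cx), cy_(cy), r_(r) {}
    double signedDistance(double x, double y) const override {
        return std::hypot(x - cx_, y - cy_) - r_;  // negative inside the circle
    }
private:
    double cx_, cy_, r_;
};

class HalfPlane : public Shape {
public:
    explicit HalfPlane(double height) : height_(height) {}
    double signedDistance(double, double y) const override {
        return y - height_;  // negative below the line y = height
    }
private:
    double height_;
};

// Run-time polymorphism: the virtual call picks the right implementation
// for each object without the caller knowing its concrete type.
int countContaining(const std::vector<std::unique_ptr<Shape>>& shapes,
                    double x, double y) {
    int n = 0;
    for (const auto& s : shapes)
        if (s->signedDistance(x, y) < 0.0) ++n;
    return n;
}
```

New shapes can be added by deriving another class from `Shape`, without touching `countContaining`.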
In this code, the level set equation is solved by the RLSG method to track the interface in
two-phase flow. In gas/liquid flows, surface tension is important and exists only at
the interface. Therefore, the location and some geometric features of the interface need
to be evaluated, which can be achieved by solving the level set equation.
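To make "geometric features" concrete, the sketch below evaluates the unit normal and the curvature of a two-dimensional level set field, the quantities that surface tension models commonly need. It uses plain central finite differences on a callable field, which is a stand-in chosen for brevity and not the RLSG discretization; the names `InterfaceGeometry` and `interfaceGeometry` are hypothetical.

```cpp
#include <cmath>
#include <functional>

struct InterfaceGeometry {
    double nx, ny;     // unit normal, grad(phi) / |grad(phi)|
    double curvature;  // divergence of the unit normal
};

// Evaluate normal and curvature of phi at (x, y) with central differences of
// spacing h. Assumes |grad(phi)| is not zero at the evaluation point.
InterfaceGeometry interfaceGeometry(const std::function<double(double, double)>& phi,
                                    double x, double y, double h) {
    const double px = (phi(x + h, y) - phi(x - h, y)) / (2.0 * h);
    const double py = (phi(x, y + h) - phi(x, y - h)) / (2.0 * h);
    const double pxx = (phi(x + h, y) - 2.0 * phi(x, y) + phi(x - h, y)) / (h * h);
    const double pyy = (phi(x, y + h) - 2.0 * phi(x, y) + phi(x, y - h)) / (h * h);
    const double pxy = (phi(x + h, y + h) - phi(x + h, y - h)
                        - phi(x - h, y + h) + phi(x - h, y - h)) / (4.0 * h * h);
    const double g = std::sqrt(px * px + py * py);
    const double kappa =
        (pxx * py * py - 2.0 * px * py * pxy + pyy * px * px) / (g * g * g);
    return {px / g, py / g, kappa};
}
```

For a signed-distance field of a circle, the curvature returned on the interface should approach the reciprocal of the radius as `h` is reduced, which gives a quick sanity check.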
With solution variables projected into a kth order polynomial basis, a k+1 order convergence rate is found for both advection and reinitialization tests using the method of manufactured solutions. Other standard test cases, such as Zalesak's disk and deformation of columns and spheres in periodic vortices are also performed, showing several orders of magnitude improvement over traditional WENO level set methods. These tests also show the impact of reinitialization, which often increases shape and volume errors as a result of level set scalar trapping by normal vectors calculated from the local level set field.
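A convergence rate of this kind is usually read off from errors measured on two successively refined meshes. The helper below is a minimal sketch of that calculation; the function name `observedOrder` is hypothetical, and the errors passed to it would come from comparing the numerical solution against the manufactured one.

```cpp
#include <cmath>

// Observed order of accuracy from errors on a coarse and a fine mesh:
// p = log(errCoarse / errFine) / log(hCoarse / hFine).
double observedOrder(double errCoarse, double errFine,
                     double hCoarse, double hFine) {
    return std::log(errCoarse / errFine) / std::log(hCoarse / hFine);
}
```

As an arithmetic illustration, an error that drops by a factor of 8 when the mesh spacing is halved corresponds to an observed order of 3, the k+1 rate expected for k = 2.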
Accelerating advection via GPU hardware is found to provide a 30x speedup when comparing a 2.0 GHz Intel Xeon E5-2620 CPU running in serial with an NVIDIA Tesla K20 GPU, with speedup factors increasing with polynomial degree until shared memory is filled. A similar algorithm is implemented for reinitialization, which relies more heavily on shared and global memory; as a result, it fills them more quickly and produces a smaller speedup of 18x.
The majority of trust research has focused on the benefits trust can have for individual actors, institutions, and organizations. This “optimistic bias” is particularly evident in work focused on institutional trust, where concepts such as procedural justice, shared values, and moral responsibility have gained prominence. But trust in institutions may not be exclusively good. We reveal implications for the “dark side” of institutional trust by reviewing relevant theories and empirical research that can contribute to a more holistic understanding. We frame our discussion by suggesting there may be a “Goldilocks principle” of institutional trust, where trust that is too low (typically the focus) or too high (not usually considered by trust researchers) may be problematic. The chapter focuses on the issue of too-high trust and processes through which such too-high trust might emerge. Specifically, excessive trust might result from external, internal, and intersecting external-internal processes. External processes refer to the actions institutions take that affect public trust, while internal processes refer to intrapersonal factors affecting a trustor’s level of trust. We describe how the beneficial psychological and behavioral outcomes of trust can be mitigated or circumvented through these processes and highlight the implications of a “darkest” side of trust when they intersect. We draw upon research on organizations and legal, governmental, and political systems to demonstrate the dark side of trust in different contexts. The conclusion outlines directions for future research and encourages researchers to consider the ethical nuances of studying how to increase institutional trust.