Matching Items (21)

Portable and Low-Cost Detection Platform for Hepatitis B Virus Infections

Description

Approximately 248 million people worldwide are currently living with chronic Hepatitis B virus (HBV) infection. HBV and HCV infections are the primary cause of liver diseases such as cirrhosis and hepatocellular carcinoma worldwide, with an estimated 1.4 million deaths annually. HBV in the Republic of Peru was used as a case study of an emerging and rapidly spreading disease in a developing nation, where clinical diagnosis of HBV infection in at-risk communities such as the Amazon Region and the Andes Mountains is challenging for a myriad of reasons. The high price of clinical diagnosis and limited access to treatment are the most significant deterrents keeping individuals in at-risk communities from getting much-needed help. Additionally, limited testing facilities, lack of adequate testing policies or national guidelines, poor laboratory capacity, resource-limited settings, geographical isolation, and public mistrust are among the chief reasons for low HBV testing rates. Preventative vaccination programs deployed by Peruvian health officials have reduced the number of infected individuals by year and region, but to significantly reduce or eradicate HBV in hyperendemic areas and countries such as Peru, preventative clinical diagnosis and vaccination programs are an absolute necessity. Consequently, the need for a portable, low-priced diagnostic platform for the detection of HBV and other diseases is substantial and urgent, not only in Peru but worldwide. Some of these concerns were addressed by designing a low-cost, rapid detection platform. An immunosignature technology (IMST) slide, normally used to test for reactivity against antibodies present in a serum sample, was used here to evaluate picture resolution and clarity. IMST slides were scanned using a smartphone camera placed on top of the designed device, which houses a circuit of 32 LED lights at 647 nm, a 15X optical magnifier, and a linear polarizing film sheet. Two 9V batteries powered the scanning device's LED circuit, ensuring sufficient lighting. The resulting pictures from the first prototype showed that, by illuminating the device at 647 nm and using a smartphone camera, high-resolution images could be captured. These results indicate that, with any modern smartphone camera, a small box illuminated at 647 nm, and an optical magnifier, a powerful and expensive laboratory scanning machine can be replaced by one that is inexpensive, portable, and ready to use anywhere.

Date Created
  • 2018-05

Data Representation for Predicting Harmonic Clusters with LSTM

Description

The purpose of this project is to create a useful tool for musicians that utilizes the harmonic content of their playing to recommend new, relevant chords to play. This is done by training various Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) on the lead sheets of 100 different jazz standards. A total of 200 unique datasets were produced and tested, resulting in the prediction of nearly 51 million chords. A note-prediction accuracy of 82.1% and a chord-prediction accuracy of 34.5% were achieved across all datasets. Methods of data representation that were rooted in valid music theory frameworks were found to increase the efficacy of harmonic prediction by up to 6%. Optimal LSTM input sizes were also determined for each method of data representation.
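As a rough illustration of the kind of model the project describes, the sketch below builds a small LSTM that predicts the next chord from a window of preceding chords. It is written in PyTorch purely for illustration; the class name, chord vocabulary size, and layer dimensions are assumptions, not details taken from the project.

# Illustrative sketch (not the project's actual code): a small LSTM that
# predicts the next chord in a lead-sheet sequence from the previous chords.
# Chords are assumed to be pre-encoded as integer class indices.
import torch
import torch.nn as nn

class ChordLSTM(nn.Module):
    def __init__(self, num_chords, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(num_chords, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_chords)

    def forward(self, chord_ids):                # chord_ids: (batch, seq_len)
        x = self.embed(chord_ids)                # (batch, seq_len, embed_dim)
        out, _ = self.lstm(x)                    # (batch, seq_len, hidden_dim)
        return self.head(out[:, -1, :])          # logits for the next chord

# Toy usage: 4-chord context windows drawn from an assumed vocabulary of 200 chords.
model = ChordLSTM(num_chords=200)
context = torch.randint(0, 200, (8, 4))         # batch of 8 context windows
logits = model(context)                         # (8, 200)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 200, (8,)))
loss.backward()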

Date Created
  • 2021-05

Investigating Methods of Achieving Photorealistic Materials for Augmented Reality Applications on Mobile Devices

Description

As augmented reality (AR) technology becomes more prevalent, so do methods for improving the appearance and behavior of computer-generated objects. This is especially significant as AR applications now expand to territories outside of the entertainment sphere and can be utilized for numerous purposes, including but not limited to education, specialized occupational training, retail & online shopping, design, marketing, and manufacturing. Due to the nature of AR technology, where computer-generated objects are placed into a real-world environment, a decision has to be made regarding the visual connection between the tangible and the intangible. Should the objects blend seamlessly into their environment or purposefully stand out? It is not purely a stylistic choice. A developer must consider how their application will be used: in many instances an optimal user experience is facilitated by mimicking the real world as closely as possible, and even simpler applications, such as those built primarily for mobile devices, can benefit from realistic AR. The struggle lies in creating an immersive user experience that is not reliant on computationally expensive graphics or heavy-duty models. The research contained in this thesis provides several ways of achieving photorealistic rendering in AR applications using a range of techniques, all of which are supported on mobile devices. These methods can be employed within the Unity Game Engine and incorporate shaders, render pipelines, node-based editors, post-processing, and light estimation.

Date Created
  • 2020-05

Thermal noise analysis of near-sensor image processing

Description

Commonly, image processing is handled on a CPU that is connected to the image sensor by a wire. In these far-sensor processing architectures, there is energy loss associated with sending data across an interconnect from the sensor to the CPU. In an effort to increase energy efficiency, near-sensor processing architectures have been developed, in which the sensor and processor are stacked directly on top of each other. This reduces the energy loss associated with sending data off-sensor. However, processing near the image sensor causes the sensor to heat up. Reports of thermal noise in near-sensor processing architectures motivated us to study how temperature affects image quality on a commercial image sensor and how thermal noise affects computer vision task accuracy. We analyzed image noise across nine different temperatures and three sensor configurations to determine how image noise responds to an increase in temperature. Ultimately, our team used this information, along with a transient analysis of a stacked image sensor’s thermal behavior, to recommend thermal management strategies that leverage the benefits of near-sensor processing and prevent accuracy loss at problematic temperatures.
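The noise analysis described above can be sketched, under assumed details, as estimating per-pixel temporal noise from a burst of frames captured at each temperature. The snippet below uses synthetic stand-in data and hypothetical temperature values rather than the team's actual captures.

# Minimal sketch (assumed workflow, not the authors' code): estimate temporal
# noise at each temperature from a burst of frames captured with the scene
# held fixed, then compare how the noise grows with temperature.
import numpy as np

def temporal_noise(frames):
    """frames: (num_frames, H, W) array from one capture burst.
    Returns the mean per-pixel standard deviation across the burst."""
    per_pixel_std = frames.std(axis=0)
    return per_pixel_std.mean()

# Hypothetical data: 20-frame bursts at each of nine sensor temperatures.
temps_c = [20, 25, 30, 35, 40, 45, 50, 55, 60]
bursts = {t: np.random.default_rng(t).normal(128, 1 + 0.1 * t, (20, 480, 640))
          for t in temps_c}                      # stand-in for real captures
noise_by_temp = {t: temporal_noise(b) for t, b in bursts.items()}
for t in temps_c:
    print(f"{t} C: noise ~ {noise_by_temp[t]:.2f} DN")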

Date Created
  • 2020-12

A Scalable and Programmable I/O Controller for Region-based Computing

Description

I present my work on a scalable and programmable I/O controller for region-based computing, which will be used in a rhythmic pixel-based camera pipeline. I provide a breakdown of the development and design of the I/O controller and how it fits into rhythmic pixel regions, along with a study on the memory traffic of rhythmic pixel regions and how this translates to energy efficiency. This rhythmic pixel region-based camera pipeline has been jointly developed in Dr. Robert LiKamWa’s research lab. High spatiotemporal resolution allows high precision for vision applications, such as detecting features for augmented reality or face detection. High spatiotemporal resolution also comes with high memory throughput, leading to higher energy usage. This creates a tradeoff between high precision and energy efficiency, which becomes more important in mobile systems. In addition, not all pixels in a frame are necessary for the vision application, such as pixels that make up the background. Rhythmic pixel regions aim to reduce this tradeoff by creating a pipeline that allows an application developer to specify regions to capture at a non-uniform spatiotemporal resolution. This is accomplished by encoding the incoming image and only sending the pixels within the specified regions; later, these encoded representations are decoded to a standard frame representation usable by traditional vision applications. My contribution to this effort has been the design, testing, and evaluation of the I/O controller.
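A minimal sketch of the region-based encode/decode idea described above is given below; the function names and region format are illustrative assumptions, not the lab's actual pipeline.

# Rough sketch (illustrative only): keep only the pixels inside developer-
# specified regions of interest and later paste them back into a full frame.
import numpy as np

def encode_regions(frame, regions):
    """frame: (H, W, 3) image; regions: list of (x, y, w, h) boxes.
    Returns the cropped pixel data plus the metadata needed to decode."""
    return [((x, y, w, h), frame[y:y + h, x:x + w].copy())
            for (x, y, w, h) in regions]

def decode_regions(encoded, shape, fill=0):
    """Rebuild a standard frame; pixels outside the regions get `fill`."""
    frame = np.full(shape, fill, dtype=np.uint8)
    for (x, y, w, h), pixels in encoded:
        frame[y:y + h, x:x + w] = pixels
    return frame

frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
encoded = encode_regions(frame, [(100, 50, 64, 64), (400, 300, 32, 32)])
restored = decode_regions(encoded, frame.shape)
# Only ~15 kB of pixel data is sent instead of the full ~900 kB frame.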

Date Created
  • 2020

Adaptive Lighting for Data-Driven Non-Line-Of-Sight 3D Localization

Description

Non-line-of-sight (NLOS) imaging of objects not visible to either the camera or illumination source is a challenging task with vital applications including surveillance and robotics. Recent NLOS reconstruction advances have been achieved using time-resolved measurements, but acquiring these measurements requires expensive and specialized detectors and laser sources. This work proposes a data-driven approach for NLOS 3D localization requiring only a conventional camera and projector. Localization is performed using both a voxelization formulation and a regression formulation. Accuracy greater than 90% is achieved in localizing an NLOS object to a 5 cm × 5 cm × 5 cm volume on real data. By adopting the regression approach, an object of width 10 cm is localized to within approximately 1.5 cm. To generalize to line-of-sight (LOS) scenes with non-planar surfaces, an adaptive lighting algorithm is adopted. This algorithm, based on radiosity, identifies and illuminates the scene patches in the LOS that contribute most to the NLOS light paths, and can factor in system power constraints. Improvements ranging from 6% to 15% in accuracy with a non-planar LOS wall are reported when using adaptive lighting, demonstrating the advantage of combining the physics of light transport with active illumination for data-driven NLOS imaging.
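The voxelization step can be pictured with a small sketch: positions in the hidden volume are binned into 5 cm voxels that serve as classification targets. The grid extent and origin below are assumptions for illustration, not values taken from the work.

# Illustrative sketch of the voxelization idea (assumed details): the hidden
# volume is divided into 5 cm x 5 cm x 5 cm voxels and localization is treated
# as predicting the voxel that contains the object.
import numpy as np

VOXEL_SIZE = 0.05                         # meters
VOLUME_ORIGIN = np.array([0.0, 0.0, 0.0])
GRID_SHAPE = (10, 10, 10)                 # assumed 50 cm cube hidden volume

def position_to_voxel(xyz):
    """Map a 3D position (meters) to a flat voxel class index."""
    ijk = np.floor((np.asarray(xyz) - VOLUME_ORIGIN) / VOXEL_SIZE).astype(int)
    ijk = np.clip(ijk, 0, np.array(GRID_SHAPE) - 1)
    return int(np.ravel_multi_index(tuple(ijk), GRID_SHAPE))

# A classifier trained on camera/projector measurements would output one of
# these voxel classes; a separate regressor can refine the estimate further.
label = position_to_voxel([0.23, 0.07, 0.41])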

Date Created
  • 2019

Tree-Based Deep Mixture of Experts with Applications to Visual Saliency Prediction and Quality Robust Visual Recognition

Description

Mixture of experts is a machine learning ensemble approach that consists of individual models that are trained to be ``experts'' on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not currently see wide use due to difficulty in training diverse experts and high computational requirements. This work presents modifications of the mixture of experts formulation that use domain knowledge to improve training, and incorporate parameter sharing among experts to reduce computational requirements.
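A minimal sketch of this formulation, with assumed layer sizes and names rather than the thesis implementation, looks like the following: each expert produces a prediction, a gating network produces softmax weights, and the output is their weighted sum.

# Minimal mixture-of-experts sketch (illustrative; sizes and names are assumptions).
import torch
import torch.nn as nn

class MixtureOfExperts(nn.Module):
    def __init__(self, in_dim, num_classes, num_experts=3):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                           nn.Linear(64, num_classes))
             for _ in range(num_experts)])
        self.gate = nn.Sequential(nn.Linear(in_dim, num_experts),
                                  nn.Softmax(dim=-1))

    def forward(self, x):
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, C)
        weights = self.gate(x).unsqueeze(-1)                           # (B, E, 1)
        return (weights * expert_out).sum(dim=1)                       # weighted sum

model = MixtureOfExperts(in_dim=128, num_classes=10)
out = model(torch.randn(4, 128))   # (4, 10) combined predictions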

First, this work presents an application of mixture of experts models to quality-robust visual recognition. It is shown that human subjects outperform deep neural networks on classification of distorted images, and a model, MixQualNet, that is more robust to distortions is then proposed. The proposed model consists of ``experts'' that are each trained on a particular type of image distortion. The final output of the model is a weighted sum of the expert predictions, where the weights are determined by a separate gating network. The proposed model also incorporates weight sharing to reduce the number of parameters as well as to increase performance.

Second, an application of mixture of experts to predict visual saliency is presented. A computational saliency model attempts to predict where humans will look in an image. In the proposed model, each expert network is trained to predict saliency for a set of closely related images. The final saliency map is computed as a weighted mixture of the expert networks' outputs, with weights determined by a separate gating network. The proposed model achieves better performance than several other visual saliency models and a baseline non-mixture model.

Finally, this work introduces a saliency model that is a weighted mixture of models trained for different levels of saliency. Levels of saliency include high saliency, which corresponds to regions where almost all subjects look, and low saliency, which corresponds to regions where some, but not all subjects look. The weighted mixture shows improved performance compared with baseline models because of the diversity of the individual model predictions.

Date Created
  • 2018

Viewpoint Recommendation for Aesthetic Photography

Description

This thesis addresses the problem of recommending a viewpoint for aesthetic photography. Viewpoint recommendation means suggesting the best camera pose to capture a visually pleasing photograph of the subject of interest using any end-user device such as a drone, mobile robot, or smartphone. Solving this problem enables capturing visually pleasing photographs autonomously in aerial photography, wildlife photography, landscape photography, or personal photography.

The viewpoint recommendation problem can be divided into two stages: (a) generating a set of dense novel views based on the basis views captured of the subject, where the dense novel views help in better understanding the scene and how the subject looks from different viewpoints; and (b) scoring each novel view based on how aesthetically good it is. The viewpoint with the greatest aesthetic score is recommended for capturing a visually pleasing photograph.
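Stage (b) can be sketched as scoring each synthesized view and recommending the pose with the highest score. The scorer below is a crude placeholder standing in for a learned aesthetics model; the pose format and all names are illustrative assumptions, not the thesis implementation.

# Toy sketch: score every candidate view and recommend the best camera pose.
import numpy as np

def aesthetic_score(view):
    """Placeholder for a learned aesthetics model; this stand-in simply
    rewards views that are well-exposed and bright near the center."""
    h, w = view.shape[:2]
    center = view[h // 4: 3 * h // 4, w // 4: 3 * w // 4]
    return float(center.mean() - abs(view.mean() - 128) * 0.5)

# novel_views: one rendered image per candidate camera pose (hypothetical data).
poses = [(0, 0, 1.0), (30, 0, 1.0), (60, 0, 1.0)]        # (yaw, pitch, distance)
novel_views = [np.random.randint(0, 256, (240, 320, 3)) for _ in poses]
scores = [aesthetic_score(v) for v in novel_views]
best_pose = poses[int(np.argmax(scores))]                # recommended viewpoint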

Date Created
  • 2019

Structured disentangling networks for learning deformation invariant latent spaces

Description

Disentangling latent spaces is an important research direction in the interpretability of unsupervised machine learning. Several recent works using deep learning are very effective at producing disentangled representations. However, in the unsupervised setting, there is no way to pre-specify which part of the latent space captures specific factors of variation. While this is generally a hard problem because of the non-existence of analytical expressions to capture these variations, there are certain factors, like geometric transforms, that can be expressed analytically. Furthermore, in existing frameworks, the disentangled values are also not interpretable. The focus of this work is to disentangle these geometric factors of variation (which turn out to be nuisance factors for many applications) from the semantic content of the signal in an interpretable manner, which in turn makes the features more discriminative. Experiments are designed to show the modularity of the approach with other disentangling strategies as well as on multiple one-dimensional (1D) and two-dimensional (2D) datasets, clearly indicating the efficacy of the proposed approach.
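One way to picture the approach, under assumed architectural details that are not taken from the thesis, is an autoencoder whose latent code is split into analytically interpretable geometric parameters (here a 2D rotation and translation) and a separate content code, with the geometric transform applied analytically at the decoder output.

# Rough sketch (assumed architecture): split the latent into interpretable
# geometry (rotation angle, translation) and content; decode the content to a
# canonical image, then apply the analytical geometric transform.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeomDisentangleAE(nn.Module):
    def __init__(self, content_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128),
                                     nn.ReLU(), nn.Linear(128, content_dim + 3))
        self.decoder = nn.Sequential(nn.Linear(content_dim, 128), nn.ReLU(),
                                     nn.Linear(128, 28 * 28))

    def forward(self, x):                            # x: (B, 1, 28, 28)
        z = self.encoder(x)
        theta, tx, ty = z[:, 0], z[:, 1], z[:, 2]    # interpretable geometry
        content = z[:, 3:]                           # semantic content code
        canon = self.decoder(content).view(-1, 1, 28, 28)
        # Analytical rotation + translation applied to the canonical output.
        cos, sin = torch.cos(theta), torch.sin(theta)
        mat = torch.stack([torch.stack([cos, -sin, tx], dim=1),
                           torch.stack([sin, cos, ty], dim=1)], dim=1)
        grid = F.affine_grid(mat, canon.shape, align_corners=False)
        return F.grid_sample(canon, grid, align_corners=False)

model = GeomDisentangleAE()
recon = model(torch.randn(4, 1, 28, 28))             # (4, 1, 28, 28)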

Date Created
  • 2019

Building Constraints, Geometric Invariants and Interpretability in Deep Learning: Applications in Computational Imaging and Vision

Description

Over the last decade, deep neural networks, also known as deep learning, combined with large databases and specialized hardware for computation, have made major strides in important areas such as computer vision, computational imaging, and natural language processing. However, such frameworks currently suffer from some drawbacks. For example, it is generally not clear how the architectures should be designed for different applications, how the neural networks behave under different input perturbations, or how to make the internal representations and parameters more interpretable. In this dissertation, I propose building constraints into the feature maps, parameters, and design of algorithms involving neural networks for applications in low-level vision problems such as compressive imaging and multi-spectral image fusion, and in high-level inference problems including activity and face recognition. Depending on the application, such constraints can be used to design architectures that are invariant or robust to certain nuisance factors, more efficient, and, in some cases, more interpretable. Through extensive experiments on real-world datasets, I demonstrate these advantages of the proposed methods over conventional frameworks.

Date Created
  • 2019