Matching Items (72)

Diversity Promoting Online Sampling for Streaming Video Summarization

Description

Video summarization is gaining popularity in the technological culture, where positioning the mouse pointer on top of a video results in a quick overview of what the video is about.

Video summarization is gaining popularity in the technological culture, where positioning the mouse pointer on top of a video results in a quick overview of what the video is about. The algorithm usually selects frames in a time sequence through systematic sampling. Invariably, there are other applications like video surveillance, web-based video surfing and video archival applications which can benefit from efficient and concise video summaries. In this project, we explored several clustering algorithms and how these can be combined and deconstructed to make summarization algorithm more efficient and relevant. We focused on two metrics to summarize: reducing error and redundancy in the summary. To reduce the error online k-means clustering algorithm was used; to reduce redundancy we applied two different methods: volume of convex hulls and the true diversity measure that is usually used in biological disciplines. The algorithm was efficient and computationally cost effective due to its online nature. The diversity maximization (or redundancy reduction) using technique of volume of convex hulls showed better results compared to other conventional methods on 50 different videos. For the true diversity measure, there has not been much work done on the nature of the measure in the context of video summarization. When we applied it, the algorithm stalled due to the true diversity saturating because of the inherent initialization present in the algorithm. We explored the nature of this measure to gain better understanding on how it can help to make summarization more intuitive and give the user a handle to customize the summary.

Contributors

Agent

Created

Date Created
  • 2017-05

131537-Thumbnail Image.png

Topological Descriptors for Parkinson's Disease Classification and Regression Analysis

Description

At present, the vast majority of human subjects with neurological disease are still diagnosed through in-person assessments and qualitative analysis of patient data. In this paper, we propose to use

At present, the vast majority of human subjects with neurological disease are still diagnosed through in-person assessments and qualitative analysis of patient data. In this paper, we propose to use Topological Data Analysis (TDA) together with machine learning tools to automate the process of Parkinson’s disease classification and severity assessment. An automated, stable, and accurate method to evaluate Parkinson’s would be significant in streamlining diagnoses of patients and providing families more time for corrective measures. We propose a methodology which incorporates TDA into analyzing Parkinson’s disease postural shifts data through the representation of persistence images. Studying the topology of a system has proven to be invariant to small changes in data and has been shown to perform well in discrimination tasks. The contributions of the paper are twofold. We propose a method to 1) classify healthy patients from those afflicted by disease and 2) diagnose the severity of disease. We explore the use of the proposed method in an application involving a Parkinson’s disease dataset comprised of healthy-elderly, healthy-young and Parkinson’s disease patients.

Contributors

Agent

Created

Date Created
  • 2020-05

Wearable Device Activity Classification With Machine Learning and a Custom Web Application

Description

Human activity recognition is the task of identifying a person’s movement from sensors in a wearable device, such as a smartphone, smartwatch, or a medical-grade device. A great method

Human activity recognition is the task of identifying a person’s movement from sensors in a wearable device, such as a smartphone, smartwatch, or a medical-grade device. A great method for this task is machine learning, which is the study of algorithms that learn and improve on their own with the help of massive amounts of useful data. These classification models can accurately classify activities with the time-series data from accelerometers and gyroscopes. A significant way to improve the accuracy of these machine learning models is preprocessing the data, essentially augmenting data to make the identification of each activity, or class, easier for the model. <br/>On this topic, this paper explains the design of SigNorm, a new web application which lets users conveniently transform time-series data and view the effects of those transformations in a code-free, browser-based user interface. The second and final section explains my take on a human activity recognition problem, which involves comparing a preprocessed dataset to an un-augmented one, and comparing the differences in accuracy using a one-dimensional convolutional neural network to make classifications.

Contributors

Agent

Created

Date Created
  • 2021-05

158890-Thumbnail Image.png

Nurturing Open Design: Challenges and Opportunities for HCI to Support Crowd-driven Hardware Design

Description

Open Design is a crowd-driven global ecosystem which tries to challenge and alter contemporary modes of capitalistic hardware production. It strives to build on the collective skills, expertise and efforts

Open Design is a crowd-driven global ecosystem which tries to challenge and alter contemporary modes of capitalistic hardware production. It strives to build on the collective skills, expertise and efforts of people regardless of their educational, social or political backgrounds to develop and disseminate physical products, machines and systems. In contrast to capitalistic hardware production, Open Design practitioners publicly share design files, blueprints and knowhow through various channels including internet platforms and in-person workshops. These designs are typically replicated, modified, improved and reshared by individuals and groups who are broadly referred to as ‘makers’.

This dissertation aims to expand the current scope of Open Design within human-computer interaction (HCI) research through a long-term exploration of Open Design’s socio-technical processes. I examine Open Design from three perspectives: the functional—materials, tools, and platforms that enable crowd-driven open hardware production, the critical—materially-oriented engagements within open design as a site for sociotechnical discourse, and the speculative—crowd-driven critical envisioning of future hardware.

More specifically, this dissertation first explores the growing global scene of Open Design through a long-term ethnographic study of the open science hardware (OScH) movement, a genre of Open Design. This long-term study of OScH provides a focal point for HCI to deeply understand Open Design's growing global landscape. Second, it examines the application of Critical Making within Open Design through an OScH workshop with designers, engineers, artists and makers from local communities. This work foregrounds the role of HCI researchers as facilitators of collaborative critical engagements within Open Design. Third, this dissertation introduces the concept of crowd-driven Design Fiction through the development of a publicly accessible online Design Fiction platform named Dream Drones. Through a six month long development and a study with drone related practitioners, it offers several pragmatic insights into the challenges and opportunities for crowd-driven Design Fiction. Through these explorations, I highlight the broader implications and novel research pathways for HCI to shape and be shaped by the global Open Design movement.

Contributors

Agent

Created

Date Created
  • 2020

158817-Thumbnail Image.png

Building Invariant, Robust And Stable Machine Learning Systems Using Geometry and Topology

Description

Over the past decade, machine learning research has made great strides and significant impact in several fields. Its success is greatly attributed to the development of effective machine learning algorithms

Over the past decade, machine learning research has made great strides and significant impact in several fields. Its success is greatly attributed to the development of effective machine learning algorithms like deep neural networks (a.k.a. deep learning), availability of large-scale databases and access to specialized hardware like Graphic Processing Units. When designing and training machine learning systems, researchers often assume access to large quantities of data that capture different possible variations. Variations in the data is needed to incorporate desired invariance and robustness properties in the machine learning system, especially in the case of deep learning algorithms. However, it is very difficult to gather such data in a real-world setting. For example, in certain medical/healthcare applications, it is very challenging to have access to data from all possible scenarios or with the necessary amount of variations as required to train the system. Additionally, the over-parameterized and unconstrained nature of deep neural networks can cause them to be poorly trained and in many cases over-confident which, in turn, can hamper their reliability and generalizability. This dissertation is a compendium of my research efforts to address the above challenges. I propose building invariant feature representations by wedding concepts from topological data analysis and Riemannian geometry, that automatically incorporate the desired invariance properties for different computer vision applications. I discuss how deep learning can be used to address some of the common challenges faced when working with topological data analysis methods. I describe alternative learning strategies based on unsupervised learning and transfer learning to address issues like dataset shifts and limited training data. Finally, I discuss my preliminary work on applying simple orthogonal constraints on deep learning feature representations to help develop more reliable and better calibrated models.

Contributors

Agent

Created

Date Created
  • 2020

152100-Thumbnail Image.png

Decentralized information search

Description

Our research focuses on finding answers through decentralized search, for complex, imprecise queries (such as "Which is the best hair salon nearby?") in situations where there is a spatiotemporal constraint

Our research focuses on finding answers through decentralized search, for complex, imprecise queries (such as "Which is the best hair salon nearby?") in situations where there is a spatiotemporal constraint (say answer needs to be found within 15 minutes) associated with the query. In general, human networks are good in answering imprecise queries. We try to use the social network of a person to answer his query. Our research aims at designing a framework that exploits the user's social network in order to maximize the answers for a given query. Exploiting an user's social network has several challenges. The major challenge is that the user's immediate social circle may not possess the answer for the given query, and hence the framework designed needs to carry out the query diffusion process across the network. The next challenge involves in finding the right set of seeds to pass the query to in the user's social circle. One other challenge is to incentivize people in the social network to respond to the query and thereby maximize the quality and quantity of replies. Our proposed framework is a mobile application where an individual can either respond to the query or forward it to his friends. We simulated the query diffusion process in three types of graphs: Small World, Random and Preferential Attachment. Given a type of network and a particular query, we carried out the query diffusion by selecting seeds based on attributes of the seed. The main attributes are Topic relevance, Replying or Forwarding probability and Time to Respond. We found that there is a considerable increase in the number of replies attained, even without saturating the user's network, if we adopt an optimal seed selection process. We found the output of the optimal algorithm to be satisfactory as the number of replies received at the interrogator's end was close to three times the number of neighbors an interrogator has. We addressed the challenge of incentivizing people to respond by associating a particular amount of points for each query asked, and awarding the same to people involved in answering the query. Thus, we aim to design a mobile application based on our proposed framework so that it helps in maximizing the replies for the interrogator's query by diffusing the query across his/her social network.

Contributors

Agent

Created

Date Created
  • 2013

152122-Thumbnail Image.png

Exploring video denoising using matrix completion

Description

Video denoising has been an important task in many multimedia and computer vision applications. Recent developments in the matrix completion theory and emergence of new numerical methods which can efficiently

Video denoising has been an important task in many multimedia and computer vision applications. Recent developments in the matrix completion theory and emergence of new numerical methods which can efficiently solve the matrix completion problem have paved the way for exploration of new techniques for some classical image processing tasks. Recent literature shows that many computer vision and image processing problems can be solved by using the matrix completion theory. This thesis explores the application of matrix completion in video denoising. A state-of-the-art video denoising algorithm in which the denoising task is modeled as a matrix completion problem is chosen for detailed study. The contribution of this thesis lies in both providing extensive analysis to bridge the gap in existing literature on matrix completion frame work for video denoising and also in proposing some novel techniques to improve the performance of the chosen denoising algorithm. The chosen algorithm is implemented for thorough analysis. Experiments and discussions are presented to enable better understanding of the problem. Instability shown by the algorithm at some parameter values in a particular case of low levels of pure Gaussian noise is identified. Artifacts introduced in such cases are analyzed. A novel way of grouping structurally-relevant patches is proposed to improve the algorithm. Experiments show that this technique is useful, especially in videos containing high amounts of motion. Based on the observation that matrix completion is not suitable for denoising patches containing relatively low amount of image details, a framework is designed to separate patches corresponding to low structured regions from a noisy image. Experiments are conducted by not subjecting such patches to matrix completion, instead denoising such patches in a different way. The resulting improvement in performance suggests that denoising low structured patches does not require a complex method like matrix completion and in fact it is counter-productive to subject such patches to matrix completion. These results also indicate the inherent limitation of matrix completion to deal with cases in which noise dominates the structural properties of an image. A novel method for introducing priorities to the ranked patches in matrix completion is also presented. Results showed that this method yields improved performance in general. It is observed that the artifacts in presence of low levels of pure Gaussian noise appear differently after introducing priorities to the patches and the artifacts occur at a wider range of parameter values. Results and discussion suggesting future ways to explore this problem are also presented.

Contributors

Agent

Created

Date Created
  • 2013

156919-Thumbnail Image.png

Advances in Motion Estimators for Applications in Computer Vision

Description

Motion estimation is a core task in computer vision and many applications utilize optical flow methods as fundamental tools to analyze motion in images and videos. Optical flow is the

Motion estimation is a core task in computer vision and many applications utilize optical flow methods as fundamental tools to analyze motion in images and videos. Optical flow is the apparent motion of objects in image sequences that results from relative motion between the objects and the imaging perspective. Today, optical flow fields are utilized to solve problems in various areas such as object detection and tracking, interpolation, visual odometry, etc. In this dissertation, three problems from different areas of computer vision and the solutions that make use of modified optical flow methods are explained.

The contributions of this dissertation are approaches and frameworks that introduce i) a new optical flow-based interpolation method to achieve minimally divergent velocimetry data, ii) a framework that improves the accuracy of change detection algorithms in synthetic aperture radar (SAR) images, and iii) a set of new methods to integrate Proton Magnetic Resonance Spectroscopy (1HMRSI) data into threedimensional (3D) neuronavigation systems for tumor biopsies.

In the first application an optical flow-based approach for the interpolation of minimally divergent velocimetry data is proposed. The velocimetry data of incompressible fluids contain signals that describe the flow velocity. The approach uses the additional flow velocity information to guide the interpolation process towards reduced divergence in the interpolated data.

In the second application a framework that mainly consists of optical flow methods and other image processing and computer vision techniques to improve object extraction from synthetic aperture radar images is proposed. The proposed framework is used for distinguishing between actual motion and detected motion due to misregistration in SAR image sets and it can lead to more accurate and meaningful change detection and improve object extraction from a SAR datasets.

In the third application a set of new methods that aim to improve upon the current state-of-the-art in neuronavigation through the use of detailed three-dimensional (3D) 1H-MRSI data are proposed. The result is a progressive form of online MRSI-guided neuronavigation that is demonstrated through phantom validation and clinical application.

Contributors

Agent

Created

Date Created
  • 2018

157215-Thumbnail Image.png

Adaptive Lighting for Data-Driven Non-Line-Of-Sight 3D Localization

Description

Non-line-of-sight (NLOS) imaging of objects not visible to either the camera or illumina-

tion source is a challenging task with vital applications including surveillance and robotics.

Recent NLOS reconstruction advances have been

Non-line-of-sight (NLOS) imaging of objects not visible to either the camera or illumina-

tion source is a challenging task with vital applications including surveillance and robotics.

Recent NLOS reconstruction advances have been achieved using time-resolved measure-

ments. Acquiring these time-resolved measurements requires expensive and specialized

detectors and laser sources. In work proposes a data-driven approach for NLOS 3D local-

ization requiring only a conventional camera and projector. The localisation is performed

using a voxelisation and a regression problem. Accuracy of greater than 90% is achieved

in localizing a NLOS object to a 5cm × 5cm × 5cm volume in real data. By adopting

the regression approach an object of width 10cm to localised to approximately 1.5cm. To

generalize to line-of-sight (LOS) scenes with non-planar surfaces, an adaptive lighting al-

gorithm is adopted. This algorithm, based on radiosity, identifies and illuminates scene

patches in the LOS which most contribute to the NLOS light paths, and can factor in sys-

tem power constraints. Improvements ranging from 6%-15% in accuracy with a non-planar

LOS wall using adaptive lighting is reported, demonstrating the advantage of combining

the physics of light transport with active illumination for data-driven NLOS imaging.

Contributors

Agent

Created

Date Created
  • 2019

DeepCrashTest: Translating Dashcam Videos to Virtual Tests forAutomated Driving Systems

Description

The autonomous vehicle technology has come a long way, but currently, there are no companies that are able to offer fully autonomous ride in any conditions, on any road without

The autonomous vehicle technology has come a long way, but currently, there are no companies that are able to offer fully autonomous ride in any conditions, on any road without any human supervision. These systems should be extensively trained and validated to guarantee safe human transportation. Any small errors in the system functionality may lead to fatal accidents and may endanger human lives. Deep learning methods are widely used for environment perception and prediction of hazardous situations. These techniques require huge amount of training data with both normal and abnormal samples to enable the vehicle to avoid a dangerous situation.

The goal of this thesis is to generate simulations from real-world tricky collision scenarios for training and testing autonomous vehicles. Dashcam crash videos from the internet can now be utilized to extract valuable collision data and recreate the crash scenarios in a simulator. The problem of extracting 3D vehicle trajectories from videos recorded by an unknown monocular camera source is solved using a modular approach. The framework is divided into two stages: (a) extracting meaningful adversarial trajectories from short crash videos, and (b) developing methods to automatically process and simulate the vehicle trajectories on a vehicle simulator.

Contributors

Agent

Created

Date Created
  • 2019