We first consider sensor fusion, a typical multimodal fusion problem critical to building a pervasive computing platform. A systematic fusion technique is described to support both multiple sensors and multiple descriptors for activity recognition. Designed to learn the optimal combination of kernels, Multiple Kernel Learning (MKL) algorithms have been successfully applied to numerous fusion problems in computer vision and other domains. Building on the MKL formulation, we next describe an auto-context algorithm for learning image context through fusion with low-level descriptors. Furthermore, a principled fusion algorithm that uses deep learning to optimize kernel machines is developed. By bridging deep architectures with kernel optimization, this approach leverages the benefits of both paradigms and is applied to a wide variety of fusion problems.
In many real-world applications, the modalities exhibit highly specific data structures, such as time sequences and graphs, and consequently a special design of the learning architecture is needed. To improve temporal modeling for multivariate sequences, we developed two architectures centered around attention models. A novel clinical time series analysis model is proposed for several critical problems in healthcare. Another model, coupled with a triplet ranking loss as a metric learning framework, is described to better solve speaker diarization. Compared to state-of-the-art recurrent networks, these attention-based multivariate analysis tools achieve improved performance while having lower computational complexity. Finally, to perform community detection on multilayer graphs, a fusion algorithm is described that derives node embeddings using word embedding techniques and also exploits the complementary relational information contained in each layer of the graph.
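The triplet ranking loss mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the hinge form, the Euclidean distance, the margin value, and the function names below are all assumptions chosen for illustration, not the thesis's implementation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_ranking_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form): the loss is zero once the
    negative is at least `margin` farther from the anchor than the positive."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (hypothetical values, not data from the thesis).
anchor = [0.0, 0.0]
same_speaker = [0.1, 0.0]
other_speaker = [2.0, 0.0]

# Well-separated triplet: no penalty.
loss_easy = triplet_ranking_loss(anchor, same_speaker, other_speaker)
# Swapped roles: the "positive" is far away, so the loss is large.
loss_hard = triplet_ranking_loss(anchor, other_speaker, same_speaker)
```

In a diarization setting, the anchor and positive would be embeddings of segments from the same speaker and the negative an embedding from a different speaker; minimizing the loss pulls same-speaker segments together in the learned metric space.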
To overcome these challenges, recent works have extensively investigated model compression techniques such as element-wise sparsity, structured sparsity, and quantization. Most of these works have applied the compression techniques in isolation, and very few studies have applied quantization and structured sparsity together to a DNN model.
This thesis co-optimizes structured sparsity and quantization constraints on DNN models during training. Specifically, it obtains an optimal setting of 2-bit weights and 2-bit activations coupled with 4X structured compression by jointly exploring quantization and structured compression settings. The optimal DNN model achieves a 50X weight memory reduction compared to a floating-point uncompressed DNN. This memory saving is significant, since applying only structured sparsity constraints achieves 2X memory savings and applying only quantization constraints achieves 16X memory savings. The algorithm has been validated on both high- and low-capacity DNNs and on wide-sparse and deep-sparse DNN models. Experiments demonstrated that a deep-sparse DNN outperforms a shallow-dense DNN, with varying levels of memory savings depending on DNN precision and sparsity levels. This work further proposed a Pareto-optimal approach to systematically extract optimal DNN models from a huge set of sparse and dense DNN models. The resulting 11 optimal designs were further evaluated by considering overall DNN memory, which includes both activation memory and weight memory. It was found that the memory footprint of the optimal designs changes only slightly for low-sparsity DNNs. However, activation memory cannot be ignored for high-sparsity DNNs.
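The Pareto-optimal selection step can be sketched as follows. This is only an illustration of the general idea of keeping non-dominated designs; the two objectives (memory and error, both minimized), the dictionary layout, and the candidate values are hypothetical and are not taken from the thesis.

```python
def pareto_front(designs):
    """Return the designs that no other design dominates.

    A design is dominated if another design is no worse in both memory
    and error and strictly better in at least one of them.
    """
    front = []
    for d in designs:
        dominated = any(
            o["memory"] <= d["memory"]
            and o["error"] <= d["error"]
            and (o["memory"] < d["memory"] or o["error"] < d["error"])
            for o in designs
        )
        if not dominated:
            front.append(d)
    # Sort by memory so the trade-off curve reads left to right.
    return sorted(front, key=lambda d: d["memory"])


# Hypothetical candidates: memory in MB, error as a fraction.
candidates = [
    {"name": "A", "memory": 1.0, "error": 0.10},
    {"name": "B", "memory": 2.0, "error": 0.08},
    {"name": "C", "memory": 2.5, "error": 0.09},  # dominated by B
    {"name": "D", "memory": 4.0, "error": 0.05},
    {"name": "E", "memory": 1.0, "error": 0.12},  # dominated by A
]

front_names = [d["name"] for d in pareto_front(candidates)]
```

The pairwise check is quadratic in the number of designs, which is usually acceptable for a sweep of a few thousand configurations; a sort-based scan would be preferable for much larger sets.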
In this work, we first apply a capsule network to the overlapping digit recognition problem. We evaluate the performance of the network with respect to recognition accuracy, convergence, and training time per epoch. We show that the capsule network achieves higher accuracy when the training set is small. When the training set is larger, the capsule network and a conventional CNN have comparable recognition accuracy. The training time per epoch is longer for the capsule network than for a conventional CNN because of the dynamic routing algorithm. An analysis of the GPU timing shows that adjusting the capsule structure can significantly decrease the time complexity of the dynamic routing algorithm.
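To make the cost of dynamic routing concrete, here is a minimal NumPy sketch of routing-by-agreement between two capsule layers. It follows the commonly published form of the algorithm; the tensor shapes, iteration count, and function names are assumptions for illustration and may differ from the network used in this work.

```python
import numpy as np


def squash(s, eps=1e-9):
    """Shrink each vector to a length in [0, 1) while keeping its direction."""
    norm_sq = np.sum(s ** 2, axis=-1, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + eps)


def dynamic_routing(u_hat, num_iters=3):
    """Route prediction vectors u_hat of shape (n_in, n_out, dim).

    Returns the output capsules v of shape (n_out, dim) and the coupling
    coefficients c of shape (n_in, n_out) used to compute them.
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # Softmax over output capsules for each input capsule.
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        # Weighted sum of predictions, then squash.
        s = np.einsum("ij,ijd->jd", c, u_hat)
        v = squash(s)
        # Raise logits where predictions agree with the output capsule.
        b = b + np.einsum("ijd,jd->ij", u_hat, v)
    return v, c


rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 4))  # 6 input capsules, 3 output capsules, dim 4
v, c = dynamic_routing(u_hat)
```

Each iteration touches every (input capsule, output capsule, dimension) entry, so the work per iteration in this sketch grows with n_in × n_out × dim. That is one way to see why changing the capsule structure, for example using fewer or lower-dimensional capsules, affects routing time.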
Next, we design a capsule network for speech recognition, specifically overlapping word recognition. We use both a capsule network and a conventional CNN to recognize 2 overlapping words in speech files created from 5 word classes. We show that the capsule network achieves considerably higher recognition accuracy (96.92%) than the conventional CNN (85.19%). Our results show that the capsule network recognizes overlapping words by recognizing each individual word in the speech. We also verify the scalability of the capsule network by increasing the number of word classes from 5 to 10. The capsule network still shows a high recognition accuracy of 95.42% with 10 word classes, while the accuracy of the conventional CNN decreases sharply to 73.18%.
Recovery of CS video frames is far more complex than recovery of still images, which are known to be (approximately) sparse in a linear basis such as the discrete cosine transform. By combining sparsity of individual frames with an optical flow-based model of inter-frame dependence, the perceptual quality and peak signal-to-noise ratio (PSNR) of reconstructed frames are improved. The efficacy of this approach is demonstrated for the cases of a priori known image motion and unknown but constant image-wide motion.
Although video sequences can be reconstructed from CS measurements, the process is computationally costly. In autonomous systems, this reconstruction step is unnecessary if higher-level conclusions can be drawn directly from the CS data. A tracking algorithm is described and evaluated which can hold target vehicles at very high levels of compression where reconstruction of video frames fails. The algorithm performs tracking by detection using a particle filter with likelihood given by a maximum average correlation height (MACH) target template model.
Motivated by possible improvements over the MACH filter-based likelihood estimation of the tracking algorithm, the application of deep learning models to detection and classification of compressively sensed images is explored. In tests, a Deep Boltzmann Machine trained on CS measurements outperforms a naive reconstruct-first approach.
Taken together, progress in these three areas of CS inference has the potential to lower system cost and improve performance, opening up new applications of CS video cameras.
Purpose: This qualitative research aimed to create a developmentally and gender-appropriate game-based intervention to promote Human Papillomavirus (HPV) vaccination in adolescents.
Background: HPV is the most common sexually transmitted infection. About 80 million Americans are currently infected, and the number continues to increase, with an estimated 14 million new cases yearly. Certain types of HPV have been significantly associated with cervical, vaginal, and vulvar cancers in women; penile cancers in men; and oropharyngeal and anal cancers in both men and women. Although HPV vaccination is one of the most effective methods of preventing HPV-associated cancers, vaccination rates remain suboptimal in adolescents. Game-based intervention, a novel medium that is popular with adolescents, has been shown to be effective in promoting health behaviors.
Methods: Sample/Sampling. We used purposeful sampling to recruit eight adolescent-parent dyads (N = 16), which represented both sexes (4 boys, 4 girls) and different racial/ethnic groups (White, Black, Latino, Asian American) in the United States. The inclusion criteria for the dyads were: (1) a child aged 11-14 years and his/her parent, and (2) ability to speak, read, write, and understand English. Procedure. After eligible families consented to participate, semi-structured interviews (each 60-90 minutes long) were conducted with each adolescent-parent dyad in a quiet and private room. Each dyad received $50 to acknowledge their time and effort. Measure. The interview questions consisted of two parts: (a) those related to game design, functioning, and feasibility of implementation; (b) those related to theoretical constructs of the Health Belief Model (HBM) and the Theory of Planned Behavior (TPB). Data analysis. The interviews were audio-recorded with permission and manually transcribed into textual data. Two researchers confirmed the verbatim transcription. We used pre-developed codes to identify each participant’s responses, organize the data, and develop themes based on the HBM and TPB constructs. After the analysis was completed, three researchers on the team reviewed the results and discussed discrepancies until a consensus was reached.
Results: The findings suggested that the most common motivating factors for adolescents’ HPV vaccination were its effectiveness, benefits, convenience, affordable cost, reminders via text, and recommendation by a health care provider. Regarding the content of the HPV game, participants suggested including information about who should receive the vaccine and when, what HPV and the vaccination are, what the consequences of infection are, the side effects of the vaccine, and where to receive the vaccine. The preferred game design elements were: a length of 15 minutes, stories about fighting or action, an option to choose characters/avatars, motivating factors (i.e., rewards such as allowing users to advance levels and receive coins when correctly answering questions), and use of a portable electronic device (e.g., tablet) to deliver the education. Participants were open to a multiplayer function that would facilitate conversation about HPV and the HPV vaccine. Overall, the participants expressed enthusiasm for an interactive and engaging game-based intervention to learn about the HPV vaccine, with the goal of increasing HPV vaccination in adolescents.
Implications: Tailored educational games have the potential to decrease the stigma of HPV and HPV vaccination, increase communication between the adolescent, parent, and healthcare provider, and increase the overall HPV vaccination rate.
This paper reports research on detecting Parkinson's disease (PD) and the effects of medication using machine learning and finger-tapping data collected through mobile devices. The primary objective of this research is to prototype a PD classification model and a medication classification model that predict, respectively, the individual’s disease status and the medication intake time relative to performing the finger-tapping activity.