Wearable Device Activity Classification With Machine Learning and a Custom Web Application

Description

Human activity recognition is the task of identifying a person’s movement from sensors in a wearable device, such as a smartphone, smartwatch, or medical-grade device. Machine learning, the study of algorithms that learn and improve on their own with the help of large amounts of useful data, is well suited to this task. Classification models can accurately classify activities from the time-series data produced by accelerometers and gyroscopes. A significant way to improve the accuracy of these models is to preprocess the data, essentially augmenting it so that each activity, or class, is easier for the model to identify. On this topic, this paper explains the design of SigNorm, a new web application that lets users conveniently transform time-series data and view the effects of those transformations in a code-free, browser-based user interface. The second and final section explains my take on a human activity recognition problem, which involves comparing a preprocessed dataset to an un-augmented one and measuring the difference in accuracy when a one-dimensional convolutional neural network makes the classifications.

Contributors: Li, Vincent (Author) / Turaga, Pavan (Thesis director) / Buman, Matthew (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

Differences in intelligibility of vocoded speech for adult Mandarin and English speakers

Description

The ability of cochlear implants (CI) to restore auditory function has advanced significantly in the past decade. Approximately 96,000 people in the United States benefit from these devices, which, by generating and transmitting electrical impulses, enable the brain to perceive sound. But due to the predominantly Western cochlear implant market, current CI characterization primarily focuses on improving the quality of American English. Only recently has research begun to evaluate CI performance using other languages such as Mandarin Chinese, which rely on distinct spectral characteristics not present in English. Mandarin, a tonal language, uses four distinct pitch patterns which, when voiced on a syllable, convey different meanings for the same word. This presents a challenge to hearing research, as spectral, or frequency-based, information such as pitch is readily acknowledged to be significantly reduced by CI processing algorithms. Thus the present study sought to identify the intelligibility differences for English and Mandarin when processed using current CI strategies. The objective of the study was to pinpoint any notable discrepancies in speech recognition, using voice-coded (vocoded) audio that simulates CI-generated stimuli. This approach allowed 12 normal-hearing English speakers and 9 normal-hearing Mandarin listeners to participate in the experiment. The number of frequency channels available and the carrier type of excitation were varied in order to compare their effects on two cases of Mandarin intelligibility: Case 1) word recognition and Case 2) combined word and tone recognition. The results indicated a statistically significant difference between English and Mandarin intelligibility for Condition 1 (8Ch-Sinewave Carrier, p=0.022) given Case 1, and for Condition 1 (8Ch-Sinewave Carrier, p=0.001) and Condition 3 (16Ch-Sinewave Carrier, p=0.001) given Case 2. The data suggest that the nature of the carrier type does have an effect on tonal language intelligibility and warrants further research as a design consideration for future cochlear implants.

Contributors: Schiltz, Jessica Hammitt (Author) / Berisha, Visar (Thesis director) / Frakes, David (Committee member) / Barrett, The Honors College (Contributor) / Harrington Bioengineering Program (Contributor)

Created: 2015-05

Topological Descriptors for Parkinson's Disease Classification and Regression Analysis

Description

At present, the vast majority of human subjects with neurological disease are still diagnosed through in-person assessments and qualitative analysis of patient data. In this paper, we propose to use Topological Data Analysis (TDA) together with machine learning tools to automate the process of Parkinson’s disease classification and severity assessment. An automated, stable, and accurate method to evaluate Parkinson’s would be significant in streamlining diagnoses of patients and providing families more time for corrective measures. We propose a methodology that incorporates TDA into the analysis of Parkinson’s disease postural-shift data through the representation of persistence images. The topology of a system has proven to be invariant to small changes in data and has been shown to perform well in discrimination tasks. The contributions of the paper are twofold. We propose a method to 1) distinguish healthy subjects from those afflicted by disease and 2) assess the severity of disease. We explore the use of the proposed method in an application involving a Parkinson’s disease dataset comprising healthy-elderly, healthy-young, and Parkinson’s disease patients.

Contributors: Rahman, Farhan Nadir (Co-author) / Nawar, Afra (Co-author) / Turaga, Pavan (Thesis director) / Krishnamurthi, Narayanan (Committee member) / Electrical Engineering Program (Contributor) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2020-05

Let's Talk Monkey- Quantitative Analysis of Marmoset Monkey Calls

Description

The marmoset monkey (Callithrix jacchus) is a new-world primate species native to South American rainforests. Because they rely on vocal communication to navigate and survive, marmosets have emerged as a promising primate model to study vocal production, perception, cognition, and social interactions. The purpose of this project is to provide an initial assessment of the vocal repertoire of a marmoset colony raised at Arizona State University and the call types they use in different social conditions. The vocal production of a colony of 16 marmoset monkeys was recorded in three different conditions with three repeats of each condition. The positive condition involves a caretaker distributing food, the negative condition involves an experimenter taking a marmoset out of its cage to a different room, and the control condition is the normal state of the colony with no human interference. A total of 5396 samples of calls were collected during a total of 256 minutes of audio recordings. Call types were analyzed in semi-automated computer programs developed in the Laboratory of Auditory Computation and Neurophysiology. A total of 5 major call types were identified and their variants in different social conditions were analyzed. The results showed that the total number of calls and the type of calls made differed in the three social conditions, suggesting that monkey vocalization both signals and depends on the social context.

Contributors: Fernandez, Jessmin Natalie (Author) / Zhou, Yi (Thesis director) / Berisha, Visar (Committee member) / School of International Letters and Cultures (Contributor) / Department of Psychology (Contributor) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-05

An Algorithm for the Automatic Detection of Vocal Flutter

Description

Detecting early signs of neurodegeneration is vital for measuring the efficacy of pharmaceuticals and planning treatments for neurological diseases. This is especially true for Amyotrophic Lateral Sclerosis (ALS), where differences in symptom onset can be indicative of the prognosis. Because they can be measured noninvasively, changes in speech production have been proposed as a promising indicator of neurological decline. However, speech changes are typically measured subjectively by a clinician. These perceptual ratings can vary widely between clinicians and within the same clinician on different patient visits, making clinical ratings less sensitive to subtle early indicators. In this paper, we propose an algorithm for the objective measurement of flutter, a quasi-sinusoidal modulation of fundamental frequency that manifests in the speech of some ALS patients. The algorithm detailed in this paper employs long-term average spectral analysis on the residual F0 track of a sustained phonation to detect the presence of flutter and is robust to longitudinal drifts in F0. The algorithm is evaluated on a longitudinal speech dataset of ALS patients at varying stages of disease progression. Benchmarking with two stages of perceptual ratings provided by an expert speech pathologist indicates that the algorithm follows perceptual ratings with moderate accuracy and can objectively detect flutter in instances where the variability of the perceptual rating causes uncertainty.

Contributors: Peplinski, Jacob Scott (Author) / Berisha, Visar (Thesis director) / Liss, Julie (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Using Goodness of Pronunciation Features for Spoken Nasality Detection

Description

Speech nasality disorders are characterized by abnormal resonance in the nasal cavity. Hypernasal speech is of particular interest: it is characterized by an inability to prevent improper nasalization of vowels and by poor articulation of plosive and fricative consonants, and it can lead to negative communicative and social consequences. It can be associated with a range of conditions, including cleft lip or palate, velopharyngeal dysfunction (a physically or neurologically defective closure of the soft palate, which regulates resonance between the oral and nasal cavities), dysarthria, or hearing impairment, and can also be an early indicator of developing neurological disorders such as ALS. Hypernasality is typically scored perceptually by a speech-language pathologist (SLP). Misdiagnosis could lead to inadequate treatment plans and poor treatment outcomes for a patient. Also, for some applications, particularly screening for early neurological disorders, the use of an SLP is not practical. Hence this work demonstrates a data-driven approach to objective assessment of hypernasality through the use of Goodness of Pronunciation features. These features capture the overall precision of articulation of a speaker on a phoneme-by-phoneme basis, allowing the demonstrated models to achieve a Pearson correlation coefficient of 0.88 on low-nasality speakers, the population of most interest for this sort of technique. These results are comparable to milestone methods in this domain.

Contributors: Saxon, Michael Stephen (Author) / Berisha, Visar (Thesis director) / McDaniel, Troy (Committee member) / Electrical Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Relationship between formant variability and auditory-motor adaptation

Description

Previous studies have shown that experimentally implemented formant perturbations result in production of compensatory responses in the opposite direction of the perturbations. In this study, we investigated how participants adapt to a) auditory perturbations that shift formants to a specific point in the vowel space and hence remove variability of formants (focused perturbations), and b) auditory perturbations that preserve the natural variability of formants (uniform perturbations). We examined whether the degree of adaptation to focused perturbations was different from adaptation to uniform perturbations. We found that adaptation magnitude of the first formant (F1) was smaller in response to focused perturbations. However, F1 adaptation initially moved in the same direction as the perturbation, and after several trials the F1 adaptation changed its course toward the opposite direction of the perturbation. We also found that adaptation of the second formant (F2) was smaller in response to focused perturbations than F2 responses to uniform perturbations. Overall, these results suggest that formant variability is an important component of speech, and that our central nervous system takes into account such variability to produce more accurate speech output.

Contributors: Dittman, Jonathan William (Author) / Daliri, Ayoub (Thesis director) / Berisha, Visar (Committee member) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Diversity Promoting Online Sampling for Streaming Video Summarization

Description

Video summarization is gaining popularity in the technological culture, where positioning the mouse pointer on top of a video results in a quick overview of what the video is about. The algorithm usually selects frames in a time sequence through systematic sampling. There are also other applications, such as video surveillance, web-based video surfing, and video archival, which can benefit from efficient and concise video summaries. In this project, we explored several clustering algorithms and how these can be combined and deconstructed to make the summarization algorithm more efficient and relevant. We focused on two metrics to summarize: reducing error and reducing redundancy in the summary. To reduce the error, an online k-means clustering algorithm was used; to reduce redundancy, we applied two different methods: the volume of convex hulls and the true diversity measure that is usually used in biological disciplines. The algorithm was efficient and computationally cost-effective due to its online nature. Diversity maximization (or redundancy reduction) using the volume of convex hulls showed better results than other conventional methods on 50 different videos. For the true diversity measure, there has not been much work done on the nature of the measure in the context of video summarization. When we applied it, the algorithm stalled because the true diversity saturated, owing to the inherent initialization present in the algorithm. We explored the nature of this measure to gain a better understanding of how it can help to make summarization more intuitive and give the user a handle to customize the summary.
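
For readers unfamiliar with the online k-means step mentioned above, the following is a minimal sketch of the standard sequential update, not the thesis's implementation. It assumes each frame has already been reduced to a fixed-length feature vector, and it seeds the centers with the first k items of the stream; the function name `online_kmeans` and the seeding rule are illustrative assumptions.

```python
import math


def online_kmeans(stream, k):
    """Sequential (online) k-means over an iterable of equal-length vectors.

    Each vector is seen once and then discarded, so memory use is
    proportional to k rather than to the length of the stream.
    """
    centroids = []  # current cluster centers
    counts = []     # number of points assigned to each center so far
    for x in stream:
        x = list(x)
        if len(centroids) < k:
            # Assumption: seed the centers with the first k points.
            centroids.append(x)
            counts.append(1)
            continue
        # Assign the point to its nearest center (Euclidean distance).
        j = min(range(k), key=lambda i: math.dist(centroids[i], x))
        counts[j] += 1
        # Move that center toward the point by a 1/count step, which keeps
        # the center equal to the running mean of its assigned points.
        rate = 1.0 / counts[j]
        centroids[j] = [c + rate * (xi - c) for c, xi in zip(centroids[j], x)]
    return centroids, counts
```

In a summarization setting, one plausible use is to keep the frame closest to each final center as a summary frame; that selection rule is likewise an assumption here rather than something stated in the abstract.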

Contributors: Masroor, Ahnaf (Co-author) / Anirudh, Rushil (Co-author) / Turaga, Pavan (Thesis director) / Spanias, Andreas (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2017-05

A Non-Parametric Semi-Supervised f-Divergence

Description

Divergence functions are both highly useful and fundamental to many areas in information theory and machine learning, but require either parametric approaches or prior knowledge of labels on the full data set. This paper presents a method to estimate the divergence between two data sets in the absence of fully labeled data. This semi-labeled case is common in many domains where labeling data by hand is expensive or time-consuming, or wherever large data sets are present. The theory derived in this paper is demonstrated on a simulated example, and then applied to a feature selection and classification problem from pathological speech analysis.

Contributors: Gilton, Davis Leland (Author) / Berisha, Visar (Thesis director) / Cochran, Douglas (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05

Improved Finite Sample Estimate of A Nonparametric Divergence Measure

Description

This work details the bootstrap estimation of a nonparametric information divergence measure, the Dp divergence measure, using a power law model. To address the challenge posed by computing accurate divergence estimates given finite-size data, the bootstrap approach is used in conjunction with a power law curve to calculate an asymptotic value of the divergence estimator. Monte Carlo estimates of Dp are found for increasing values of sample size, and a power law fit is used to model the divergence estimates as a function of sample size. The fit is also used to generate a confidence interval that characterizes the quality of the estimate. We compare the performance of this method with other estimation methods. The calculated divergence is applied to the binary classification problem. Using the inherent relation between divergence measures and classification error rate, an analysis of the Bayes error rate of several data sets is conducted using the asymptotic divergence estimate.
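
One way to read the extrapolation step, offered only as a sketch: suppose the bootstrap estimate at sample size N follows a three-parameter power law. The functional form and the symbols a, b, and D_∞ below are assumptions for illustration and are not taken from the abstract.

```latex
\hat{D}_p(N) \approx D_\infty + a\,N^{-b}, \qquad b > 0,
\qquad\text{so that}\qquad
\lim_{N \to \infty} \hat{D}_p(N) = D_\infty .
```

Under this assumed form, fitting the curve to estimates computed at several sample sizes would yield D_∞ as the asymptotic value of the divergence estimator.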

Contributors: Kadambi, Pradyumna Sanjay (Author) / Berisha, Visar (Thesis director) / Bliss, Daniel (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05
