Audio processing and loudness estimation algorithms with iOS simulations

Description

The processing power and storage capacity of portable devices have improved considerably over the past decade. This has motivated the implementation of sophisticated audio and other signal processing algorithms on such mobile devices. Of particular interest in this thesis is audio/speech processing based on perceptual criteria. Specifically, estimation of parameters from human auditory models, such as auditory patterns and loudness, involves computationally intensive operations which can strain device resources. Hence, strategies for implementing computationally efficient human auditory models for loudness estimation have been studied in this thesis. Existing algorithms for reducing computations in auditory pattern and loudness estimation have been examined, and improved algorithms have been proposed to overcome the limitations of these methods. In addition, real-time applications such as perceptual loudness estimation and loudness equalization using auditory models have been implemented. A software implementation of loudness estimation on iOS devices is also reported. Beyond the loudness estimation algorithms and software, this thesis project also created new illustrations of speech and audio processing concepts for research and education. As a result, a new suite of speech/audio DSP functions was developed and integrated into the award-winning educational iOS app 'iJDSP'. These functions are described in detail in this thesis. Several enhancements to the architecture of the application have also been introduced to provide the supporting framework for speech/audio processing. Frame-by-frame processing and visualization functionalities have been developed to facilitate speech/audio processing. In addition, facilities for easy sound recording, processing and audio rendering have been developed to provide students, practitioners and researchers with an enriched DSP simulation tool. Simulations and assessments have also been developed for use in classes and in the training of practitioners and students.
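The frame-by-frame processing mentioned in the description can be sketched as follows. This is only a minimal illustration and not the thesis's auditory-model loudness estimator: it uses the per-frame RMS level in dB as a crude stand-in for loudness. The function names (`frame_signal`, `frame_level_db`, `levels_db`), the frame length, and the hop size are all assumptions made for this example.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sequence into overlapping frames.

    A trailing segment too short to fill a whole frame is dropped.
    """
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def frame_level_db(frame, floor=1e-12):
    """RMS level of one frame in dB relative to full scale (amplitude 1.0).

    The floor avoids log10(0) for silent frames (hypothetical choice).
    """
    mean_square = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(mean_square, floor))

def levels_db(samples, frame_len=256, hop=128):
    """Per-frame level track; frame_len and hop are illustrative defaults."""
    return [frame_level_db(f) for f in frame_signal(samples, frame_len, hop)]
```

A perceptual loudness estimator would replace `frame_level_db` with an auditory-model computation per frame; the framing loop itself would stay the same.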

Contributors: Kalyanasundaram, Girish (Author) / Spanias, Andreas S (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2013

Context recognition methods using audio signals for human-machine interaction

Description

Audio signals, such as speech and ambient sounds, convey rich information pertaining to a user’s activity, mood or intent. Enabling machines to understand this contextual information is necessary to bridge the gap in human-machine interaction. This is challenging because of its subjective nature and hence requires sophisticated techniques. This dissertation presents a set of computational methods that generalize well across different conditions, for speech-based applications involving emotion recognition and keyword detection, and for ambient sound-based applications such as lifelogging.

The expression and perception of emotions vary across speakers and cultures; thus, features and classification methods that generalize well to different conditions are strongly desired. A method based on latent topic models is proposed to learn supra-segmental features from low-level acoustic descriptors. The derived features outperform state-of-the-art approaches over multiple databases. Cross-corpus studies are conducted to determine the ability of these features to generalize well across different databases. The proposed method is also applied to derive features from facial expressions; a multi-modal fusion overcomes the deficiencies of a speech-only approach and further improves the recognition performance.

Besides affecting the acoustic properties of speech, emotions have a strong influence over speech articulation kinematics. A learning approach that constrains a classifier trained over acoustic descriptors to also model articulatory data is proposed here. This method requires articulatory information only during the training stage, thus overcoming the challenges inherent to large-scale data collection, while simultaneously exploiting the correlations between articulation kinematics and acoustic descriptors to improve the accuracy of emotion recognition systems.

Identifying context from ambient sounds in a lifelogging scenario requires feature extraction, segmentation and annotation techniques capable of efficiently handling long-duration audio recordings; a complete framework for such applications is presented. The performance is evaluated on real-world data and accompanied by a prototypical Android-based user interface.

The proposed methods are also assessed in terms of computation and implementation complexity. Software and field-programmable gate array-based implementations are considered for emotion recognition, while virtual platforms are used to model the complexities of lifelogging. The derived metrics are used to determine the feasibility of these methods for applications requiring real-time capabilities and low power consumption.

Contributors: Shah, Mohit (Author) / Spanias, Andreas (Thesis advisor) / Chakrabarti, Chaitali (Thesis advisor) / Berisha, Visar (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created: 2015

Multiple detection and tracking in complex time-varying environments

Description

This work considers the problem of multiple detection and tracking in two complex time-varying environments, urban terrain and underwater. Tracking multiple radar targets in urban environments is first investigated by exploiting multipath signal returns; wideband underwater acoustic (UWA) communications channels are estimated using adaptive learning methods; and multiple UWA communications users are detected by designing the transmit signal to match the environment. For the urban environment, a multi-target tracking algorithm is proposed that integrates multipath-to-measurement association and the probability hypothesis density method implemented using particle filtering. The algorithm is designed to track an unknown time-varying number of targets by extracting information from multiple measurements due to multipath returns in the urban terrain. The path likelihood probability is calculated by considering associations between measurements and multipath returns, and an adaptive clustering algorithm is used to estimate the number of targets and their corresponding parameters. The performance of the proposed algorithm is demonstrated for different multiple target scenarios and evaluated using the optimal subpattern assignment metric. The underwater environment provides a very challenging communication channel due to its highly time-varying nature, resulting in large distortions due to multipath and Doppler-scaling, and frequency-dependent path loss. A model-based wideband UWA channel estimation algorithm is first proposed to estimate the channel support and the wideband spreading function coefficients. A nonlinear frequency modulated signaling scheme is proposed that is matched to the wideband characteristics of the underwater environment. Constraints on the signal parameters are derived to optimally reduce multiple access interference and the UWA channel effects. The signaling scheme is compared to a code division multiple access (CDMA) scheme to demonstrate its improved bit error rate performance. The overall multi-user communication system performance is finally analyzed by first estimating the UWA channel and then designing the signaling scheme for multiple communications users.
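The optimal subpattern assignment (OSPA) metric named in the description can be sketched as follows. This is a minimal illustration of the standard definition, not the code used in the dissertation. The function name `ospa`, the default cutoff `c` and order `p`, and the brute-force search over assignments (practical only for a handful of targets) are assumptions made for this example.

```python
import itertools
import math

def ospa(X, Y, c=10.0, p=2):
    """OSPA distance between two finite sets of points (tuples of floats).

    c is the cutoff that caps each localization error and penalizes each
    unmatched point; p is the order. Both defaults are illustrative.
    """
    if len(X) > len(Y):
        X, Y = Y, X                      # ensure X is the smaller set
    m, n = len(X), len(Y)
    if n == 0:
        return 0.0                       # both sets empty
    # Best assignment of the m points in X to distinct points in Y,
    # found by exhaustive search (fine for small sets only).
    best = min(
        sum(min(math.dist(x, Y[j]), c) ** p for x, j in zip(X, perm))
        for perm in itertools.permutations(range(n), m)
    )
    # Each of the n - m unmatched points costs c ** p (cardinality error).
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)
```

For example, `ospa(estimated_positions, true_positions)` gives one number per time step that combines localization error with the penalty for estimating the wrong number of targets. For larger sets, the exhaustive search would normally be replaced by an assignment solver.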

Contributors: Zhou, Meng (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Kovvali, Narayan (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2014

Radar target tracking with varying levels of communications interference for shared spectrum access

Description

As the demand for spectrum sharing between radar and communications systems is steadily increasing, the coexistence between the two systems is a growing and very challenging problem. Radar tracking in the presence of strong communications interference can result in low probability of detection even when sequential Monte Carlo tracking methods such as the particle filter (PF), which better match the target kinematic model, are used. In particular, the tracking performance can fluctuate as the power level of the communications interference can vary dynamically and unpredictably.

This work proposes to integrate the interacting multiple model (IMM) selection approach with the PF tracker to allow for dynamic variations in the power spectral density of the communications interference. The model switching allows for a necessary transition between different communications interference power spectral density (CI-PSD) values in order to reduce prediction errors. Simulations demonstrate the high performance of the integrated approach with as many as six dynamic CI-PSD value changes during the target track. For low signal-to-interference-plus-noise ratios, the derivation for estimating the high power levels of the communications interference is provided; the estimated power levels would be dynamically used in the IMM when integrated with a track-before-detect filter that is better matched to low SINR tracking applications.
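The model switching described above can be illustrated with the standard IMM model-probability update. This is a minimal sketch under stated assumptions, not the author's implementation: each model is taken to correspond to one CI-PSD value, the measurement likelihoods are assumed to be supplied by the tracker (for instance, from the particle filter weights), and the function name `imm_update_model_probs` and its arguments are hypothetical.

```python
def imm_update_model_probs(mu, trans, likelihoods):
    """One step of the IMM model-probability recursion.

    mu[i]          prior probability of model i (here: one CI-PSD value)
    trans[i][j]    probability of switching from model i to model j
    likelihoods[j] likelihood of the current measurement under model j
                   (assumed to come from the tracking filter)
    """
    n = len(mu)
    # Prediction: propagate model probabilities through the Markov chain.
    predicted = [sum(trans[i][j] * mu[i] for i in range(n)) for j in range(n)]
    # Update: weight each predicted probability by its measurement likelihood.
    unnorm = [likelihoods[j] * predicted[j] for j in range(n)]
    total = sum(unnorm)
    if total <= 0.0:
        # No model explains the measurement; keep the prediction (a
        # hypothetical fallback chosen for this sketch).
        return predicted
    return [u / total for u in unnorm]
```

With two models, equal priors, a transition matrix that favors staying in the same model, and a measurement three times more likely under the second model, the update shifts the probabilities from 0.5/0.5 to 0.25/0.75.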

Contributors: Zhou, Jian (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Kovvali, Narayan (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2015

Development of a Game-Based Intervention to Promote HPV Vaccination Among Adolescents: A Qualitative Analysis

Description

Purpose: This qualitative research aimed to create a developmentally and gender-appropriate game-based intervention to promote Human Papillomavirus (HPV) vaccination in adolescents.

Background: HPV is the most common sexually transmitted infection: about 80 million Americans are currently infected, and the number continues to increase, with an estimated 14 million new cases yearly. Certain types of HPV have been significantly associated with cervical, vaginal, and vulvar cancers in women; penile cancers in men; and oropharyngeal and anal cancers in both men and women. Despite HPV vaccination being one of the most effective methods of preventing HPV-associated cancers, vaccination rates remain suboptimal in adolescents. Game-based intervention, a novel medium that is popular with adolescents, has been shown to be effective in promoting health behaviors.

Methods: Sample/Sampling. We used purposeful sampling to recruit eight adolescent-parent dyads (N = 16) which represented both sexes (4 boys, 4 girls) and different racial/ethnic groups (White, Black, Latino, Asian American) in the United States. The inclusion criteria for the dyads were: (1) a child aged 11-14 years and his/her parent, and (2) ability to speak, read, write, and understand English. Procedure. After eligible families consented to their participation, semi-structured interviews (each 60-90 minutes long) were conducted with each adolescent-parent dyad in a quiet and private room. Each dyad received $50 to acknowledge their time and effort. Measure. The interview questions consisted of two parts: (a) those related to game design, functioning, and feasibility of implementation; (b) those related to theoretical constructs of the Health Belief Model (HBM) and the Theory of Planned Behavior (TPB). Data analysis. The interviews were audio-recorded with permission and manually transcribed into textual data. Two researchers confirmed the verbatim transcription. We used pre-developed codes to identify each participant’s responses, organize the data, and develop themes based on the HBM and TPB constructs. After the analysis was completed, three researchers in the team reviewed the results and discussed the discrepancies until a consensus was reached.

Results: The findings suggested that the most common motivating factors for adolescents’ HPV vaccination were its effectiveness, benefits, convenience, affordable cost, reminders via text, and recommendation by a health care provider. Regarding the content of the HPV game, participants suggested including information about who should receive the vaccine and when, what HPV and the vaccination are, what the consequences of infection are, the side effects of the vaccine, and where to receive the vaccine. The preferred game design elements were: a length of 15 minutes, stories about fighting or action, an option to choose characters/avatars, motivating factors (i.e., rewards such as allowing users to advance levels and receive coins when correctly answering questions), and use of a portable electronic device (e.g., tablet) to deliver the education. Participants were open to a multiplayer function that would facilitate conversation about HPV and the HPV vaccine. Overall, the participants expressed enthusiasm for an interactive and engaging game-based intervention to learn about the HPV vaccine, with the goal of increasing HPV vaccination in adolescents.

Implications: Tailored educational games have the potential to decrease the stigma of HPV and HPV vaccination, increase communication between the adolescent, parent, and healthcare provider, and increase the overall HPV vaccination rate.

Contributors: Beaman, Abigail Marie (Author) / Chen, Angela Chia-Chen (Thesis director) / Amresh, Ashish (Committee member) / Edson College of Nursing and Health Innovation (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

Prediction at the Tip of Your Fingers: A Machine Learning Approach to Predict Parkinson's Disease and the Effects of Medication

Description

This paper reports research on detecting Parkinson's disease (PD) and the effects of medication using machine learning and finger-tapping data collected through mobile devices. The primary objective of this research is to prototype a PD classification model and a medication classification model that predict, respectively, the individual’s disease status and the medication intake time relative to performing the finger-tapping activity.

Contributors: Gin, Taylor (Author) / McCarthy, Alexandra (Co-author) / Berisha, Visar (Thesis director) / Baumann, Alicia (Committee member) / Barrett, The Honors College (Contributor) / Electrical Engineering Program (Contributor)

Created: 2022-05

Prediction at the Tip of Your Fingers: A Machine Learning Approach to Predict Parkinson's Disease and the Effects of Medication

Description

This paper reports research on detecting Parkinson's disease (PD) and the effects of medication using machine learning and finger-tapping data collected through mobile devices. The primary objective of this research is to prototype a PD classification model and a medication classification model that predict, respectively, the individual’s disease status and the medication intake time relative to performing the finger-tapping activity.

Contributors: McCarthy, Alexandra (Author) / Gin, Taylor (Co-author) / Berisha, Visar (Thesis director) / Baumann, Alicia (Committee member) / Barrett, The Honors College (Contributor) / Electrical Engineering Program (Contributor)

Created: 2022-05
