This collection includes both ASU theses and dissertations, submitted by graduate students, and Barrett, The Honors College theses, submitted by undergraduate students.


Description
Passive radar can be used to reduce the demand for radio frequency spectrum bandwidth. This paper will explain how a MATLAB simulation tool was developed to analyze the feasibility of using passive radar with digitally modulated communication signals. The first stage of the simulation creates a binary phase-shift keying (BPSK) signal, quadrature phase-shift keying (QPSK) signal, or digital terrestrial television (DTTV) signal. A scenario is then created using user defined parameters that simulates reception of the original signal on two different channels, a reference channel and a surveillance channel. The signal on the surveillance channel is delayed and Doppler shifted according to a point target scattering profile. An ambiguity function detector is implemented to identify the time delays and Doppler shifts associated with reflections off of the targets created. The results of an example are included in this report to demonstrate the simulation capabilities.
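To make the detection stage concrete: the ambiguity-function detector computes a delay-Doppler cross-ambiguity surface between the reference and surveillance channels and looks for peaks. The paper's tool is written in MATLAB; the NumPy sketch below is an illustrative reimplementation of that stage, with signal length, noise level, target delay, Doppler shift, and bin spacing all chosen arbitrarily rather than taken from the paper.

```python
import numpy as np

def ambiguity_surface(ref, surv, fs, max_delay, doppler_bins):
    """Delay-Doppler cross-ambiguity of the surveillance channel against
    the reference channel; peaks mark target delays and Doppler shifts."""
    n = len(ref)
    t = np.arange(n) / fs
    surface = np.empty((len(doppler_bins), max_delay))
    for i, fd in enumerate(doppler_bins):
        # Undo the candidate Doppler shift, then correlate with the reference.
        shifted = surv * np.exp(-2j * np.pi * fd * t)
        corr = np.abs(np.correlate(shifted, ref, mode="full"))
        surface[i] = corr[n - 1 : n - 1 + max_delay]  # non-negative lags only
    return surface

# Toy scenario: a BPSK reference; the surveillance channel holds a copy
# delayed by 25 samples and Doppler-shifted by 500 Hz, plus noise.
rng = np.random.default_rng(0)
fs, n = 1e6, 2048
t = np.arange(n) / fs
ref = (2.0 * rng.integers(0, 2, n) - 1).astype(complex)  # BPSK, 1 sample/symbol
surv = np.roll(ref, 25) * np.exp(2j * np.pi * 500.0 * t)
surv = surv + 0.5 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

dopplers = np.linspace(-2000.0, 2000.0, 17)
amb = ambiguity_surface(ref, surv, fs, 64, dopplers)
i, j = np.unravel_index(np.argmax(amb), amb.shape)
print(f"peak at delay {j} samples, Doppler {dopplers[i]:.0f} Hz")
```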
Contributors: Scarborough, Gillian Donnelly (Author) / Cochran, Douglas (Thesis director) / Berisha, Visar (Committee member) / Wang, Chao (Committee member) / Barrett, The Honors College (Contributor) / Electrical Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created: 2014-05
Description
Through decades of clinical progress, cochlear implants have brought the world of speech and language to thousands of profoundly deaf patients. However, the technology has many possible areas for improvement, including conveying non-linguistic cues, also called indexical properties of speech. The field of sensory substitution, which relays information from one sense through another, offers a potential avenue to further assist those with cochlear implants, in addition to the promise it holds for those without existing aids. A user study with a vibrotactile device was conducted to evaluate the effectiveness of this approach in an auditory gender discrimination task. Additionally, preliminary computational work is included that demonstrates advantages and limitations encountered when expanding the complexity of future implementations.
Contributors: Butts, Austin McRae (Author) / Helms Tillery, Stephen (Thesis advisor) / Berisha, Visar (Committee member) / Buneo, Christopher (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
Everyday speech communication typically takes place face-to-face. Accordingly, the task of perceiving speech is a multisensory phenomenon involving both auditory and visual information. The current investigation examines how visual information influences recognition of dysarthric speech. It also explores whether the influence of visual information is dependent upon age. Forty adults participated in the study, which measured intelligibility (percent words correct) of dysarthric speech in auditory versus audiovisual conditions. Participants were then separated into two groups: older adults (age range 47 to 68) and young adults (age range 19 to 36) to examine the influence of age. Findings revealed that all participants, regardless of age, improved their ability to recognize dysarthric speech when visual speech was added to the auditory signal. The magnitude of this benefit, however, was greater for older adults when compared with younger adults. These results inform our understanding of how visual speech information influences understanding of dysarthric speech.
Contributors: Fall, Elizabeth (Author) / Liss, Julie (Thesis advisor) / Berisha, Visar (Committee member) / Gray, Shelley (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
Audio signals, such as speech and ambient sounds, convey rich information pertaining to a user’s activity, mood, or intent. Enabling machines to understand this contextual information is necessary to bridge the gap in human-machine interaction. This is challenging due to its subjective nature and hence requires sophisticated techniques. This dissertation presents a set of computational methods that generalize well across different conditions, for speech-based applications involving emotion recognition and keyword detection, and for ambient-sound-based applications such as lifelogging.

The expression and perception of emotions vary across speakers and cultures; determining features and classification methods that generalize well to different conditions is therefore strongly desired. A method based on latent topic models is proposed to learn supra-segmental features from low-level acoustic descriptors. The derived features outperform state-of-the-art approaches over multiple databases. Cross-corpus studies are conducted to determine the ability of these features to generalize well across different databases. The proposed method is also applied to derive features from facial expressions; a multi-modal fusion overcomes the deficiencies of a speech-only approach and further improves the recognition performance.
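For readers unfamiliar with latent-topic acoustic features, the sketch below illustrates one generic bag-of-acoustic-words construction: frame-level descriptors are vector-quantized into a codebook, each utterance becomes a histogram of "acoustic words," and LDA topic posteriors serve as supra-segmental features. The codebook size, topic count, and choice of descriptors are illustrative assumptions; the dissertation's actual model may differ.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation

def topic_features(utterances, n_words=64, n_topics=8, seed=0):
    """Bag-of-acoustic-words + LDA: map each utterance (an array of
    frame-level descriptors, e.g. MFCCs) to a topic-posterior vector."""
    frames = np.vstack(utterances)
    codebook = KMeans(n_clusters=n_words, random_state=seed).fit(frames)
    counts = np.zeros((len(utterances), n_words))
    for i, utt in enumerate(utterances):
        words, freq = np.unique(codebook.predict(utt), return_counts=True)
        counts[i, words] = freq
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    return lda.fit_transform(counts)  # one supra-segmental vector per utterance

# Demo with random stand-ins for MFCC frames from 10 utterances:
rng = np.random.default_rng(0)
utts = [rng.standard_normal((int(rng.integers(50, 200)), 13)) for _ in range(10)]
print(topic_features(utts).shape)  # (10, 8)
```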

Besides affecting the acoustic properties of speech, emotions have a strong influence over speech articulation kinematics. A learning approach that constrains a classifier trained on acoustic descriptors to also model articulatory data is proposed here. This method requires articulatory information only during the training stage, thus overcoming the challenges inherent to large-scale data collection, while simultaneously exploiting the correlations between articulation kinematics and acoustic descriptors to improve the accuracy of emotion recognition systems.
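One standard way to realize such a training-only constraint, which may or may not match the dissertation's formulation, is a multi-task network: a shared acoustic representation feeds both the emotion classifier and an auxiliary articulatory regression head that is discarded at test time. A minimal PyTorch sketch, with all layer sizes illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmotionNet(nn.Module):
    """Acoustic emotion classifier whose shared layer is also constrained,
    during training only, to predict articulatory trajectories."""
    def __init__(self, n_acoustic, n_artic, n_classes, hidden=64):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(n_acoustic, hidden), nn.ReLU())
        self.emotion = nn.Linear(hidden, n_classes)
        self.artic = nn.Linear(hidden, n_artic)  # discarded at test time

    def forward(self, x):
        h = self.shared(x)
        return self.emotion(h), self.artic(h)

def training_loss(model, x, y_emotion, y_artic, alpha=0.5):
    # Emotion loss plus an articulatory penalty; kinematic data is needed
    # only while training. At test time only `logits` is used.
    logits, artic_hat = model(x)
    return F.cross_entropy(logits, y_emotion) + alpha * F.mse_loss(artic_hat, y_artic)
```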

Identifying context from ambient sounds in a lifelogging scenario requires feature extraction, segmentation, and annotation techniques capable of efficiently handling long-duration audio recordings; a complete framework for such applications is presented. The performance is evaluated on real-world data and accompanied by a prototypical Android-based user interface.
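As a minimal illustration of the segmentation step in such a pipeline (the dissertation's actual framework is not detailed in the abstract), the sketch below flags candidate boundaries wherever adjacent windows of frame-level features diverge; the window size and threshold are illustrative:

```python
import numpy as np

def segment_boundaries(features, win=100, thresh=2.0):
    """Scan a long recording's frame-level features (n_frames x d) and flag
    a boundary wherever adjacent windows differ markedly in their means."""
    bounds = []
    for t in range(win, len(features) - win, win // 2):
        left = features[t - win:t].mean(axis=0)
        right = features[t:t + win].mean(axis=0)
        scale = features[t - win:t + win].std(axis=0) + 1e-8
        dist = np.linalg.norm((left - right) / scale) / np.sqrt(features.shape[1])
        if dist > thresh:
            bounds.append(t)
    return bounds
```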

The proposed methods are also assessed in terms of computation and implementation complexity. Software and field programmable gate array based implementations are considered for emotion recognition, while virtual platforms are used to model the complexities of lifelogging. The derived metrics are used to determine the feasibility of these methods for applications requiring real-time capabilities and low power consumption.
Contributors: Shah, Mohit (Author) / Spanias, Andreas (Thesis advisor) / Chakrabarti, Chaitali (Thesis advisor) / Berisha, Visar (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
The activation of the primary motor cortex (M1) is common in speech perception tasks that involve difficult listening conditions. Although the challenge of recognizing and discriminating non-native speech sounds appears to be an instantiation of listening under difficult circumstances, it is still unknown whether M1 recruitment facilitates second-language speech perception. The purpose of this study was to investigate the role of M1 associated with speech motor centers in processing acoustic inputs in the native (L1) and second language (L2), using repetitive transcranial magnetic stimulation (rTMS) to selectively alter neural activity in M1. Thirty-six healthy English/Spanish bilingual subjects participated in the experiment. Performance on a listening word-to-picture matching task was measured before and after real- and sham-rTMS to the M1 representation of the orbicularis oris (lip) muscle. Vowel space area (VSA), obtained from recordings of participants reading a passage in L2 before and after real-rTMS, was calculated to determine its utility as an rTMS aftereffect measure. There was high variability in the aftereffect of the rTMS protocol to the lip muscle among the participants. Approximately 50% of participants showed an inhibitory effect of rTMS, evidenced by smaller motor evoked potential (MEP) areas, whereas the other 50% had a facilitatory effect, with larger MEPs. This suggests that rTMS has a complex influence on M1 excitability, and relying on grand-average results can obscure important individual differences in rTMS physiological and functional outcomes. Evidence of motor support to word recognition in the L2 was found. Participants showing an inhibitory aftereffect of rTMS on M1 produced slower and less accurate responses in the L2 task, whereas those showing a facilitatory aftereffect produced more accurate responses in L2. In contrast, no effect of rTMS was found on the L1, where accuracy and speed were very similar after sham- and real-rTMS. The L2 VSA measure was indicative of the aftereffect of rTMS to M1 associated with speech production, supporting its utility as an rTMS aftereffect measure. This result reveals an interesting and novel relation between cerebral motor cortex activation and speech measures.
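The abstract does not specify how VSA was computed, but vowel space area is conventionally the area of the polygon spanned by corner-vowel formants in (F1, F2) space, computed with the shoelace formula. The sketch below uses the triangular /i, a, u/ variant with hypothetical formant values:

```python
import numpy as np

def vowel_space_area(formants):
    """Shoelace area of the polygon traced by corner vowels in (F1, F2) space.
    `formants` maps vowel -> (F1_Hz, F2_Hz), listed in perimeter order."""
    pts = np.array(list(formants.values()), dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# Hypothetical mean corner-vowel formants (Hz) from one recording session;
# compare the value before vs. after rTMS to quantify the aftereffect.
vsa = vowel_space_area({'i': (280, 2250), 'a': (750, 1200), 'u': (320, 900)})
print(f"triangular VSA: {vsa:.0f} Hz^2")
```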
Contributors: Barragan, Beatriz (Author) / Liss, Julie (Thesis advisor) / Berisha, Visar (Committee member) / Rogalsky, Corianne (Committee member) / Restrepo, Adelaida (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Eigenvalues of the Gram matrix formed from received data frequently appear in sufficient detection statistics for multi-channel detection with Generalized Likelihood Ratio (GLRT) and Bayesian tests. In a frequently presented model for passive radar, in which the null hypothesis is that the channels are independent and contain only complex white Gaussian noise and the alternative hypothesis is that the channels contain a common rank-one signal in the mean, the GLRT statistic is the largest eigenvalue $\lambda_1$ of the Gram matrix formed from data. This Gram matrix has a Wishart distribution. Although exact expressions for the distribution of $\lambda_1$ are known under both hypotheses, numerically calculating values of these distribution functions presents difficulties in cases where the dimension of the data vectors is large. This dissertation presents tractable methods for computing the distribution of $\lambda_1$ under both the null and alternative hypotheses through a technique of expanding known expressions for the distribution of $\lambda_1$ as inner products of orthogonal polynomials. These newly presented expressions for the distribution allow for computation of detection thresholds and receiver operating characteristic curves to arbitrary precision in floating point arithmetic. This represents a significant advancement over the state of the art in a problem that could previously only be addressed by Monte Carlo methods.
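For context, the Monte Carlo approach that these closed-form expansions supersede can be sketched directly: draw data under the null hypothesis, form the Gram matrix, and tabulate its largest eigenvalue to set a detection threshold. The channel count, sample size, and trial count below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, trials = 4, 128, 10000  # channels, samples per channel, Monte Carlo draws

def lambda1(X):
    """GLRT statistic: largest eigenvalue of the Gram matrix X X^H."""
    return np.linalg.eigvalsh(X @ X.conj().T)[-1]

# Null hypothesis: independent complex white Gaussian noise in each channel.
draws = np.array([
    lambda1((rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
            / np.sqrt(2.0))
    for _ in range(trials)
])
threshold = np.quantile(draws, 0.99)  # threshold for a false-alarm rate of 0.01
print(f"lambda_1 threshold at Pfa = 0.01: {threshold:.2f}")
```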
Contributors: Jones, Scott, Ph.D. (Author) / Cochran, Douglas (Thesis advisor) / Berisha, Visar (Committee member) / Bliss, Daniel (Committee member) / Kosut, Oliver (Committee member) / Richmond, Christ (Committee member) / Arizona State University (Publisher)
Created: 2019