Edge Detection from Spectral Phase Data

Description

The detection and characterization of transients in signals is important in many wide-ranging applications from computer vision to audio processing. Edge detection on images is typically realized using small, local, discrete convolution kernels, but this is not possible when samples are measured directly in the frequency domain. The concentration factor edge detection method was therefore developed to realize an edge detector directly from spectral data. This thesis explores the possibilities of detecting edges from the phase of the spectral data, that is, without the magnitude of the sampled spectral data. Prior work has demonstrated that the spectral phase contains particularly important information about underlying features in a signal. Furthermore, the concentration factor method yields some insight into the detection of edges in spectral phase data. An iterative design approach was taken to realize an edge detector using only the spectral phase data, also allowing for the design of an edge detector when phase data are intermittent or corrupted. Problem formulations showing the power of the design approach are given throughout. A post-processing scheme relying on the difference of multiple edge approximations yields a strong edge detector which is shown to be resilient under noisy, intermittent phase data. Lastly, a thresholding technique is applied to give an explicit enhanced edge detector ready to be used. Examples are demonstrated throughout on both signals and images.
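For readers unfamiliar with the concentration factor method mentioned in this abstract, the following is a minimal sketch of its standard form, which uses the full Fourier coefficients (magnitude and phase). It is not the phase-only detector developed in the thesis. The test signal, the helper names `step_coeffs` and `jump_approx`, and the choice of the first-order polynomial factor sigma(eta) = pi * eta are all assumptions made for illustration.

```python
import cmath
import math


def step_coeffs(N):
    """Fourier coefficients f_hat[k], k = -N..N, of the assumed test signal
    f(x) = 1 for |x| < 1 and 0 elsewhere on [-pi, pi)."""
    return {
        k: (1.0 / math.pi if k == 0 else math.sin(k) / (math.pi * k))
        for k in range(-N, N + 1)
    }


def jump_approx(f_hat, N, x):
    """Evaluate i * sum_k sgn(k) * sigma(|k|/N) * f_hat[k] * exp(i k x)
    with the first-order polynomial concentration factor sigma(eta) = pi * eta."""
    total = 0j
    for k in range(-N, N + 1):
        if k == 0:
            continue  # sgn(0) = 0, so the mean term contributes nothing
        sigma = math.pi * abs(k) / N
        sign = 1 if k > 0 else -1
        total += sign * sigma * f_hat[k] * cmath.exp(1j * k * x)
    return (1j * total).real


N = 64
f_hat = step_coeffs(N)
for x in (-1.0, 0.0, 1.0, 2.5):
    print(f"x = {x:+.1f}  jump approx = {jump_approx(f_hat, N, x):+.3f}")
```

For this test signal the sum should concentrate near +1 at x = -1 (upward jump), near -1 at x = 1 (downward jump), and near 0 away from the edges.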

Contributors: Reynolds, Alexander Bryce (Author) / Gelb, Anne (Thesis director) / Cochran, Douglas (Committee member) / Viswanathan, Adityavikram (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05

Downsampling for Efficient Parameter Choice in Ill-Posed Deconvolution Problems

Description

Deconvolution of noisy data is an ill-posed problem, and requires some form of regularization to stabilize its solution. Tikhonov regularization is the most common method used, but it depends on the choice of a regularization parameter λ which must generally be estimated using one of several common methods. These methods can be computationally intensive, so I consider their behavior when only a portion of the sampled data is used. I show that the results of these methods converge as the sampling resolution increases, and use this to suggest a method of downsampling to estimate λ. I then present numerical results showing that this method can be feasible, and propose future avenues of inquiry.
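As background for this abstract, here is a minimal sketch of Tikhonov-regularized deconvolution for a periodic (circular) convolution, solved in the Fourier domain. It does not implement the downsampling-based parameter estimation proposed in the thesis; the value of λ is simply supplied by the caller. The kernel, the test signal, the penalty form λ²‖x‖², and all function names are assumptions chosen for illustration.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (sufficient for a small demo)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[j] * cmath.exp(sign * 2j * math.pi * j * k / n) for j in range(n))
        out.append(s / n if inverse else s)
    return out


def circular_convolve(h, x):
    """Periodic convolution y[i] = sum_j h[j] * x[(i - j) mod n]."""
    n = len(x)
    return [sum(h[j] * x[(i - j) % n] for j in range(n)) for i in range(n)]


def tikhonov_deconvolve(y, h, lam):
    """Minimize ||h * x - y||^2 + lam^2 ||x||^2 for circular convolution.
    In the Fourier domain: X[k] = conj(H[k]) * Y[k] / (|H[k]|^2 + lam^2)."""
    Y, H = dft(y), dft(h)
    X = [Hk.conjugate() * Yk / (abs(Hk) ** 2 + lam ** 2) for Hk, Yk in zip(H, Y)]
    return [v.real for v in dft(X, inverse=True)]


n = 32
x_true = [1.0 if 10 <= i < 20 else 0.0 for i in range(n)]
h = [0.6, 0.2] + [0.0] * (n - 3) + [0.2]  # assumed symmetric smoothing kernel
y = circular_convolve(h, x_true)

x_exact = tikhonov_deconvolve(y, h, 0.0)   # no regularization, noise-free data
x_smooth = tikhonov_deconvolve(y, h, 0.5)  # heavier regularization


def norm(v):
    return math.sqrt(sum(a * a for a in v))


print(max(abs(a - b) for a, b in zip(x_exact, x_true)))
print(norm(x_exact), norm(x_smooth))
```

With noise-free data and an invertible kernel, λ = 0 recovers the signal up to rounding error, while a larger λ shrinks the solution norm. With noisy data, λ trades data fidelity against stability, which is why its choice matters.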

Contributors: Hansen, Jakob Kristian (Author) / Renaut, Rosemary (Thesis director) / Cochran, Douglas (Committee member) / Barrett, The Honors College (Contributor) / School of Music (Contributor) / Economics Program in CLAS (Contributor) / School of Mathematical and Statistical Sciences (Contributor)

Created: 2015-05

Visual Surround Sound and its Applications

Description

The world of a hearing impaired person is much different than that of somebody capable of discerning different frequencies and magnitudes of sound waves via their ears. This is especially true when hearing impaired people play video games. In most video games, surround sound is fed through some sort of digital output to headphones or speakers. Based on this information, the gamer can discern where a particular stimulus is coming from and whether or not that is a threat to their wellbeing within the virtual world. People with reliable hearing have a distinct advantage over hearing impaired people in that they can gather information not just from what is in front of them, but from every angle relative to the way they're facing. The purpose of this project was to find a way to even the playing field, so that a person hard of hearing could also receive the sensory feedback that any other person would get while playing video games. To do this, visual surround sound was created. This is a system that takes a surround sound input and illuminates LEDs around the periphery of glasses based on the direction, frequency, and amplitude of the audio wave. This provides the user with crucial information on the whereabouts of different elements within the game. In this paper, the research and development of Visual Surround Sound is discussed, along with its viability with regard to a deaf person's ability to learn the technology and decipher the visual cues.

Contributors: Kadi, Danyal (Co-author) / Burrell, Nathaneal (Co-author) / Butler, Kristi (Co-author) / Wright, Gavin (Co-author) / Kosut, Oliver (Thesis director) / Bliss, Daniel (Committee member) / Barrett, The Honors College (Contributor) / Electrical Engineering Program (Contributor)

Created: 2015-05

An Algorithm for the Automatic Detection of Vocal Flutter

Description

Detecting early signs of neurodegeneration is vital for measuring the efficacy of pharmaceuticals and planning treatments for neurological diseases. This is especially true for Amyotrophic Lateral Sclerosis (ALS), where differences in symptom onset can be indicative of the prognosis. Because they can be measured noninvasively, changes in speech production have been proposed as a promising indicator of neurological decline. However, speech changes are typically measured subjectively by a clinician. These perceptual ratings can vary widely between clinicians and within the same clinician on different patient visits, making clinical ratings less sensitive to subtle early indicators. In this paper, we propose an algorithm for the objective measurement of flutter, a quasi-sinusoidal modulation of fundamental frequency that manifests in the speech of some ALS patients. The algorithm detailed in this paper employs long-term average spectral analysis on the residual F0 track of a sustained phonation to detect the presence of flutter and is robust to longitudinal drifts in F0. The algorithm is evaluated on a longitudinal speech dataset of ALS patients at varying stages in their prognosis. Benchmarking with two stages of perceptual ratings provided by an expert speech pathologist indicates that the algorithm follows perceptual ratings with moderate accuracy and can objectively detect flutter in instances where the variability of the perceptual rating causes uncertainty.
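The general idea of looking for a quasi-sinusoidal modulation in a drift-corrected F0 track can be sketched as follows. This is a simplified illustration, not the algorithm from the thesis: it removes a linear trend and picks the largest single-frame spectral peak, whereas the thesis uses long-term average spectral analysis. The synthetic F0 track, the 10 Hz modulation rate, the sampling rate, and the function names are assumptions.

```python
import cmath
import math


def detrend_linear(y):
    """Subtract the least-squares straight line from a sequence (removes slow drift)."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    num = sum((i - t_mean) * (v - y_mean) for i, v in enumerate(y))
    den = sum((i - t_mean) ** 2 for i in range(n))
    slope = num / den
    return [v - (y_mean + slope * (i - t_mean)) for i, v in enumerate(y)]


def dominant_modulation_hz(f0_track, fs):
    """Return the frequency (Hz) of the largest DFT magnitude of the residual F0 track."""
    residual = detrend_linear(f0_track)
    n = len(residual)
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2 + 1):  # skip DC, scan up to the Nyquist bin
        coeff = sum(r * cmath.exp(-2j * math.pi * k * i / n) for i, r in enumerate(residual))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * fs / n


# Synthetic sustained phonation: 120 Hz mean F0, slow upward drift,
# plus an assumed 10 Hz modulation of +/- 2 Hz, sampled at 100 Hz for 2 s.
fs = 100.0
f0 = [120.0 + 0.5 * (i / fs) + 2.0 * math.sin(2 * math.pi * 10.0 * i / fs) for i in range(200)]
peak_hz = dominant_modulation_hz(f0, fs)
print(peak_hz)
```

A practical detector would also need a decision rule, such as comparing the peak energy against the surrounding spectrum, which this sketch omits.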

Contributors: Peplinski, Jacob Scott (Author) / Berisha, Visar (Thesis director) / Liss, Julie (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Accurate Articulation of /r/: Relationships between Signal Processing Analysis of Speech and Ultrasound Images of the Tongue

Description

Research on /r/ production previously used formant analysis as the primary acoustic analysis, with particular focus on the low third formant in the speech signal. Prior imaging of speech used X-Ray, MRI, and electromagnetic midsagittal articulometer systems. More recently, the signal processing technique of Mel-log spectral plots has been used to study /r/ production in children and female adults. Ultrasound imaging of the tongue has also been used during speech production in both clinical and research settings. The current study attempts to describe /r/ production in three different allophonic contexts: vocalic, prevocalic, and postvocalic positions. Ultrasound analysis, formant analysis, Mel-log spectral plots, and /r/ duration were measured for /r/ production in 29 adult speakers (10 male, 19 female). A possible relationship between these variables was also explored. Results showed that the amount of superior constriction in the postvocalic /r/ allophone was significantly lower than in the other /r/ allophones. Formant two was significantly lower and the distance between formants two and three was significantly higher for the prevocalic /r/ allophone. Vocalic /r/ had the longest average duration, while prevocalic /r/ had the shortest duration. Signal processing results revealed candidate Mel-bin values for accurate /r/ production for each allophone of /r/. The results indicate that allophones of /r/ can be distinguished based on the different analyses. However, relationships between these analyses are still unclear. Future research is needed to gather more data on /r/ acoustics and articulation in order to find possible relationships between the analyses for /r/ production.

Contributors: Hirsch, Megan Elizabeth (Author) / Weinhold, Juliet (Thesis director) / Gardner, Joshua (Committee member) / Department of Speech and Hearing Science (Contributor) / Department of Psychology (Contributor) / Barrett, The Honors College (Contributor)

Created: 2017-05

Somatosensory Modulation during Speech Planning

Description

Previous studies have found that the detection of near-threshold stimuli is decreased immediately before movement and throughout movement production. This has been suggested to occur through the use of the internal forward model processing an efferent copy of the motor command and creating a prediction that is used to cancel out the resulting sensory feedback. Currently, there are no published accounts of the perception of tactile signals for motor tasks and contexts related to the lips during both speech planning and production. In this study, we measured the responsiveness of the somatosensory system during speech planning using light electrical stimulation below the lower lip by comparing perception during mixed speaking and silent reading conditions. Participants were asked to judge whether a constant near-threshold electrical stimulation (subject-specific intensity, 85% detected at rest) was present during different time points relative to an initial visual cue. In the speaking condition, participants overtly produced target words shown on a computer monitor. In the reading condition, participants read the same target words silently to themselves without any movement or sound. We found that detection of the stimulus was attenuated during speaking conditions while remaining at a constant level close to the perceptual threshold throughout the silent reading condition. Perceptual modulation was most intense during speech production and showed some attenuation just prior to speech production during the planning period of speech. This demonstrates that there is a significant decrease in the responsiveness of the somatosensory system during speech production, as well as in the milliseconds before speech is even produced. This has implications for disorders with pronounced deficits in the somatosensory system, such as stuttering and schizophrenia.

Contributors: Mcguffin, Brianna Jean (Author) / Daliri, Ayoub (Thesis director) / Liss, Julie (Committee member) / Department of Psychology (Contributor) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-05

Startle-evoked movement in multi-jointed, two-dimensional reaching tasks

Description

Previous research has shown that a loud acoustic stimulus can trigger an individual's prepared movement plan. This movement response is referred to as a startle-evoked movement (SEM). SEM has been observed in the stroke survivor population, where results have shown that SEM enhances single joint movements that are usually performed with difficulty. While the presence of SEM in the stroke survivor population advances scientific understanding of movement capabilities following a stroke, published studies using the SEM phenomenon only examined one joint. The ability of SEM to generate multi-jointed movements is understudied, which limits SEM as a potential therapy tool. In order to apply SEM as a therapy tool, however, the biomechanics of the arm in multi-jointed movement planning and execution must be better understood. Thus, the objective of our study was to evaluate whether SEM could elicit multi-joint reaching movements that were accurate in an unrestrained, two-dimensional workspace. Data were collected from ten subjects with no previous neck, arm, or brain injury. Each subject performed a reaching task to five Targets that were equally spaced in a semi-circle to create a two-dimensional workspace. The subject reached to each Target following a sequence of two non-startling acoustic stimulus cues: "Get Ready" and "Go". A loud acoustic stimulus was randomly substituted for the "Go" cue. We hypothesized that SEM is accessible and accurate for unrestricted multi-jointed reaching tasks in a functional workspace and is therefore independent of movement direction. Our results showed that SEM is possible in all five Target directions. The probability of evoking SEM and the movement kinematics (i.e. total movement time, linear deviation, average velocity) to each Target are not statistically different. Thus, we conclude that SEM is possible in a functional workspace and is not dependent on where arm stability is maximized. Moreover, coordinated preparation and storage of a multi-jointed movement is indeed possible.

Contributors: Ossanna, Meilin Ryan (Author) / Honeycutt, Claire (Thesis director) / Schaefer, Sydney (Committee member) / Harrington Bioengineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-12

Designing concentration factors to detect jump discontinuities from non-uniform Fourier data

Description

Edge detection plays a significant role in signal processing and image reconstruction applications where it is used to identify important features in the underlying signal or image. In some of these applications, such as magnetic resonance imaging (MRI), data are sampled in the Fourier domain. When the data are sampled uniformly, a variety of algorithms can be used to efficiently extract the edges of the underlying images. However, in cases where the data are sampled non-uniformly, such as in non-Cartesian MRI, standard inverse Fourier transformation techniques are no longer suitable. Methods exist for handling these types of sampling patterns, but are often ill-equipped for cases where data are highly non-uniform. This thesis further develops an existing approach to discontinuity detection, the use of concentration factors. Previous research shows that the concentration factor technique can successfully determine jump discontinuities in non-uniform data. However, as the sampling distribution diverges further from uniformity, the efficacy of the identification decreases. This thesis proposes a method for reverse-engineering concentration factors specifically tailored to non-uniform data by employing the finite Fourier frame approximation. Numerical results indicate that this design method produces concentration factors which can more precisely identify jump locations than those previously developed.

Contributors: Moore, Rachael (Author) / Gelb, Anne (Thesis director) / Davis, Jacueline (Committee member) / Barrett, The Honors College (Contributor)

Created: 2015-05

Cost-Effective Proximity Object Sensing

Description

The increasing presence and affordability of sensors provides the opportunity to make novel and creative designs for underserved markets like the legally blind. Here we explore how mathematical methods and device coordination can be utilized to improve the functionality of inexpensive proximity sensing electronics in order to create designs that are versatile, durable, low cost, and simple. Devices utilizing various acoustic and electromagnetic wave frequencies, like ultrasonic rangefinders, radars, Lidar rangefinders, webcams, and infrared rangefinders, and the concepts of Sensor Fusion, Frequency Modulated Continuous Wave (FMCW) radar, and Phased Arrays were explored. The effects of various factors on the propagation of different wave signals were also investigated. The devices selected to be incorporated into designs were the HB100 DRO Radar Doppler Sensor (as an FMCW radar), HC-SR04 Ultrasonic Sensor, and Maxbotix Ultrasonic Rangefinder - EZ3. Three designs were ultimately developed and dubbed the "Rad-Son Fusion", the "Tri-Beam Scanner", and the "Dual-Receiver Ranger". The "Rad-Son Fusion" employs the Sensor Fusion of an FMCW radar and Ultrasonic sensor through a weighted average of the distance readings from the two sensors. The "Tri-Beam Scanner" utilizes a beam-forming Digital Phased Array of ultrasonic sensors to scan its surroundings. The "Dual-Receiver Ranger" uses the convolved result from two modified HC-SR04 sensors to determine the time of flight and ultimately an object's distance. After conducting hardware experiments to determine the feasibility of each design, the "Dual-Receiver Ranger" was prototyped and tested to demonstrate the potential of the concept. The designs were later compared based on proposed requirements, and possible improvements and challenges associated with the designs are discussed.
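Two of the calculations named in this abstract, converting a time of flight to a distance and fusing two distance readings with a weighted average, can be sketched in a few lines. The abstract does not give the weights, the speed of sound used, or whether the measured time covers a round trip, so the values and function names below are assumptions for illustration only.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air near 20 degrees Celsius


def tof_to_distance(round_trip_s, speed=SPEED_OF_SOUND_M_S):
    """Distance to an object from an echo's round-trip time of flight.
    Assumes the pulse travels to the object and back, hence the division by 2."""
    return speed * round_trip_s / 2.0


def fuse(radar_m, ultrasonic_m, radar_weight):
    """Weighted average of two distance readings; radar_weight must lie in [0, 1]."""
    if not 0.0 <= radar_weight <= 1.0:
        raise ValueError("radar_weight must be between 0 and 1")
    return radar_weight * radar_m + (1.0 - radar_weight) * ultrasonic_m


print(tof_to_distance(0.01))  # 10 ms round trip
print(fuse(1.0, 2.0, 0.25))   # trust the ultrasonic reading three times as much
```

One common way to pick the weight is in proportion to each sensor's inverse measurement variance, so the less noisy sensor dominates. Whether the thesis does so is not stated in the abstract.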

Contributors: Feinglass, Joshua Forster (Author) / Goryll, Michael (Thesis director) / Reisslein, Martin (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05

Predicting /r/ Acquisition: A Longitudinal Analysis Using Signal Processing

Description

The purpose of this longitudinal study was to predict /r/ acquisition using acoustic signal processing. Nineteen children, aged 5-7 with inaccurate /r/, were followed until they turned 8 or acquired /r/, whichever came first. Acoustic and descriptive data from 14 participants were analyzed. The remaining 5 children continued to be followed. The study analyzed differences in spectral energy in the baseline acoustic signals of participants who eventually acquired /r/ compared to those who did not acquire /r/. Results indicated significant differences between groups in the baseline signals for vocalic and postvocalic /r/, suggesting that the acquisition of certain allophones may be predictable. Participants’ articulatory changes made during the progression of acquisition were also analyzed spectrally. A retrospective analysis described the pattern in which /r/ allophones were acquired, proposing that vocalic /r/ and the postvocalic variant of consonantal /r/ may be acquired prior to prevocalic /r/, and /r/ followed by low vowels may be acquired before /r/ followed by high vowels, although individual variations exist.

Contributors: Conger, Sarah Grace (Author) / Weinhold, Juliet (Thesis director) / Daliri, Ayoub (Committee member) / Bruce, Laurel (Committee member) / College of Health Solutions (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05
