Geometry aware compressive analysis of human activities: application in a smart phone platform

Description

Continuous monitoring of sensor data from smart phones to identify human activities and gestures puts a heavy load on the smart phone's power consumption. In this research study, the non-Euclidean geometry of the rich sensor data obtained from the user's smart phone is utilized to perform compressive analysis and efficient classification of human activities by employing machine learning techniques. We are interested in the generalization of classical tools for signal approximation to newer spaces, such as rotation data, which is best studied in a non-Euclidean setting, and its application to activity analysis. Because of the non-linear nature of the rotation data space, feature extraction places a heavier load on the smart phone's processor and memory than feature extraction in Euclidean space; indexing and compaction of the acquired sensor data are therefore performed prior to feature extraction, to reduce CPU overhead and thereby increase the lifetime of the battery with little loss in recognition accuracy of the activities. The sensor data, represented as unit quaternions, is a more intrinsic representation of the orientation of the smart phone than Euler angles (which suffer from the gimbal lock problem) or the computationally intensive rotation matrices. Classification algorithms are employed to classify these manifold sequences in the non-Euclidean space. By performing customized indexing (using the K-means algorithm) of the evolved manifold sequences before feature extraction, considerable energy savings are achieved in terms of the smart phone's battery life.
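As a rough illustration of the indexing step described above, the following Python sketch clusters unit quaternions with a K-means-style loop and replaces each sample by the index of its nearest cluster center. It is not the thesis implementation: the function names (`canonicalize`, `kmeans_index`), the similarity measure |<q, c>|, the handling of the q / -q ambiguity, and all parameter values are assumptions made for this example.

```python
import numpy as np


def canonicalize(quats):
    """Scale rows to unit length and flip signs so the scalar part is >= 0.

    q and -q describe the same rotation, so this picks one representative.
    """
    q = np.asarray(quats, dtype=float)
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    flip = q[:, 0] < 0
    q[flip] *= -1.0
    return q


def kmeans_index(quats, k, iters=20, seed=0):
    """Assign each quaternion (w, x, y, z) to one of k cluster centers.

    Hypothetical helper: similarity is |<q, c>|, and centers are the
    renormalized mean of their sign-aligned members.
    """
    q = canonicalize(quats)
    rng = np.random.default_rng(seed)
    centers = q[rng.choice(len(q), size=k, replace=False)]
    for _ in range(iters):
        labels = np.abs(q @ centers.T).argmax(axis=1)
        for j in range(k):
            members = q[labels == j]
            if len(members) == 0:
                continue  # keep the previous center for an empty cluster
            # Align members with the center's hemisphere before averaging.
            signs = np.sign(members @ centers[j])
            signs[signs == 0] = 1.0
            mean = (members * signs[:, None]).mean(axis=0)
            centers[j] = mean / np.linalg.norm(mean)
    labels = np.abs(q @ centers.T).argmax(axis=1)
    return labels, centers
```

Under these assumptions, a long orientation sequence is reduced to a sequence of small integer indices plus a codebook of `k` centers, and later feature extraction can operate on that compact form rather than on every raw sample.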

Contributors: Sivakumar, Aswin (Author) / Turaga, Pavan (Thesis advisor) / Spanias, Andreas (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)

Created: 2014

Software techniques in the compromise of energy and accuracy

Description

Software has a great impact on the energy efficiency of any computing system: it can manage the components of a system efficiently or inefficiently. The impact of software is amplified in the context of a wearable computing system used for activity recognition. The design space this platform opens up is immense and encompasses sensors, feature calculations, activity classification algorithms, sleep schedules, and transmission protocols. Design choices in each of these areas impact energy use, overall accuracy, and usefulness of the system. This thesis explores ways in which software can influence the trade-off between energy consumption and system accuracy. In general, the more energy a system consumes, the more accurate it will be. We explore how finding the transitions between human activities can reduce the energy consumption of such systems without greatly reducing accuracy. We introduce the Log-likelihood Ratio Test as a method to detect transitions, and explore how choices of sensor, feature calculations, and parameters concerning time segmentation affect the accuracy of this method. We found that an approximately 5X increase in energy efficiency could be achieved with only a 5% decrease in accuracy. We also address how a system's sleep mode, in which the processor enters a low-power state and sensors are turned off, affects a wearable computing platform that does activity recognition. We discuss the energy trade-offs in each stage of the activity recognition process. We find that careful analysis of these parameters can result in great increases in energy efficiency if small compromises in overall accuracy can be tolerated. We call this the "Great Compromise." We found a 6X increase in efficiency with a 7% decrease in accuracy. We then consider how wireless transmission of data affects the overall energy efficiency of a wearable computing platform.
We find that design decisions such as feature calculations and grouping size have a great impact on the energy consumption of the system because of the amount of data that is stored and transmitted. For example, vector-based features such as the FFT or DCT do not compress the signal, so storing and transmitting them would use more energy than storing and transmitting the raw signal. The effect of grouping size on energy consumption depends on the feature. For scalar features, energy consumption is proportional to the inverse of grouping size, so it falls as grouping size goes up. For features that depend on the grouping size, such as the FFT, energy increases with the logarithm of grouping size, so energy consumption increases slowly as grouping size increases. We find that compressing data through activity classification and transition detection significantly reduces energy consumption and that the energy consumed for the classification overhead is negligible compared to the energy savings from data compression. We provide mathematical models of energy usage and data generation, and test our ideas using a mobile computing platform, the Texas Instruments Chronos watch.
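To make the idea of transition detection with a log-likelihood ratio test more concrete, here is a minimal Python sketch. It assumes a one-dimensional sensor feature and Gaussian models for adjacent windows, which may differ from the formulation used in the thesis; the function names (`gaussian_llr`, `detect_transitions`), the window length, and the threshold are all hypothetical choices for this example.

```python
import numpy as np


def gaussian_llr(left, right, eps=1e-12):
    """Log-likelihood ratio: two separate Gaussians vs. one pooled Gaussian.

    With maximum-likelihood fits, the ratio reduces to
    (n/2) log var(pooled) - (n1/2) log var(left) - (n2/2) log var(right).
    eps guards against log(0) for constant windows (an assumption).
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    pooled = np.concatenate([left, right])

    def half_n_log_var(x):
        return 0.5 * len(x) * np.log(np.var(x) + eps)

    return half_n_log_var(pooled) - half_n_log_var(left) - half_n_log_var(right)


def detect_transitions(signal, window, threshold):
    """Return window boundaries where the ratio exceeds the threshold."""
    signal = np.asarray(signal, dtype=float)
    hits = []
    for t in range(window, len(signal) - window + 1, window):
        score = gaussian_llr(signal[t - window:t], signal[t:t + window])
        if score > threshold:
            hits.append(t)
    return hits
```

In a scheme like this, the more expensive classification or transmission steps would only need to run near the reported boundaries, which is one way the energy savings discussed in the abstract could arise.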

Contributors: Boyd, Jeffrey Michael (Author) / Sundaram, Hari (Thesis advisor) / Li, Baoxin (Thesis advisor) / Shrivastava, Aviral (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created: 2014

"Largo al factotum" from Gioachino Rossini's Il barbiere di Siviglia: a study in ornamentation and performance practice

Description

From the time it was written, the aria "Largo al factotum" from Rossini's Il barbiere di Siviglia has been performed and ornamented in many different ways. The present study is an inventory and analysis of ornaments sung in 33 recordings from 1900 to 2011 and the major differences that they exhibit one from another. The singers in this study are baritones with international careers, who have performed the role of Figaro either at the Metropolitan Opera (New York) or at La Scala (Milan). The study identifies and tracks some of the changes in the ornamentation of the aria by noting common traits and new approaches across the one hundred eleven years of practice illustrated by the recordings.

Contributors: Briggs, Andrew Nathan (Author) / Mills, Robert (Committee member) / Oldani, Robert (Committee member) / Dreyfoos, Dale (Committee member) / FitzPatrick, Carole (Committee member) / Ryan, Russell (Committee member) / Arizona State University (Publisher)

Created: 2014

Head rotation detection in marmoset monkeys

Description

Head movement is known to improve the accuracy of sound localization for humans and animals. The marmoset is a small-bodied New World monkey species that has become an emerging model for studying auditory function. This thesis aims to detect the horizontal and vertical rotation of head movement in marmoset monkeys.

Experiments were conducted in a sound-attenuated acoustic chamber. The head movement of marmoset monkeys was studied under various auditory and visual stimulation conditions. In order of increasing complexity, these conditions are (1) idle, (2) sound alone, (3) sound and visual signals, and (4) an alert signal produced by opening and closing the chamber door. All of these conditions were tested with the house light either on or off. An infrared camera with a frame rate of 90 Hz was used to capture the head movement of the monkeys. To assist signal detection, two circular markers were attached to the top of the monkey's head. The data analysis used an image-based marker detection scheme. Images were processed using the Computer Vision Toolbox in Matlab. The markers and their positions were detected using blob detection techniques. Based on the frame-by-frame information of marker positions, the angular position, velocity and acceleration were extracted in the horizontal and vertical planes. Adaptive Otsu thresholding, Kalman filtering and bound setting for marker properties were used to overcome a number of challenges encountered during this analysis, such as finding the image segmentation threshold, continuously tracking markers during large head movements, and detecting false alarms.
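The step from frame-by-frame marker positions to angular position, velocity and acceleration can be sketched as follows. This Python example is only an illustration under simplifying assumptions: it treats the two markers as points in a single image plane, takes the head angle to be the direction of the line joining them, and uses finite differences at the 90 Hz frame rate. The function name `head_kinematics` and its arguments are hypothetical, and the thesis itself worked in Matlab.

```python
import numpy as np


def head_kinematics(marker_a, marker_b, fps=90.0):
    """Estimate in-plane head angle, angular velocity and acceleration.

    marker_a, marker_b: arrays of shape (n_frames, 2) with (x, y) positions.
    Returns angle in degrees, velocity in deg/s, acceleration in deg/s^2.
    """
    a = np.asarray(marker_a, dtype=float)
    b = np.asarray(marker_b, dtype=float)
    d = b - a
    # Direction of the marker-to-marker vector; unwrap removes 2*pi jumps.
    angle = np.degrees(np.unwrap(np.arctan2(d[:, 1], d[:, 0])))
    # Finite differences per frame, scaled by the frame rate.
    velocity = np.gradient(angle) * fps
    acceleration = np.gradient(velocity) * fps
    return angle, velocity, acceleration
```

Raw finite differences amplify detection noise, so in practice the marker positions would normally be smoothed first, for example with the Kalman filtering mentioned above.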

The results show that the blob detection method together with Kalman filtering yielded better performance than other image-based techniques such as optical flow and SURF features. The median of the maximal head turn in the horizontal plane was in the range of 20 to 70 degrees, and the median of the maximal velocity in the horizontal plane was in the range of a few hundred degrees per second. In comparison, the natural alert signal (door opening and closing) evoked faster head turns than the other stimulus conditions. These results suggest that behaviorally relevant stimuli such as alert signals evoke faster head-turn responses in marmoset monkeys.

Contributors: Simhadri, Sravanthi (Author) / Zhou, Yi (Thesis advisor) / Turaga, Pavan (Thesis advisor) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2014

Semantic sparse learning in images and videos

Description

Many learning models have been proposed for various tasks in visual computing. Popular examples include hidden Markov models and support vector machines. Recently, sparse-representation-based learning methods have attracted a lot of attention in the computer vision field, largely because of their impressive performance in many applications. In the literature, many such sparse learning methods focus on designing or applying learning techniques for a certain feature space without much explicit consideration of the possible interaction between the underlying semantics of the visual data and the employed learning technique. Rich semantic information in most visual data, if properly incorporated into algorithm design, should help achieve improved performance while delivering intuitive interpretation of the algorithmic outcomes. My study addresses the problem of how to explicitly consider the semantic information of the visual data in sparse learning algorithms. In this work, we identify four problems which are of great importance and broad interest to the community.
Specifically, a novel approach is proposed to incorporate label information to learn a dictionary which is not only reconstructive but also discriminative; considering the formation process of face images, a novel image decomposition approach for an ensemble of correlated images is proposed, where a subspace is built from the decomposition and applied to face recognition; based on the observation that the foreground (or salient) objects are sparse in the input domain and the background is sparse in the frequency domain, a novel and efficient spatio-temporal saliency detection algorithm is proposed to identify the salient regions in video; and a novel hidden Markov model learning approach is proposed that utilizes a sparse set of pairwise comparisons among the data, which are easier to obtain and more meaningful and consistent than traditional labels in many scenarios, e.g., evaluating motion skills in surgical simulations. In those four problems, different types of semantic information are modeled and incorporated in designing sparse learning algorithms for the corresponding visual computing tasks. Several real-world applications are selected to demonstrate the effectiveness of the proposed methods, including face recognition, spatio-temporal saliency detection, abnormality detection, spatio-temporal interest point detection, motion analysis and emotion recognition. In those applications, data of different modalities are involved, ranging from audio signals and images to video. Experiments on large-scale real-world data with comparisons to state-of-the-art methods confirm that the proposed approaches deliver salient advantages, showing that adding such semantic information dramatically improves the performance of general sparse learning methods.

Contributors: Zhang, Qiang (Author) / Li, Baoxin (Thesis advisor) / Turaga, Pavan (Committee member) / Wang, Yalin (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created: 2014

Context recognition methods using audio signals for human-machine interaction

Description

Audio signals, such as speech and ambient sounds, convey rich information pertaining to a user’s activity, mood or intent. Enabling machines to understand this contextual information is necessary to bridge the gap in human-machine interaction. This is challenging because of its subjective nature and hence requires sophisticated techniques. This dissertation presents a set of computational methods that generalize well across different conditions, for speech-based applications involving emotion recognition and keyword detection, and ambient sounds-based applications such as lifelogging.

The expression and perception of emotions vary across speakers and cultures; thus, features and classification methods that generalize well to different conditions are strongly desired. A latent topic models-based method is proposed to learn supra-segmental features from low-level acoustic descriptors. The derived features outperform state-of-the-art approaches over multiple databases. Cross-corpus studies are conducted to determine the ability of these features to generalize well across different databases. The proposed method is also applied to derive features from facial expressions; a multi-modal fusion overcomes the deficiencies of a speech-only approach and further improves the recognition performance.

Besides affecting the acoustic properties of speech, emotions have a strong influence over speech articulation kinematics. A learning approach that constrains a classifier trained over acoustic descriptors to also model articulatory data is proposed here. This method requires articulatory information only during the training stage, thus overcoming the challenges inherent to large-scale data collection, while simultaneously exploiting the correlations between articulation kinematics and acoustic descriptors to improve the accuracy of emotion recognition systems.

Identifying context from ambient sounds in a lifelogging scenario requires feature extraction, segmentation and annotation techniques capable of efficiently handling long duration audio recordings; a complete framework for such applications is presented. The performance is evaluated on real world data and accompanied by a prototypical Android-based user interface.

The proposed methods are also assessed in terms of computation and implementation complexity. Software and field programmable gate array based implementations are considered for emotion recognition, while virtual platforms are used to model the complexities of lifelogging. The derived metrics are used to determine the feasibility of these methods for applications requiring real-time capabilities and low power consumption.

Contributors: Shah, Mohit (Author) / Spanias, Andreas (Thesis advisor) / Chakrabarti, Chaitali (Thesis advisor) / Berisha, Visar (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created: 2015

Commissioned works for cello by composers Christian Asplund and Joseph Hallman through analytical studies

Description

The commissioning and recording of music from living composers is a very important tradition in the art of music. The ability to work with living composers gives the performer insight into the music that is far beyond reading the notes on the page. For my research paper, I commissioned two new works for the cello by the composers Joseph Hallman and Christian Asplund, in an effort to continue adding great pieces to the cello repertoire. This paper documents my experiences in finding and working with selected composers. It includes detailed descriptions of the pieces with practice and performance suggestions as well as recordings of the pieces. Commissioning new works often creates many first-hand artistic decisions for the performer as well as many new technical difficulties on the instrument. The two pieces commissioned offer insight into two different instrumentations: the sonata for cello and piano, and a solo cello suite. In this paper I describe various important aspects of these compositions and point out ways to make informed artistic decisions when approaching form, harmony, motive, and extended techniques on the cello. Providing this information on commissioning and collaborating with living composers will help continue this tradition into the future for classical music.

Contributors: Kesler, Michelle (Contributor) / Landschoot, Thomas (Committee member) / Carpenter, Ellon (Committee member) / McLin, Katherine (Committee member) / Spring, Robert (Committee member) / Ryan, Russell (Committee member) / Arizona State University (Publisher)

Created: 2014

Applied interdisciplinary concepts for designing visual media within interactive neurorehabilitation systems

Description

As the application of interactive media systems expands to address broader problems in health, education and creative practice, they fall within a higher dimensional space for which it is inherently more complex to design. In response to this need an emerging area of interactive system design, referred to as experiential media systems, applies hybrid knowledge synthesized across multiple disciplines to address challenges relevant to daily experience. Interactive neurorehabilitation (INR) aims to enhance functional movement therapy by integrating detailed motion capture with interactive feedback in a manner that facilitates engagement and sensorimotor learning for those who have suffered neurologic injury. While INR shows great promise to advance the current state of therapies, a cohesive media design methodology for INR is missing due to the present lack of substantial evidence within the field. Using an experiential media based approach to draw knowledge from external disciplines, this dissertation proposes a compositional framework for authoring visual media for INR systems across contexts and applications within upper extremity stroke rehabilitation. The compositional framework is applied across systems for supervised training, unsupervised training, and assisted reflection, which reflect the collective work of the Adaptive Mixed Reality Rehabilitation (AMRR) Team at Arizona State University, of which the author is a member. Formal structures and a methodology for applying them are described in detail for the visual media environments designed by the author. Data collected from studies conducted by the AMRR team to evaluate these systems in both supervised and unsupervised training contexts is also discussed in terms of the extent to which the application of the compositional framework is supported and which aspects require further investigation. 
The potential broader implications of the proposed compositional framework and methodology are the dissemination of interdisciplinary information to accelerate the informed development of INR applications and to demonstrate the potential benefit of generalizing integrative approaches, merging arts and science based knowledge, for other complex problems related to embodied learning.

Contributors: Lehrer, Nicole (Author) / Rikakis, Thanassis (Committee member) / Olson, Loren (Committee member) / Wolf, Steven L. (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created: 2014

Folk traditions in the solo piano music of Geirr Tveitt

Description

Geirr Tveitt (1908-1981) was a central figure of the national movement in Norwegian cultural life during the 1930s. He studied composition with masters such as Arthur Honegger, Heitor Villa-Lobos, and Nadia Boulanger, achieving international acclaim for many of his works. However, his native Norway was slow to follow this praise, as post-World War II intellectuals disregarded anything that resembled nationalism. Tveitt's music was considered obsolete. He became isolated and withdrawn and died in 1981 after a house fire destroyed the manuscripts of nearly three hundred opuses, leaving only a handful of works, some of which were not yet published. Tveitt was raised in a remote part of Norway where the folk tradition was strong. Because of his close ties with the Hardanger community, he was able to bring to light many undiscovered folk tunes and exceptional practices. Tveitt utilizes this first-hand knowledge in his works for solo piano and successfully combines it with his roots in both Germanic and Nordic traditions, eventually becoming a well-known and respected composer to the Norwegian people. However, he remains virtually unknown to the rest of the world. All of his music was deeply influenced by folk traditions and instruments. Techniques such as planing, drones, modal scales and passages, ornamentation, and simple melodies are pervasive in each piece, and are often the building blocks of main themes and motives. Because of the ambiguity of the status of many works, this paper examines only his published works for solo piano. Discussions of each piece will focus on folk influences within each work, including basic form, texture, and pianistic concerns.

Contributors: Hunter, Karali (Author) / Meir, Baruch (Thesis advisor) / Carpenter, Ellon (Committee member) / Ryan, Russell (Committee member) / Arizona State University (Publisher)

Created: 2014

Toward a "green" organ: organ building and sustainability

Description

This study examines the effectiveness of various types of alternative resources in organ building in order to determine whether a change to more sustainable materials would benefit or hinder the overall sound production of the instrument. The qualities of the metals and woods currently used in organ production (e.g. lead, walnut, etc.) have been prized for centuries, so the substitution of different, more sustainable materials must be considered with regard to the sonic alterations, as well as the financial implications, of using alternatives to make the organ more “green.”

Five organ builders were interviewed regarding their views on sustainable materials. In addition, the author consulted the websites of nine national and four international organ builders for information about sustainability; the results indicate that each organ builder defines the term somewhat differently. Decisions on the woods and metals to be used in building or refurbishing an existing organ are based more on the visual appearance, the sound desired, and the potential for reuse of existing materials. A number of sustainability practices are currently in use by organ builders in the United States and Europe. These include the reuse of transportation boxes, efforts towards recycled metal and wood pipework, and the use of high efficiency lighting.

The investigations into sustainable practice that are presented here document a variety of approaches to sustainability in organ building in the United States, Canada and Europe. This research should assist in the evaluation of further efforts to conserve valuable resources while ensuring the high quality of sound that has characterized the organ throughout its long history.

Contributors: Gregoire, Jonathan M (Author) / Marshall, Kimberly (Thesis advisor) / Feisst, Sabine (Committee member) / Ryan, Russell (Committee member) / Arizona State University (Publisher)

Created: 2014