Theses and Dissertations
Filtering by
- All Subjects: activity recognition
- All Subjects: lifelogging
- Creators: Turaga, Pavan
The expression and perception of emotions varies across speakers and cultures, thus, determining features and classification methods that generalize well to different conditions is strongly desired. A latent topic models-based method is proposed to learn supra-segmental features from low-level acoustic descriptors. The derived features outperform state-of-the-art approaches over multiple databases. Cross-corpus studies are conducted to determine the ability of these features to generalize well across different databases. The proposed method is also applied to derive features from facial expressions; a multi-modal fusion overcomes the deficiencies of a speech only approach and further improves the recognition performance.
Besides affecting the acoustic properties of speech, emotions have a strong influence over speech articulation kinematics. A learning approach, which constrains a classifier trained over acoustic descriptors, to also model articulatory data is proposed here. This method requires articulatory information only during the training stage, thus overcoming the challenges inherent to large-scale data collection, while simultaneously exploiting the correlations between articulation kinematics and acoustic descriptors to improve the accuracy of emotion recognition systems.
Identifying context from ambient sounds in a lifelogging scenario requires feature extraction, segmentation and annotation techniques capable of efficiently handling long duration audio recordings; a complete framework for such applications is presented. The performance is evaluated on real world data and accompanied by a prototypical Android-based user interface.
The proposed methods are also assessed in terms of computation and implementation complexity. Software and field programmable gate array based implementations are considered for emotion recognition, while virtual platforms are used to model the complexities of lifelogging. The derived metrics are used to determine the feasibility of these methods for applications requiring real-time capabilities and low power consumption.
Human activity recognition is the task of identifying a person’s movement from sensors in a wearable device, such as a smartphone, smartwatch, or a medical-grade device. A great method for this task is machine learning, which is the study of algorithms that learn and improve on their own with the help of massive amounts of useful data. These classification models can accurately classify activities with the time-series data from accelerometers and gyroscopes. A significant way to improve the accuracy of these machine learning models is preprocessing the data, essentially augmenting data to make the identification of each activity, or class, easier for the model. <br/>On this topic, this paper explains the design of SigNorm, a new web application which lets users conveniently transform time-series data and view the effects of those transformations in a code-free, browser-based user interface. The second and final section explains my take on a human activity recognition problem, which involves comparing a preprocessed dataset to an un-augmented one, and comparing the differences in accuracy using a one-dimensional convolutional neural network to make classifications.