This collection includes most of the ASU Theses and Dissertations from 2011 to the present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about each dissertation or thesis includes degree information, committee members, an abstract, and any supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.

Description

This work addresses the following four problems: (i) Will a blockage occur in the near future? (ii) When will this blockage occur? (iii) What is the type of the blockage? (iv) What is the direction of the moving blockage? The proposed solution utilizes deep neural networks (DNNs) as well as non-machine-learning (non-ML) algorithms. At the heart of the proposed method is the identification of special patterns in the received signal and sensory data that appear before a blockage occurs (pre-blockage signatures), and the use of these signatures to infer future blockages. To evaluate the proposed approach, real-world datasets are first built for both in-band mmWave systems and LiDAR-aided mmWave systems based on the DeepSense 6G structure. In particular, for the in-band mmWave system, two real-world datasets are constructed: one for an indoor scenario and the other for an outdoor scenario. DNN models are then developed to proactively predict the incoming blockages in both scenarios. For LiDAR-aided blockage prediction, a large-scale real-world dataset that includes co-existing LiDAR and mmWave communication measurements is constructed for outdoor scenarios. Then, an efficient LiDAR data denoising (static cluster removal) algorithm is designed to clean the dataset. Finally, a non-ML method and a DNN model that proactively predict dynamic link blockages are developed. Experiments using the in-band mmWave datasets show that the proposed approach can successfully predict the occurrence of future dynamic blockages (up to 5 s) with more than 80% accuracy in the indoor scenario. Further, for the outdoor scenario with highly mobile vehicular blockages, the proposed model can predict the exact time of the future blockage with less than 100 ms error for blockages happening within the next 600 ms. The proposed method can also predict the size and moving direction of the blockages.
For the co-existing LiDAR and mmWave real-world dataset, our LiDAR-aided approach achieves above 95% accuracy in predicting blockages occurring within 100 ms and more than 80% prediction accuracy for blockages occurring within one second. Further, for the outdoor scenario with highly mobile vehicular blockages, the proposed model can predict the exact time of the future blockage with less than 150 ms error for blockages happening within one second. In addition, our method achieves above 92% accuracy in classifying the type of blockage and above 90% accuracy in predicting the blockage's moving direction. The proposed solutions can potentially provide an order-of-magnitude saving in network latency, highlighting a promising approach for addressing the blockage challenges in mmWave/sub-THz networks.

Contributors: Wu, Shunyao (Author) / Chakrabarti, Chaitali (Thesis advisor) / Alkhateeb, Ahmed (Committee member) / Bliss, Daniel (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
High-level inference tasks in video applications such as recognition, video retrieval, and zero-shot classification have become an active research area in recent years. One fundamental requirement for such applications is to extract high-quality features that maintain high-level information in the videos.

Many video feature extraction algorithms have been proposed, such as STIP, HOG3D, and Dense Trajectories. These algorithms are often referred to as “handcrafted” features because they were deliberately designed based on reasonable considerations. However, they may fail when dealing with high-level tasks or videos of complex scenes. Following the success of deep convolutional neural networks (CNNs) in extracting global representations for static images, researchers have been using similar techniques to tackle video content. Typical techniques first extract spatial features by processing raw images using deep convolutional architectures designed for static image classification. Then simple averaging, concatenation, or classifier-based fusion/pooling methods are applied to the extracted features. I argue that features extracted in such ways do not capture enough representative information, since videos, unlike images, should be characterized as a temporal sequence of semantically coherent visual content and thus need to be represented in a manner that considers both semantic and spatio-temporal information.

In this thesis, I propose a novel architecture to learn a semantic spatio-temporal embedding for videos to support high-level video analysis. The proposed method encodes video spatial and temporal information separately by employing a deep architecture consisting of two channels of convolutional neural networks (capturing appearance and local motion), each followed by a corresponding Fully Connected Gated Recurrent Unit (FC-GRU) encoder that captures the longer-term temporal structure of the CNN features. The resultant spatio-temporal representation (a vector) is used to learn a mapping via a Fully Connected Multilayer Perceptron (FC-MLP) to the word2vec semantic embedding space, leading to a semantic interpretation of the video vector that supports high-level analysis. I evaluate the usefulness and effectiveness of this new video representation by conducting experiments on action recognition, zero-shot video classification, and semantic (word-to-video) video retrieval, using the UCF101 action recognition dataset.
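
As a rough illustration of why mapping a video vector into a word-embedding space can support zero-shot classification, the sketch below assigns a video to the class whose name vector is closest by cosine similarity. This is only a minimal sketch under stated assumptions: the three-dimensional vectors are made-up stand-ins rather than real word2vec embeddings, the function names `cosine_similarity` and `nearest_label` are hypothetical, and the abstract does not say which similarity measure the thesis actually uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def nearest_label(video_vec, label_vecs):
    """Return the label whose vector is most similar to video_vec."""
    return max(
        label_vecs,
        key=lambda name: cosine_similarity(video_vec, label_vecs[name]),
    )


# Toy stand-ins for class-name word vectors (hypothetical values).
label_vecs = {
    "archery": [0.9, 0.1, 0.0],
    "bowling": [0.1, 0.9, 0.1],
    "surfing": [0.0, 0.2, 0.9],
}

# Pretend this is a video representation already mapped into the same space.
video_vec = [0.2, 0.8, 0.2]

predicted = nearest_label(video_vec, label_vecs)
print(predicted)
```

Because the class vectors come from text rather than from labeled videos, a class with no training videos can in principle still be selected, which is the general idea behind zero-shot classification in a shared embedding space.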
Contributors: Hu, Sheng-Hung (Author) / Li, Baoxin (Thesis advisor) / Turaga, Pavan (Committee member) / Liang, Jianming (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
With the emergence of the edge computing paradigm, many applications such as image recognition and augmented reality need to perform machine learning (ML) and artificial intelligence (AI) tasks on edge devices. Most AI and ML models are large and computationally heavy, whereas edge devices are usually equipped with limited computational and storage resources. Such models can be compressed and reduced in order to be placed on edge devices, but they may lose their capability and may not generalize or perform as well as large models. Recent works have used knowledge transfer techniques to transfer information from a large network (termed the teacher) to a small one (termed the student) in order to improve the performance of the latter. This approach seems promising for learning on edge devices, but a thorough investigation of its effectiveness is lacking.

The purpose of this work is to provide an extensive study on the performance (both in terms of accuracy and convergence speed) of knowledge transfer, considering different student-teacher architectures, datasets and different techniques for transferring knowledge from teacher to student.

A good performance improvement is obtained by transferring knowledge from both the intermediate layers and the last layer of the teacher to a shallower student. However, other architectures and transfer techniques do not fare as well, and some of them even have a negative impact on performance. For example, a smaller and shorter network trained with knowledge transfer on Caltech 101 achieved a significant accuracy improvement of 7.36% and converged 16 times faster than the same network trained without knowledge transfer. On the other hand, a smaller network that is thinner than the teacher network performed worse, with an accuracy drop of 9.48% on Caltech 101, even with knowledge transfer.
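
For readers unfamiliar with teacher-student knowledge transfer, the sketch below shows one widely used form of last-layer transfer: the student is penalized by the KL divergence between temperature-softened teacher and student output distributions. This is an assumption-laden illustration, not the thesis's method: the abstract does not specify its loss functions, the names `softmax` and `distillation_loss` and the temperature value are chosen here for illustration, and the intermediate-layer transfer mentioned above is not covered.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) computed on temperature-softened outputs."""
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

print(distillation_loss(close_student, teacher))
print(distillation_loss(far_student, teacher))
```

In practice a term like this is usually combined with the ordinary classification loss on the true labels; a student whose outputs already resemble the teacher's incurs a smaller penalty than one that disagrees with it.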
Contributors: Sistla, Ragini (Author) / Zhao, Ming (Thesis advisor, Committee member) / Li, Baoxin (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)
Created: 2018