Matching Items (2)
158066-Thumbnail Image.png
Description
Recently, a well-designed and well-trained neural network can yield state-of-the-art results across many domains, including data mining, computer vision, and medical image analysis. But progress has been limited for tasks where labels are difficult or impossible to obtain. This reliance on exhaustive labeling is a critical limitation in the rapid

Recently, a well-designed and well-trained neural network can yield state-of-the-art results across many domains, including data mining, computer vision, and medical image analysis. But progress has been limited for tasks where labels are difficult or impossible to obtain. This reliance on exhaustive labeling is a critical limitation in the rapid deployment of neural networks. Besides, the current research scales poorly to a large number of unseen concepts and is passively spoon-fed with data and supervision.

To overcome the above data scarcity and generalization issues, in my dissertation, I first propose two unsupervised conventional machine learning algorithms, hyperbolic stochastic coding, and multi-resemble multi-target low-rank coding, to solve the incomplete data and missing label problem. I further introduce a deep multi-domain adaptation network to leverage the power of deep learning by transferring the rich knowledge from a large-amount labeled source dataset. I also invent a novel time-sequence dynamically hierarchical network that adaptively simplifies the network to cope with the scarce data.

To learn a large number of unseen concepts, lifelong machine learning enjoys many advantages, including abstracting knowledge from prior learning and using the experience to help future learning, regardless of how much data is currently available. Incorporating this capability and making it versatile, I propose deep multi-task weight consolidation to accumulate knowledge continuously and significantly reduce data requirements in a variety of domains. Inspired by the recent breakthroughs in automatically learning suitable neural network architectures (AutoML), I develop a nonexpansive AutoML framework to train an online model without the abundance of labeled data. This work automatically expands the network to increase model capability when necessary, then compresses the model to maintain the model efficiency.

In my current ongoing work, I propose an alternative method of supervised learning that does not require direct labels. This could utilize various supervision from an image/object as a target value for supervising the target tasks without labels, and it turns out to be surprisingly effective. The proposed method only requires few-shot labeled data to train, and can self-supervised learn the information it needs and generalize to datasets not seen during training.
ContributorsZhang, Jie (Author) / Wang, Yalin (Thesis advisor) / Liu, Huan (Committee member) / Stonnington, Cynthia (Committee member) / Liang, Jianming (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2020
158769-Thumbnail Image.png
Description
The rapid advancement of Deep Neural Networks (DNNs), computing, and sensing technology has enabled many new applications, such as the self-driving vehicle, the surveillance drone, and the robotic system. Compared to conventional edge devices (e.g. cell phone or smart home devices), these emerging devices are required to deal with much

The rapid advancement of Deep Neural Networks (DNNs), computing, and sensing technology has enabled many new applications, such as the self-driving vehicle, the surveillance drone, and the robotic system. Compared to conventional edge devices (e.g. cell phone or smart home devices), these emerging devices are required to deal with much more complicated and dynamic situations in real-time with bounded computation resources. However, there are several challenges, including but not limited to efficiency, real-time adaptation, model stability, and automation of architecture design.

To tackle the challenges mentioned above, model plasticity and stability are leveraged to achieve efficient and online deep learning, especially in the scenario of learning streaming data at the edge:

First, a dynamic training scheme named Continuous Growth and Pruning (CGaP) is proposed to compress the DNNs through growing important parameters and pruning unimportant ones, achieving up to 98.1% reduction in the number of parameters.

Second, this dissertation presents Progressive Segmented Training (PST), which targets catastrophic forgetting problems in continual learning through importance sampling, model segmentation, and memory-assisted balancing. PST achieves state-of-the-art accuracy with 1.5X FLOPs reduction in the complete inference path.

Third, to facilitate online learning in real applications, acquisitive learning (AL) is further proposed to emphasize both knowledge inheritance and acquisition: the majority of the knowledge is first pre-trained in the inherited model and then adapted to acquire new knowledge. The inherited model's stability is monitored by noise injection and the landscape of the loss function, while the acquisition is realized by importance sampling and model segmentation. Compared to a conventional scheme, AL reduces accuracy drop by >10X on CIFAR-100 dataset, with 5X reduction in latency per training image and 150X reduction in training FLOPs.

Finally, this dissertation presents evolutionary neural architecture search in light of model stability (ENAS-S). ENAS-S uses a novel fitness score, which addresses not only the accuracy but also the model stability, to search for an optimal inherited model for the application of continual learning. ENAS-S outperforms hand-designed DNNs when learning from a data stream at the edge.

In summary, in this dissertation, several algorithms exploiting model plasticity and model stability are presented to improve the efficiency and accuracy of deep neural networks, especially for the scenario of continual learning.
ContributorsDu, Xiaocong (Author) / Cao, Yu (Thesis advisor) / Seo, Jae-Sun (Committee member) / Chakrabarti, Chaitali (Committee member) / Fan, Deliang (Committee member) / Arizona State University (Publisher)
Created2020