Applying a Novel Integrated Persistent Feature to Understand Topographical Network Connectivity in Older Adults with Autism Spectrum Disorder

Description

Autism spectrum disorder (ASD) is a developmental neuropsychiatric condition with early childhood onset; thus, most research has focused on characterizing brain function in young individuals. Little is understood about brain function differences in middle-aged and older adults with ASD, despite evidence of persistent and worsening cognitive symptoms. Functional magnetic resonance imaging (fMRI) studies in younger persons with ASD demonstrate that large-scale brain networks containing the prefrontal cortex are affected. A novel, threshold-selection-free graph theory metric is proposed as a more robust and sensitive method for tracking brain aging in ASD and is compared against five well-accepted graph theoretical analysis methods in older men with ASD and matched neurotypical (NT) participants. Participants were 27 men with ASD (52 ± 8.4 years) and 21 NT men (49.7 ± 6.5 years). Resting-state functional MRI (rs-fMRI) scans were collected for six minutes (repetition time = 3 s) with eyes closed. Data were preprocessed in SPM12, and the Data Processing Assistant for Resting-State fMRI (DPARSF) was used to extract 116 regions of interest defined by the automated anatomical labeling (AAL) atlas. AAL regions were separated into six large-scale brain networks. The proposed metric is the slope of a monotonically decreasing convergence function (Integrated Persistent Feature, IPF; Slope of the IPF, SIP). Results were analyzed in SPSS using ANCOVA, with IQ as a covariate. A reduced SIP was found in older men with ASD, compared to NT men, in the Default Mode Network [F(1,47)=6.48; p=0.02; η²=0.13] and Executive Network [F(1,47)=4.40; p=0.04; η²=0.09], with a trend in the Fronto-Parietal Network [F(1,47)=3.36; p=0.07; η²=0.07]. There were no differences in the non-prefrontal networks (sensory motor network, auditory network, and medial visual network).
The only other graph theory metric to reach significance was network diameter in the Default Mode Network [F(1,47)=4.31; p=0.04; η²=0.09]; however, the effect size for the SIP was stronger. Modularity, Betti number, characteristic path length, and eigenvalue centrality were all non-significant. These results provide empirical evidence of decreased functional network integration in prefrontal networks of older adults with ASD and propose a useful biomarker for tracking the prognosis of aging adults with ASD, enabling more informed treatment, support, and care methods for this growing population.

Contributors: Catchings, Michael Thomas (Author) / Braden, Brittany B (Thesis advisor) / Greger, Bradley (Thesis advisor) / Schaefer, Sydney (Committee member) / Arizona State University (Publisher)

Created: 2019

Localized Application for Video Capture for a Multimedia Sensor Node with Name-Based Segment Streaming

Description

The Internet of Things (IoT) has become a more pervasive part of everyday life. IoT networks, such as wireless sensor networks, depend greatly on limiting unnecessary power consumption. As such, providing low-power, adaptable software can greatly improve network design. For streaming live video content, Wireless Video Sensor Network Platform compatible Dynamic Adaptive Streaming over HTTP (WVSNP-DASH) aims to revolutionize wireless segmented video streaming by providing a low-power, adaptable framework to compete with modern DASH players such as Moving Picture Experts Group DASH (MPEG-DASH) and Apple’s Hypertext Transfer Protocol (HTTP) Live Streaming (HLS). Each segment is independently playable and does not depend on a manifest file, resulting in greatly improved power performance. My work was to show that WVSNP-DASH is capable of further power savings at the level of the wireless sensor node itself if a native capture program is implemented at the camera sensor node. I created a native capture program in the C language that fulfills the name-based segmentation requirements of WVSNP-DASH. I present this program with the intent to measure its power consumption on a hardware test-bed in the future. To my knowledge, this is the first program to generate WVSNP-DASH playable video segments. The results show that the program could be utilized by WVSNP-DASH, but there are issues with its efficiency, so an outline for further improvements is also provided.

Contributors: Khan, Zarah (Author) / Reisslein, Martin (Thesis advisor) / Seema, Adolph (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)

Created: 2018

Characterization of Energy and Performance Bottlenecks in an Omni-directional Camera System

Description

Generating real-world content for VR is challenging in terms of capturing and processing at high resolution and high frame rates. The content needs to represent a truly immersive experience, where the user can look around in a 360-degree view and perceive the depth of the scene. The existing solutions only capture and offload the compute load to the server. But offloading large amounts of raw camera feeds incurs high latency and poses difficulties for real-time applications. By capturing and computing on the edge, we can closely integrate the systems and optimize for low latency. However, moving the traditional stitching algorithms to a battery-constrained device requires a reduction in power of at least three orders of magnitude. We believe that close integration of the capture and compute stages will lead to reduced overall system power.

We approach the problem by building a hardware prototype and characterizing the end-to-end system bottlenecks of power and performance. The prototype has 6 IMX274 cameras and uses an Nvidia Jetson TX2 development board for capture and computation. We found that capturing is bottlenecked by sensor power and data rates across interfaces, whereas compute is limited by the total number of computations per frame. Our characterization shows that redundant capture and redundant computations lead to high power, a huge memory footprint, and high latency. The existing systems lack hardware-software co-design aspects, leading to excessive data transfers across the interfaces and expensive computations within the individual subsystems. Finally, we propose mechanisms to optimize the system for low power and low latency. We emphasize the importance of co-design of different subsystems to reduce and reuse the data. For example, reusing the motion vectors of the ISP stage reduces the memory footprint of the stereo correspondence stage. Our estimates show that pipelining and parallelization on a custom FPGA can achieve real-time stitching.

Contributors: Gunnam, Sridhar (Author) / LiKamWa, Robert (Thesis advisor) / Turaga, Pavan (Committee member) / Jayasuriya, Suren (Committee member) / Arizona State University (Publisher)

Created: 2018

Joint Optimization of Quantization and Structured Sparsity for Compressed Deep Neural Networks

Description

Deep neural networks (DNNs) have shown tremendous success in various cognitive tasks, such as image classification and speech recognition. However, their usage on resource-constrained edge devices has been limited due to high computation and large memory requirements.

To overcome these challenges, recent works have extensively investigated model compression techniques such as element-wise sparsity, structured sparsity, and quantization. While most of these works have applied these compression techniques in isolation, there have been very few studies on the application of quantization and structured sparsity together on a DNN model.

This thesis co-optimizes structured sparsity and quantization constraints on DNN models during training. Specifically, it obtains an optimal setting of 2-bit weights and 2-bit activations coupled with 4X structured compression by performing a combined exploration of quantization and structured compression settings. The optimal DNN model achieves a 50X weight memory reduction compared to a floating-point uncompressed DNN. This memory saving is significant, since applying only structured sparsity constraints achieves 2X memory savings and applying only quantization constraints achieves 16X memory savings. The algorithm has been validated on both high- and low-capacity DNNs and on wide-sparse and deep-sparse DNN models. Experiments demonstrated that a deep-sparse DNN outperforms a shallow-dense DNN, with varying levels of memory savings depending on DNN precision and sparsity levels. This work further proposed a Pareto-optimal approach to systematically extract optimal DNN models from a huge set of sparse and dense DNN models. The resulting 11 optimal designs were further evaluated by considering overall DNN memory, which includes activation memory and weight memory. It was found that there is only a small change in the memory footprint of the optimal designs corresponding to the low-sparsity DNNs. However, activation memory cannot be ignored for high-sparsity DNNs.
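As a rough illustration of what Pareto-optimal extraction means here, the sketch below keeps only the designs that no other design beats on both memory and error. It is a minimal sketch, not the thesis's procedure: the two objectives (`memory_kb`, `error_pct`), the function name `pareto_front`, and the candidate values are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (name, memory_kb, error_pct): hypothetical fields, lower is better for both
Design = Tuple[str, float, float]


def pareto_front(designs: List[Design]) -> List[Design]:
    """Return the designs that are not dominated by any other design.

    A design is dominated if another design is no worse on both objectives
    and strictly better on at least one of them.
    """
    front = []
    for name, mem, err in designs:
        dominated = any(
            (m <= mem and e <= err) and (m < mem or e < err)
            for _, m, e in designs
        )
        if not dominated:
            front.append((name, mem, err))
    # Sort by memory so the trade-off curve reads left to right.
    return sorted(front, key=lambda d: d[1])


# Illustrative candidates only; these are not results from the thesis.
candidates = [
    ("A", 100.0, 2.0),
    ("B", 50.0, 3.0),
    ("C", 120.0, 2.5),  # worse than A on both objectives
    ("D", 50.0, 4.0),   # same memory as B but higher error
    ("E", 25.0, 6.0),
]

for design in pareto_front(candidates):
    print(design)
```

With more than two objectives (for example, adding activation memory as a separate axis), the same dominance test extends by comparing every objective.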

Contributors: Srivastava, Gaurav (Author) / Seo, Jae-Sun (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2018

Stagioni: Temperature management to enable near-sensor processing for performance, fidelity, and energy-efficiency of vision and imaging workloads

Description

Vision processing on traditional architectures is inefficient due to energy-expensive off-chip data movements. Many researchers advocate pushing processing close to the sensor to substantially reduce data movements. However, continuous near-sensor processing raises the sensor temperature, impairing the fidelity of imaging/vision tasks.

The work characterizes the thermal implications of using 3D stacked image sensors with near-sensor vision processing units. The characterization reveals that near-sensor processing reduces system power but degrades image quality. For reasonable image fidelity, the sensor temperature needs to stay below a threshold, situationally determined by application needs. Fortunately, the characterization also identifies opportunities -- unique to the needs of near-sensor processing -- to regulate temperature based on dynamic visual task requirements and rapidly increase capture quality on demand.

Based on the characterization, the work proposes and investigates two thermal management strategies -- stop-capture-go and seasonal migration -- for imaging-aware thermal management. The work presents parameters that govern the policy decisions and explores the trade-offs between system power and policy overhead. The work's evaluation shows that the novel dynamic thermal management strategies can unlock the energy-efficiency potential of near-sensor processing with minimal performance impact, without compromising image fidelity.

Contributors: Kodukula, Venkatesh (Author) / LiKamWa, Robert (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Brunhaver, John (Committee member) / Arizona State University (Publisher)

Created: 2019

Energy Efficient Hardware Design of Neural Networks

Description

Hardware implementation of deep neural networks is gaining significant importance. Deep neural networks are mathematical models that use learning algorithms inspired by the brain. Numerous deep learning algorithms, such as multi-layer perceptrons (MLP), have demonstrated human-level recognition accuracy in image and speech classification tasks. These networks are built from multiple layers of processing elements called neurons, with several connections between them called synapses. Hence, they involve operations that exhibit a high level of parallelism, making them computationally and memory intensive. Constrained by computing resources and memory, most applications require a neural network that uses less energy. Energy-efficient implementation of these computationally intense algorithms on neuromorphic hardware demands many architectural optimizations. One of these optimizations is reducing the network size using compression, and several studies have investigated compression by introducing element-wise or row-/column-/block-wise sparsity via pruning and regularization. Additionally, numerous recent works have concentrated on reducing the precision of activations and weights, with some reducing to a single bit. However, combining various sparsity structures with binarized or very-low-precision (2-3 bit) neural networks has not been comprehensively explored. Output activations in these deep neural network algorithms are typically non-binary, making it difficult to exploit sparsity. On the other hand, biologically realistic models like spiking neural networks (SNN) closely mimic the operations in biological nervous systems and explore new avenues for brain-like cognitive computing. These networks deal with binary spikes, and they can exploit the input-dependent sparsity or redundancy to dynamically scale the amount of computation, in turn leading to energy-efficient hardware implementation.
This work discusses a configurable spiking neuromorphic architecture that supports multiple hidden layers by exploiting hardware reuse. It also presents design techniques for minimum-area/-energy DNN hardware with minimal degradation in accuracy. Area, performance, and energy results for the DNN and SNN hardware are reported for the MNIST dataset. The neuromorphic hardware designed for the SNN algorithm in 28nm CMOS demonstrates high classification accuracy (>98% on MNIST) and low energy (51.4-773 nJ per classification). The optimized DNN hardware designed in 40nm CMOS, which combines 8X structured compression and 3-bit weight precision, showed 98.4% accuracy at 33 nJ per classification.

Contributors: Kolala Venkataramanaiah, Shreyas (Author) / Seo, Jae-Sun (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Cao, Yu (Committee member) / Arizona State University (Publisher)

Created: 2018

Concurrent Checkpointing for Embedded Real-Time Systems

Description

The Internet of Things ecosystem has spawned a wide variety of embedded real-time systems that complicate the identification and resolution of bugs in software. Concurrent checkpointing methods provide a means to monitor the application state, with the ability to replay the execution on like hardware and software, without holding off and delaying the execution of application threads. In this thesis, this is accomplished by monitoring the physical memory of the application using a soft-dirty page tracker and measuring the various types of overhead incurred when employing concurrent checkpointing. The solution presented is an advancement of Checkpoint/Restore In Userspace (CRIU) that eliminates the large stalls and parasitic operation for each successive checkpoint. Impact and performance are measured using the PARSEC 3.0 benchmark suite and the 4.11.12-rt16+ Linux kernel on a MinnowBoard Turbot Quad-Core board.
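For readers unfamiliar with soft-dirty tracking, the sketch below shows how the Linux soft-dirty bit can be decoded from raw `/proc/PID/pagemap` data. It relies on the documented kernel interface (each pagemap entry is a 64-bit word, bit 55 is the soft-dirty flag, bit 63 is the present flag, and writing `4` to `/proc/PID/clear_refs` resets the soft-dirty bits). The function names and the synthetic test buffer are assumptions made for illustration; the thesis's own tracker is not shown in the abstract and may work differently.

```python
import struct

SOFT_DIRTY_BIT = 55   # set when the page was written since the last clear_refs reset
PRESENT_BIT = 63      # set when the page is present in RAM
ENTRY_SIZE = 8        # each pagemap entry is one 64-bit word


def decode_pagemap_entry(entry: int) -> dict:
    """Extract the two flags of interest from one 64-bit pagemap entry."""
    return {
        "present": bool((entry >> PRESENT_BIT) & 1),
        "soft_dirty": bool((entry >> SOFT_DIRTY_BIT) & 1),
    }


def soft_dirty_pages(raw: bytes, first_page_index: int = 0) -> list:
    """Return the indices of pages whose soft-dirty bit is set.

    `raw` is a byte string as read from /proc/PID/pagemap starting at the
    entry for `first_page_index` (entries are in native byte order).
    """
    count = len(raw) // ENTRY_SIZE
    entries = struct.unpack(f"={count}Q", raw[: count * ENTRY_SIZE])
    return [
        first_page_index + i
        for i, entry in enumerate(entries)
        if (entry >> SOFT_DIRTY_BIT) & 1
    ]


# Synthetic buffer standing in for pagemap data: three pages, only the
# second one has been written to since the last reset.
fake_pagemap = struct.pack(
    "=3Q",
    1 << PRESENT_BIT,
    (1 << PRESENT_BIT) | (1 << SOFT_DIRTY_BIT),
    0,
)
print(soft_dirty_pages(fake_pagemap))
```

An incremental checkpointer built on this idea would copy only the pages reported as soft-dirty, then reset the bits before the next interval.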

Contributors: Prinke, Michael L (Author) / Lee, Yann-Hang (Thesis advisor) / Shrivastava, Aviral (Committee member) / Zhao, Ming (Committee member) / Arizona State University (Publisher)

Created: 2018

Predicting and Interpreting Students Performance using Supervised Learning and Shapley Additive Explanations

Description

Due to the large data resources generated by online educational applications, Educational Data Mining (EDM) has improved learning outcomes in different ways: student visualization, recommendations for students, student modeling, grouping students, etc. Many programming assignments have features such as automated submission and test-case checking to verify correctness, but few studies have compared different statistical techniques with the latest frameworks and interpreted the models in a unified approach.

In this thesis, several data mining algorithms have been applied to analyze students’ code assignment submission data from a real classroom study. The goal of this work is to explore and predict students’ performance. Multiple machine learning models were evaluated for accuracy and interpreted using Shapley Additive Explanations.

Cross-validation shows that the Gradient Boosting Decision Tree has the best precision, 85.93%, with an average of 82.90%. Features such as Component grade, Due Date, and Submission Times have a higher impact than others. The baseline model achieved lower precision because it lacks non-linear fitting.

Contributors: Tian, Wenbo (Author) / Hsiao, Ihan (Thesis advisor) / Bazzi, Rida (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created: 2019

Modeling and measuring cognitive load to reduce driver distraction in smart cars

Description

Driver distraction research has a long history spanning nearly 50 years, intensifying in the last decade. The focus has always been on identifying the distractive tasks and measuring the respective harm level. As in-vehicle technology advances, the list of distractive activities grows along with crash risk. Additionally, the distractive activities become more common and complicated, especially with regard to the in-car interactive system. This work's main focus is on driver distraction caused by the in-car interactive system. Many user interaction designs (buttons, speech, visual) for human-car communication have been used, both in the past and at present. All related studies suggest that the driver distraction level is still high and that a better design is needed. Multimodal Interaction (MMI) is a design approach that relies on multiple modes for humans to interact with the car, thereby reducing driver distraction by allowing the driver to choose the most suitable mode with minimum distraction. Additionally, combining multiple modes simultaneously provides more natural interaction, which could lead to less distraction. The main goal of MMI is to enable the driver to be more attentive to driving tasks and spend less time fiddling with distractive tasks. An engineering-based method is used to measure driver distraction. This method uses metrics such as reaction time, acceleration, and lane departure obtained from test cases.

Contributors: Jahagirdar, Tanvi (Author) / Gaffar, Ashraf (Thesis advisor) / Ghazarian, Arbi (Committee member) / Gray, Robert (Committee member) / Arizona State University (Publisher)

Created: 2015

Neutron-gamma ray discrimination using normalized cross correlation

Description

The reduced availability of 3He is a motivation for developing alternative neutron detectors. 6Li-enriched CLYC (Cs2LiYCl6), a scintillator, is a promising candidate to replace 3He. The neutron and gamma ray signals from CLYC have different shapes due to the slower decay of neutron pulses. Well-known pulse shape discrimination techniques include the charge comparison method, the pulse gradient method, and the frequency gradient method. In the work presented here, we have applied a normalized cross correlation (NCC) approach to real neutron and gamma ray pulses produced by exposing CLYC scintillators to a mixed radiation environment generated by 137Cs, 22Na, 57Co and 252Cf/AmBe at different event rates. The cross correlation analysis produces distinctive results for measured neutron pulses and gamma ray pulses when they are cross correlated with reference neutron and/or gamma templates. NCC produces good separation between neutrons and gamma rays at low (< 100 kHz) to mid (< 200 kHz) event rates. However, the separation disappears at high event rates (> 200 kHz) because of pileup, noise, and baseline shift. This is also confirmed by observing the pulse shape discrimination (PSD) plots and the figure of merit (FOM) of NCC. The FOM is close to 3, which is good, at low event rates, but it rolls off significantly as the event rate increases and reaches 1 at high event rates. Future efforts are required to reduce the noise by using a better hardware system, to remove pileup, and to detect the NCC shapes of neutrons and gamma rays using advanced techniques.
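The idea of discriminating pulses by correlating them against reference templates can be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis's implementation: it computes a zero-lag normalized cross correlation without mean subtraction, models pulses as ideal single exponentials, and uses made-up decay constants (20 and 80 samples) rather than measured CLYC pulse shapes. The names `ncc`, `exp_pulse`, and `classify` are hypothetical.

```python
import math


def ncc(pulse, template):
    """Zero-lag normalized cross correlation of two equal-length sequences.

    Returns a value in [-1, 1]; 1 means the shapes are identical up to scale.
    """
    num = sum(p * t for p, t in zip(pulse, template))
    den = math.sqrt(sum(p * p for p in pulse) * sum(t * t for t in template))
    return num / den if den else 0.0


def exp_pulse(decay, n=200):
    """Idealized pulse: a single exponential decay (illustrative only)."""
    return [math.exp(-i / decay) for i in range(n)]


# Hypothetical reference templates: gamma pulses decay faster than neutron pulses.
gamma_template = exp_pulse(20.0)
neutron_template = exp_pulse(80.0)


def classify(pulse):
    """Label a pulse by whichever template it correlates with more strongly."""
    if ncc(pulse, neutron_template) > ncc(pulse, gamma_template):
        return "neutron"
    return "gamma"


print(classify(exp_pulse(70.0)))  # slow decay, closer to the neutron template
print(classify(exp_pulse(25.0)))  # fast decay, closer to the gamma template
```

On real data, pileup and baseline shift distort the pulse shape, which is consistent with the loss of separation the abstract reports at high event rates.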

Contributors: Chandhran, Premkumar (Author) / Holbert, Keith E. (Thesis advisor) / Spanias, Andreas (Committee member) / Ogras, Umit Y. (Committee member) / Arizona State University (Publisher)

Created: 2015