Search Content

Stochastic optimization and real-time scheduling in cyber-physical systems

Description

A principal goal of this dissertation is to study stochastic optimization and real-time scheduling in cyber-physical systems (CPSs) ranging from real-time wireless systems to energy systems to distributed control systems. Under this common theme, this dissertation can be broadly organized into three parts based on the system environments. The first…

A principal goal of this dissertation is to study stochastic optimization and real-time scheduling in cyber-physical systems (CPSs) ranging from real-time wireless systems to energy systems to distributed control systems. Under this common theme, this dissertation can be broadly organized into three parts based on the system environments. The first part investigates stochastic optimization in real-time wireless systems, with the focus on the deadline-aware scheduling for real-time traffic. The optimal solution to such scheduling problems requires to explicitly taking into account the coupling in the deadline-aware transmissions and stochastic characteristics of the traffic, which involves a dynamic program that is traditionally known to be intractable or computationally expensive to implement. First, real-time scheduling with adaptive network coding over memoryless channels is studied, and a polynomial-time complexity algorithm is developed to characterize the optimal real-time scheduling. Then, real-time scheduling over Markovian channels is investigated, where channel conditions are time-varying and online channel learning is necessary, and the optimal scheduling policies in different traffic regimes are studied. The second part focuses on the stochastic optimization and real-time scheduling involved in energy systems. First, risk-aware scheduling and dispatch for plug-in electric vehicles (EVs) are studied, aiming to jointly optimize the EV charging cost and the risk of the load mismatch between the forecasted and the actual EV loads, due to the random driving activities of EVs. Then, the integration of wind generation at high penetration levels into bulk power grids is considered. Joint optimization of economic dispatch and interruptible load management is investigated using short-term wind farm generation forecast. The third part studies stochastic optimization in distributed control systems under different network environments. First, distributed spectrum access in cognitive radio networks is investigated by using pricing approach, where primary users (PUs) sell the temporarily unused spectrum and secondary users compete via random access for such spectrum opportunities. The optimal pricing strategy for PUs and the corresponding distributed implementation of spectrum access control are developed to maximize the PU's revenue. Then, a systematic study of the nonconvex utility-based power control problem is presented under the physical interference model in ad-hoc networks. Distributed power control schemes are devised to maximize the system utility, by leveraging the extended duality theory and simulated annealing.

ContributorsYang, Lei (Author) / Zhang, Junshan (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Xue, Guoliang (Committee member) / Ying, Lei (Committee member) / Arizona State University (Publisher)

Created2012

Quantitative evaluation of control flow based soft error protection mechanisms

Description

Rapid technology scaling, the main driver of the power and performance improvements of computing solutions, has also rendered our computing systems extremely susceptible to transient errors called soft errors. Among the arsenal of techniques to protect computation from soft errors, Control Flow Checking (CFC) based techniques have gained a reputation…

Rapid technology scaling, the main driver of the power and performance improvements of computing solutions, has also rendered our computing systems extremely susceptible to transient errors called soft errors. Among the arsenal of techniques to protect computation from soft errors, Control Flow Checking (CFC) based techniques have gained a reputation of effective, yet low-cost protection mechanism. The basic idea is that, there is a high probability that a soft-fault in program execution will eventually alter the control flow of the program. Therefore just by making sure that the control flow of the program is correct, significant protection can be achieved. More than a dozen techniques for CFC have been developed over the last several decades, ranging from hardware techniques, software techniques, and hardware-software hybrid techniques as well. Our analysis shows that existing CFC techniques are not only ineffective in protecting from soft errors, but cause additional power and performance overheads. For this analysis, we develop and validate a simulation based experimental setup to accurately and quantitatively estimate the architectural vulnerability of a program execution on a processor micro-architecture. We model the protection achieved by various state-of-the-art CFC techniques in this quantitative vulnerability estimation setup, and find out that software only CFC protection schemes (CFCSS, CFCSS+NA, CEDA) increase system vulnerability by 18% to 21% with 17% to 38% performance overhead. Hybrid CFC protection (CFEDC) increases vulnerability by 5%, while the vulnerability remains almost the same for hardware only CFC protection (CFCET); notwithstanding the hardware overheads of design cost, area, and power incurred in the hardware modifications required for their implementations.

ContributorsRhisheekesan, Abhishek (Author) / Shrivastava, Aviral (Thesis advisor) / Colbourn, Charles Joseph (Committee member) / Wu, Carole-Jean (Committee member) / Arizona State University (Publisher)

Created2013

Ensuring safety of model-based generated code for pervasive health monitoring systems

Description

Wireless technologies for health monitoring systems have seen considerable interest in recent years owing to it's potential to achieve vision of pervasive healthcare, that is healthcare to anyone, anywhere and anytime. Development of wearable wireless medical devices which have the capability to sense, compute, and send physiological information to a…

Wireless technologies for health monitoring systems have seen considerable interest in recent years owing to it's potential to achieve vision of pervasive healthcare, that is healthcare to anyone, anywhere and anytime. Development of wearable wireless medical devices which have the capability to sense, compute, and send physiological information to a mobile gateway, forming a Body Sensor Network (BSN) is considered as a step towards achieving the vision of pervasive health monitoring systems (PHMS). PHMS consisting of wearable body sensors encourages unsupervised long-term monitoring, reducing frequent visit to hospital and nursing cost. Therefore, it is of utmost importance that operation of PHMS must be reliable, safe and have longer lifetime. A model-based automatic code generation provides a state-of-art code generation of sensor and smart phone code from high-level specification of a PHMS. Code generator intakes meta-model of PHMS specification, uses codebase containing code templates and algorithms, and generates platform specific code. Health-Dev, a framework for model-based development of PHMS, uses code generation to implement PHMS in sensor and smart phone. As a part of this thesis, model-based automatic code generation was evaluated and experimentally validated. The generated code was found to be safe in terms of ensuring no race condition, array, or pointer related errors in the generated code and more optimized as compared to hand-written BSN benchmark code in terms of lesser unreachable code.

ContributorsVerma, Sunit (Author) / Gupta, Sandeep (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Reisslein, Martin (Committee member) / Arizona State University (Publisher)

Created2013

Adaptive filter bank time-frequency representations

Description

A signal with time-varying frequency content can often be expressed more clearly using a time-frequency representation (TFR), which maps the signal into a two-dimensional function of time and frequency, similar to musical notation. The thesis reviews one of the most commonly used TFRs, the Wigner distribution (WD), and discusses its…

A signal with time-varying frequency content can often be expressed more clearly using a time-frequency representation (TFR), which maps the signal into a two-dimensional function of time and frequency, similar to musical notation. The thesis reviews one of the most commonly used TFRs, the Wigner distribution (WD), and discusses its application in Fourier optics: it is shown that the WD is analogous to the spectral dispersion that results from a diffraction grating, and time and frequency are similarly analogous to a one dimensional spatial coordinate and wavenumber. The grating is compared with a simple polychromator, which is a bank of optical filters. Another well-known TFR is the short time Fourier transform (STFT). Its discrete version can be shown to be equivalent to a filter bank, an array of bandpass filters that enable localized processing of the analysis signals in different sub-bands. This work proposes a signal-adaptive method of generating TFRs. In order to minimize distortion in analyzing a signal, the method modifies the filter bank to consist of non-overlapping rectangular bandpass filters generated using the Butterworth filter design process. The information contained in the resulting TFR can be used to reconstruct the signal, and perfect reconstruction techniques involving quadrature mirror filter banks are compared with a simple Fourier synthesis sum. The optimal filter parameters of the rectangular filters are selected adaptively by minimizing the mean-squared error (MSE) from a pseudo-reconstructed version of the analysis signal. The reconstruction MSE is proposed as an error metric for characterizing TFRs; a practical measure of the error requires normalization and cross correlation with the analysis signal. Simulations were performed to demonstrate the the effectiveness of the new adaptive TFR and its relation to swept-tuned spectrum analyzers.

ContributorsWeber, Peter C. (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Kovvali, Narayan (Committee member) / Arizona State University (Publisher)

Created2012

System-level synthesis of dataplane subsystems for MPSoCs

Description

In recent years we have witnessed a shift towards multi-processor system-on-chips (MPSoCs) to address the demands of embedded devices (such as cell phones, GPS devices, luxury car features, etc.). Highly optimized MPSoCs are well-suited to tackle the complex application demands desired by the end user customer. These MPSoCs incorporate a…

In recent years we have witnessed a shift towards multi-processor system-on-chips (MPSoCs) to address the demands of embedded devices (such as cell phones, GPS devices, luxury car features, etc.). Highly optimized MPSoCs are well-suited to tackle the complex application demands desired by the end user customer. These MPSoCs incorporate a constellation of heterogeneous processing elements (PEs) (general purpose PEs and application-specific integrated circuits (ASICS)). A typical MPSoC will be composed of a application processor, such as an ARM Coretex-A9 with cache coherent memory hierarchy, and several application sub-systems. Each of these sub-systems are composed of highly optimized instruction processors, graphics/DSP processors, and custom hardware accelerators. Typically, these sub-systems utilize scratchpad memories (SPM) rather than support cache coherency. The overall architecture is an integration of the various sub-systems through a high bandwidth system-level interconnect (such as a Network-on-Chip (NoC)). The shift to MPSoCs has been fueled by three major factors: demand for high performance, the use of component libraries, and short design turn around time. As customers continue to desire more and more complex applications on their embedded devices the performance demand for these devices continues to increase. Designers have turned to using MPSoCs to address this demand. By using pre-made IP libraries designers can quickly piece together a MPSoC that will meet the application demands of the end user with minimal time spent designing new hardware. Additionally, the use of MPSoCs allows designers to generate new devices very quickly and thus reducing the time to market. In this work, a complete MPSoC synthesis design flow is presented. We first present a technique \cite{leary1_intro} to address the synthesis of the interconnect architecture (particularly Network-on-Chip (NoC)). We then address the synthesis of the memory architecture of a MPSoC sub-system \cite{leary2_intro}. Lastly, we present a co-synthesis technique to generate the functional and memory architectures simultaneously. The validity and quality of each synthesis technique is demonstrated through extensive experimentation.

ContributorsLeary, Glenn (Author) / Chatha, Karamvir S (Thesis advisor) / Vrudhula, Sarma (Committee member) / Shrivastava, Aviral (Committee member) / Beraha, Rudy (Committee member) / Arizona State University (Publisher)

Created2013

Distributed inference using bounded transmissions

Description

Distributed inference has applications in a wide range of fields such as source localization, target detection, environment monitoring, and healthcare. In this dissertation, distributed inference schemes which use bounded transmit power are considered. The performance of the proposed schemes are studied for a variety of inference problems. In the first…

Distributed inference has applications in a wide range of fields such as source localization, target detection, environment monitoring, and healthcare. In this dissertation, distributed inference schemes which use bounded transmit power are considered. The performance of the proposed schemes are studied for a variety of inference problems. In the first part of the dissertation, a distributed detection scheme where the sensors transmit with constant modulus signals over a Gaussian multiple access channel is considered. The deflection coefficient of the proposed scheme is shown to depend on the characteristic function of the sensing noise, and the error exponent for the system is derived using large deviation theory. Optimization of the deflection coefficient and error exponent are considered with respect to a transmission phase parameter for a variety of sensing noise distributions including impulsive ones. The proposed scheme is also favorably compared with existing amplify-and-forward (AF) and detect-and-forward (DF) schemes. The effect of fading is shown to be detrimental to the detection performance and simulations are provided to corroborate the analytical results. The second part of the dissertation studies a distributed inference scheme which uses bounded transmission functions over a Gaussian multiple access channel. The conditions on the transmission functions under which consistent estimation and reliable detection are possible is characterized. For the distributed estimation problem, an estimation scheme that uses bounded transmission functions is proved to be strongly consistent provided that the variance of the noise samples are bounded and that the transmission function is one-to-one. The proposed estimation scheme is compared with the amplify and forward technique and its robustness to impulsive sensing noise distributions is highlighted. It is also shown that bounded transmissions suffer from inconsistent estimates if the sensing noise variance goes to infinity. For the distributed detection problem, similar results are obtained by studying the deflection coefficient. Simulations corroborate our analytical results. In the third part of this dissertation, the problem of estimating the average of samples distributed at the nodes of a sensor network is considered. A distributed average consensus algorithm in which every sensor transmits with bounded peak power is proposed. In the presence of communication noise, it is shown that the nodes reach consensus asymptotically to a finite random variable whose expectation is the desired sample average of the initial observations with a variance that depends on the step size of the algorithm and the variance of the communication noise. The asymptotic performance is characterized by deriving the asymptotic covariance matrix using results from stochastic approximation theory. It is shown that using bounded transmissions results in slower convergence compared to the linear consensus algorithm based on the Laplacian heuristic. Simulations corroborate our analytical findings. Finally, a robust distributed average consensus algorithm in which every sensor performs a nonlinear processing at the receiver is proposed. It is shown that non-linearity at the receiver nodes makes the algorithm robust to a wide range of channel noise distributions including the impulsive ones. It is shown that the nodes reach consensus asymptotically and similar results are obtained as in the case of transmit non-linearity. Simulations corroborate our analytical findings and highlight the robustness of the proposed algorithm.

ContributorsDasarathan, Sivaraman (Author) / Tepedelenlioğlu, Cihan (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Reisslein, Martin (Committee member) / Goryll, Michael (Committee member) / Arizona State University (Publisher)

Created2013

Compiler and runtime for memory management on software managed manycore processors

Description

We are expecting hundreds of cores per chip in the near future. However, scaling the memory architecture in manycore architectures becomes a major challenge. Cache coherence provides a single image of memory at any time in execution to all the cores, yet coherent cache architectures are believed will not scale…

We are expecting hundreds of cores per chip in the near future. However, scaling the memory architecture in manycore architectures becomes a major challenge. Cache coherence provides a single image of memory at any time in execution to all the cores, yet coherent cache architectures are believed will not scale to hundreds and thousands of cores. In addition, caches and coherence logic already take 20-50% of the total power consumption of the processor and 30-60% of die area. Therefore, a more scalable architecture is needed for manycore architectures. Software Managed Manycore (SMM) architectures emerge as a solution. They have scalable memory design in which each core has direct access to only its local scratchpad memory, and any data transfers to/from other memories must be done explicitly in the application using Direct Memory Access (DMA) commands. Lack of automatic memory management in the hardware makes such architectures extremely power-efficient, but they also become difficult to program. If the code/data of the task mapped onto a core cannot fit in the local scratchpad memory, then DMA calls must be added to bring in the code/data before it is required, and it may need to be evicted after its use. However, doing this adds a lot of complexity to the programmer's job. Now programmers must worry about data management, on top of worrying about the functional correctness of the program - which is already quite complex. This dissertation presents a comprehensive compiler and runtime integration to automatically manage the code and data of each task in the limited local memory of the core. We firstly developed a Complete Circular Stack Management. It manages stack frames between the local memory and the main memory, and addresses the stack pointer problem as well. Though it works, we found we could further optimize the management for most cases. Thus a Smart Stack Data Management (SSDM) is provided. In this work, we formulate the stack data management problem and propose a greedy algorithm for the same. Later on, we propose a general cost estimation algorithm, based on which CMSM heuristic for code mapping problem is developed. Finally, heap data is dynamic in nature and therefore it is hard to manage it. We provide two schemes to manage unlimited amount of heap data in constant sized region in the local memory. In addition to those separate schemes for different kinds of data, we also provide a memory partition methodology.

ContributorsBai, Ke (Author) / Shrivastava, Aviral (Thesis advisor) / Chatha, Karamvir (Committee member) / Xue, Guoliang (Committee member) / Chakrabarti, Chaitali (Committee member) / Arizona State University (Publisher)

Created2014

Smart compilers for reliable and power-efficient embedded computing

Description

Thanks to continuous technology scaling, intelligent, fast and smaller digital systems are now available at affordable costs. As a result, digital systems have found use in a wide range of application areas that were not even imagined before, including medical (e.g., MRI, remote or post-operative monitoring devices, etc.), automotive (e.g.,…

Thanks to continuous technology scaling, intelligent, fast and smaller digital systems are now available at affordable costs. As a result, digital systems have found use in a wide range of application areas that were not even imagined before, including medical (e.g., MRI, remote or post-operative monitoring devices, etc.), automotive (e.g., adaptive cruise control, anti-lock brakes, etc.), security systems (e.g., residential security gateways, surveillance devices, etc.), and in- and out-of-body sensing (e.g., capsule swallowed by patients measuring digestive system pH, heart monitors, etc.). Such computing systems, which are completely embedded within the application, are called embedded systems, as opposed to general purpose computing systems. In the design of such embedded systems, power consumption and reliability are indispensable system requirements. In battery operated portable devices, the battery is the single largest factor contributing to device cost, weight, recharging time, frequency and ultimately its usability. For example, in the Apple iPhone 4 smart-phone, the battery is $40\%$ of the device weight, occupies $36\%$ of its volume and allows only $7$ hours (over 3G) of talk time. As embedded systems find use in a range of sensitive applications, from bio-medical applications to safety and security systems, the reliability of the computations performed becomes a crucial factor. At our current technology-node, portable embedded systems are prone to expect failures due to soft errors at the rate of once-per-year; but with aggressive technology scaling, the rate is predicted to increase exponentially to once-per-hour. Over the years, researchers have been successful in developing techniques, implemented at different layers of the design-spectrum, to improve system power efficiency and reliability. Among the layers of design abstraction, I observe that the interface between the compiler and processor micro-architecture possesses a unique potential for efficient design optimizations. A compiler designer is able to observe and analyze the application software at a finer granularity; while the processor architect analyzes the system output (power, performance, etc.) for each executed instruction. At the compiler micro-architecture interface, if the system knowledge at the two design layers can be integrated, design optimizations at the two layers can be modified to efficiently utilize available resources and thereby achieve appreciable system-level benefits. To this effect, the thesis statement is that, ``by merging system design information at the compiler and micro-architecture design layers, smart compilers can be developed, that achieve reliable and power-efficient embedded computing through: i) Pure compiler techniques, ii) Hybrid compiler micro-architecture techniques, and iii) Compiler-aware architectures''. In this dissertation demonstrates, through contributions in each of the three compiler-based techniques, the effectiveness of smart compilers in achieving power-efficiency and reliability in embedded systems.

ContributorsJeyapaul, Reiley (Author) / Shrivastava, Aviral (Thesis advisor) / Vrudhula, Sarma (Committee member) / Clark, Lawrence (Committee member) / Colbourn, Charles (Committee member) / Arizona State University (Publisher)

Created2012

Threshold logic properties and methods: applications to post-CMOS design automation and gene regulation modeling

Description

Threshold logic has been studied by at least two independent group of researchers. One group of researchers studied threshold logic with the intention of building threshold logic circuits. The earliest research to this end was done in the 1960's. The major work at that time focused on studying mathematical properties…

Threshold logic has been studied by at least two independent group of researchers. One group of researchers studied threshold logic with the intention of building threshold logic circuits. The earliest research to this end was done in the 1960's. The major work at that time focused on studying mathematical properties of threshold logic as no efficient circuit implementations of threshold logic were available. Recently many post-CMOS (Complimentary Metal Oxide Semiconductor) technologies that implement threshold logic have been proposed along with efficient CMOS implementations. This has renewed the effort to develop efficient threshold logic design automation techniques. This work contributes to this ongoing effort. Another group studying threshold logic did so, because the building block of neural networks - the Perceptron, is identical to the threshold element implementing a threshold function. Neural networks are used for various purposes as data classifiers. This work contributes tangentially to this field by proposing new methods and techniques to study and analyze functions implemented by a Perceptron After completion of the Human Genome Project, it has become evident that most biological phenomenon is not caused by the action of single genes, but due to the complex interaction involving a system of genes. In recent times, the `systems approach' for the study of gene systems is gaining popularity. Many different theories from mathematics and computer science has been used for this purpose. Among the systems approaches, the Boolean logic gene model has emerged as the current most popular discrete gene model. This work proposes a new gene model based on threshold logic functions (which are a subset of Boolean logic functions). The biological relevance and utility of this model is argued illustrated by using it to model different in-vivo as well as in-silico gene systems.

ContributorsLinge Gowda, Tejaswi (Author) / Vrudhula, Sarma (Thesis advisor) / Shrivastava, Aviral (Committee member) / Chatha, Karamvir (Committee member) / Kim, Seungchan (Committee member) / Arizona State University (Publisher)

Created2012

DSP algorithm and software development on the iPhone/iPad platform

Description

The ease of use of mobile devices and tablets by students has generated a lot of interest in the area of engineering education. By using mobile technologies in signal analysis and applied mathematics, undergraduate-level courses can broaden the scope and effectiveness of technical education in classrooms. The current mobile devices…

The ease of use of mobile devices and tablets by students has generated a lot of interest in the area of engineering education. By using mobile technologies in signal analysis and applied mathematics, undergraduate-level courses can broaden the scope and effectiveness of technical education in classrooms. The current mobile devices have abundant memory and powerful processors, in addition to providing interactive interfaces. Therefore, these devices can support the implementation of non-trivial signal processing algorithms. Several existing visual programming environments such as Java Digital Signal Processing (J-DSP), are built using the platform-independent infrastructure of Java applets. These enable students to perform signal-processing exercises over the Internet. However, some mobile devices do not support Java applets. Furthermore, mobile simulation environments rely heavily on establishing robust Internet connections with a remote server where the processing is performed. The interactive Java Digital Signal Processing tool (iJDSP) has been developed as graphical mobile app on iOS devices (iPads, iPhones and iPod touches). In contrast to existing mobile applications, iJDSP has the ability to execute simulations directly on the mobile devices, and is a completely stand-alone application. In addition to a substantial set of signal processing algorithms, iJDSP has a highly interactive graphical interface where block diagrams can be constructed using a simple drag-n-drop procedure. Functions such as visualization of the convolution operation, and an interface to wireless sensors have been developed. The convolution module animates the process of the continuous and discrete convolution operations, including time-shift and integration, so that users can observe and learn, intuitively. The current set of DSP functions in the application enables students to perform simulation exercises on continuous and discrete convolution, z-transform, filter design and the Fast Fourier Transform (FFT). The interface to wireless sensors in iJDSP allows users to import data from wireless sensor networks, and use the rich suite of functions in iJDSP for data processing. This allows users to perform operations such as localization, activity detection and data fusion. The exercises and the iJDSP application were evaluated by senior-level students at Arizona State University (ASU), and the results of those assessments are analyzed and reported in this thesis.

ContributorsHu, Shuang (Author) / Spanias, Andreas (Thesis advisor) / Tsakalis, Kostas (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Arizona State University (Publisher)

Created2012