This collection includes both ASU theses and dissertations, submitted by graduate students, and Barrett, The Honors College theses, submitted by undergraduate students.

Description
Learning by trial and error requires retrospective information about whether a past action resulted in a rewarded outcome. The previous outcome may in turn provide information to guide future behavioral adjustment. But the specific contribution of this information to learning a task, and the neural representations during the trial-and-error learning process, are not well understood. In this dissertation, such learning is analyzed by means of single-unit neural recordings in the rats' agranular medial (AGm) and agranular lateral (AGl) motor areas while the rats learned to perform a directional choice task. Multichannel chronic recordings using microelectrodes implanted in the rat's brain were essential to this study. Moreover, for fundamental scientific investigations in general, and for applications such as brain-machine interfaces, the recorded neural waveforms must first be analyzed to identify neural action potentials as basic computing units. Prior to analyzing and modeling the recorded neural signals, this dissertation proposes an advanced spike sorting system, the M-Sorter, to extract the action potentials from raw neural waveforms. The M-Sorter performs comparably to or better than two other popular spike sorters in automatic mode. With the sorted action potentials in place, neuronal activity in the AGm and AGl areas in rats during learning of a directional choice task is examined. Systematic analyses suggest that the rats' neural activity in AGm and AGl was modulated by previous trial outcomes during learning. Single-unit-based neural dynamics during task learning are described in detail in the dissertation. Furthermore, neural modulation is compared between fast- and slow-learning rats. The results show that the level of neural modulation by previous trial outcome differs between fast- and slow-learning rats, which may in turn suggest an important role for previous-trial-outcome encoding in learning.
Contributors: Yuan, Yu'an (Author) / Si, Jennie (Thesis advisor) / Buneo, Christopher (Committee member) / Santello, Marco (Committee member) / Chae, Junseok (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
Animals learn to choose a proper action among alternatives according to the circumstances. Through trial and error, animals improve their odds by making correct associations between their behavioral choices and external stimuli. While there is an extensive literature on the theory of learning, it is still unclear how individual neurons and a neural network adapt as learning progresses. In this dissertation, single units in the medial and lateral agranular (AGm and AGl) cortices were recorded as rats learned a directional choice task. The task required the rat to press the left or right lever when a light cue appeared on the left or right side of the interface panel, respectively. Behavior analysis showed that the rats' movement parameters during performance of directional choices became stereotyped very quickly (2-3 days), while learning to solve the directional choice problem took weeks. The entire learning process was further broken down into three stages, each having a similar number of recording sessions (days). Single-unit-based firing rate analysis revealed that 1) directional rate modulation was observed in both cortices; 2) the averaged mean rate between left and right trials in the neural ensemble each day did not change significantly among the three learning stages; 3) the rate difference between left and right trials of the ensemble did not change significantly either. In addition, for either left or right trials, the trial-to-trial firing variability of single neurons did not change significantly over the three stages. To explore the spatiotemporal neural pattern of the recorded ensemble, support vector machines (SVMs) were constructed each day to decode the direction of choice in single trials. Improved classification accuracy indicated enhanced discriminability between neural patterns of left and right choices as learning progressed.
When a restricted Boltzmann machine (RBM) model was used to extract features from neural activity patterns, the results further supported the idea that neural firing patterns adapted during the three learning stages to facilitate the neural codes of directional choices. Taken together, these findings suggest a spatiotemporal neural coding scheme in a rat AGl and AGm neural ensemble that may be responsible for, and contribute to, learning the directional choice task.
Contributors: Mao, Hongwei (Author) / Si, Jennie (Thesis advisor) / Buneo, Christopher (Committee member) / Cao, Yu (Committee member) / Santello, Marco (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
The ability to plan, execute, and control goal-oriented reaching and grasping movements is among the most essential functions of the brain. Yet these movements are inherently variable, a result of the noise pervading the neural signals underlying sensorimotor processing. The specific influences and interactions of these noise processes remain unclear. Thus, several studies were performed to elucidate the role and influence of sensorimotor noise on movement variability. The first study focuses on sensory integration and movement planning across the reaching workspace. An experiment was designed to examine the relative contributions of vision and proprioception to movement planning by measuring the rotation of the initial movement direction induced by a perturbation of the visual feedback prior to movement onset. The results suggest that the contribution of vision was relatively consistent across the evaluated workspace depths; however, the influence of vision differed between the vertical and lateral axes, indicating that additional factors beyond vision and proprioception influence movement planning of 3-dimensional movements. Whereas the first study investigated the role of noise in sensorimotor integration, the second and third studies investigate the relative influence of sensorimotor noise on reaching performance. Specifically, they evaluate how the characteristics of neural processing that underlie movement planning and execution manifest in movement variability during natural reaching. Subjects performed reaching movements with and without visual feedback throughout the movement, and the patterns of endpoint variability were compared across movement directions. The results of these studies suggest a primary role of visual feedback noise in shaping patterns of variability and in determining the relative influence of planning- and execution-related noise sources.
The final work considers a computational approach to characterizing how sensorimotor processes interact to shape movement variability. A model of multi-modal feedback control was developed to simulate the interaction of planning and execution noise on reaching variability. The model predictions suggest that anisotropic properties of feedback noise significantly affect the relative influence of planning and execution noise on patterns of reaching variability.
Contributors: Apker, Gregory Allen (Author) / Buneo, Christopher A (Thesis advisor) / Helms Tillery, Stephen (Committee member) / Santello, Marco (Committee member) / Santos, Veronica (Committee member) / Si, Jennie (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
Brain-machine interfaces (BMIs) were first imagined as a technology that would allow subjects to have direct communication with prosthetics and external devices (e.g. control over a computer cursor or robotic arm movement). Operation of these devices was not automatic, and subjects needed calibration and training in order to master this control. In short, learning became a key component in controlling these systems. As a result, BMIs have become ideal tools to probe and explore brain activity, since they allow the isolation of neural inputs and systematic altering of the relationships between the neural signals and output. I have used BMIs to explore the process of brain adaptability in a motor-like task. To this end, I trained non-human primates to control a 3D cursor and adapt to two different perturbations: a visuomotor rotation, uniform across the neural ensemble, and a decorrelation task, which non-uniformly altered the relationship between the activity of particular neurons in an ensemble and movement output. I measured individual and population level changes in the neural ensemble as subjects honed their skills over the span of several days. I found some similarities in the adaptation process elicited by these two tasks. On one hand, individual neurons displayed tuning changes across the entire ensemble after task adaptation: most neurons displayed transient changes in their preferred directions, and most neuron pairs showed changes in their cross-correlations during the learning process. On the other hand, I also measured population level adaptation in the neural ensemble: the underlying neural manifolds that control these neural signals also had dynamic changes during adaptation. I have found that the neural circuits seem to apply an exploratory strategy when adapting to new tasks. 
My results suggest that information and trajectories in the neural space increase after the perturbations are first introduced and before the subject settles into workable solutions. These results provide new insights into both the underlying population-level processes in motor learning and the changes in neural coding that are necessary for subjects to learn to control neuroprosthetics. Understanding these mechanisms can help us create better control algorithms and design training paradigms that take advantage of these processes.
Contributors: Armenta Salas, Michelle (Author) / Helms Tillery, Stephen I (Thesis advisor) / Si, Jennie (Committee member) / Buneo, Christopher (Committee member) / Santello, Marco (Committee member) / Kleim, Jeffrey (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
Robotic lower limb prostheses provide new opportunities to help transfemoral amputees regain mobility. However, their application is impeded by the fact that the impedance control parameters must be tuned and optimized manually by prosthetists for each individual user in different task environments. Reinforcement learning (RL) is capable of learning automatically from interaction with the environment. It is therefore a natural candidate to replace human prosthetists in customizing the control parameters. However, neither traditional RL approaches nor the popular deep RL approaches are readily suitable for learning from a limited number of samples or from samples with large variations. This dissertation aims to explore new RL-based adaptive solutions that are data-efficient for controlling robotic prostheses.

This dissertation begins by proposing a new flexible policy iteration (FPI) framework. To improve sample efficiency, FPI can use either an on-policy or an off-policy learning strategy, can learn from either online or offline data, and can even adopt existing knowledge from an external critic. Approximate convergence to Bellman optimal solutions is guaranteed under mild conditions. Simulation studies validated that FPI was data-efficient compared to several established RL methods. Furthermore, a simplified version of FPI was implemented to learn from offline data, and the learned policy was then successfully tested for tuning the control parameters online on a human subject.

Next, the dissertation discusses RL control with information transfer (RL-IT), or knowledge-guided RL (KG-RL), which is motivated by the benefit of transferring knowledge acquired from one subject to another. To explore its feasibility, knowledge was extracted from data measurements of able-bodied (AB) subjects and transferred to guide Q-learning control for an amputee in OpenSim simulations. This result again demonstrated that data and time efficiency were improved by using previous knowledge.

While the present study is new and promising, there are still many open questions to be addressed in future research. To account for human adaptation, the learning control objective function may be designed to incorporate human-prosthesis performance feedback such as symmetry, user comfort level and satisfaction, and user energy consumption. To make RL-based control parameter tuning practical in real life, it should be further developed and tested in different use environments, such as from level-ground walking to stair ascending or descending, and from walking to running.
Contributors: Gao, Xiang (Author) / Si, Jennie (Thesis advisor) / Huang, He Helen (Committee member) / Santello, Marco (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)
Created: 2020
Description
This dissertation focuses on reinforcement learning (RL) controller design aimed at real-life applications in continuous state and control problems. It involves three major research investigations covering design, analysis, implementation, and evaluation. The application case addresses automatically configuring robotic prosthesis impedance parameters. Major contributions of the dissertation include the following. 1) An “echo control” using the intact knee profile as the target is designed to overcome the limitation of a designer-prescribed robotic knee profile. 2) Collaborative multiagent reinforcement learning (cMARL) is proposed to directly take into account human influence in the robot control design. 3) A phased actor in actor-critic (PAAC) reinforcement learning method is developed to reduce learning variance in RL. The design of the “echo control” is based on a new formulation of direct heuristic dynamic programming (dHDP) for tracking control of a robotic knee prosthesis to mimic the intact knee profile. A systematic simulation of the proposed control is provided using a human-robot system simulation in OpenSim. The tracking controller is then tested on able-bodied and amputee subjects. This is the first real-time human testing of RL tracking control of a robotic knee to mirror the profile of an intact knee. The cMARL is a new solution framework for the human-prosthesis collaboration (HPC) problem. This is the first attempt at considering human influence on human-robot walking in the presence of a lower limb prosthesis controlled by reinforcement learning. Results show that treating the human and robot as coupled and collaborating agents and using an estimated human adaptation in robot control design help improve human walking performance. The above studies have demonstrated the great potential of RL control in solving continuous problems.
In solving more complex real-life tasks with multiple control inputs and high-dimensional state spaces, high variance, low data efficiency, slow learning, and even instability are major roadblocks to be addressed. A novel PAAC method is proposed to improve learning performance in policy gradient RL by accounting for both the Q value and the TD error in actor updates. Systematic and comprehensive demonstrations show its effectiveness through qualitative analysis and quantitative evaluation in the DeepMind Control Suite.
Contributors: Wu, Ruofan (Author) / Si, Jennie (Thesis advisor) / Huang, He (Committee member) / Santello, Marco (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)
Created: 2023