Matching Items (6)
Description
Learning by trial and error requires retrospective information about whether a past action resulted in a rewarded outcome. The previous outcome may in turn provide information to guide future behavioral adjustment. But the specific contribution of this information to learning a task, and the neural representations during the trial-and-error learning process, are not well understood. In this dissertation, such learning is analyzed by means of single-unit neural recordings in the rats' motor agranular medial (AGm) and agranular lateral (AGl) cortices while the rats learned to perform a directional choice task. Multichannel chronic recordings using microelectrodes implanted in the rat's brain were essential to this study. Moreover, for fundamental scientific investigations in general and for some applications such as brain-machine interfaces, the recorded neural waveforms must first be analyzed to identify neural action potentials as basic computing units. Prior to analyzing and modeling the recorded neural signals, this dissertation proposes an advanced spike sorting system, the M-Sorter, to extract the action potentials from raw neural waveforms. The M-Sorter performs comparably to or better than two other popular spike sorters in automatic mode. With the sorted action potentials in place, neuronal activity in the AGm and AGl areas in rats during learning of a directional choice task is examined. Systematic analyses suggest that the rats' neural activity in AGm and AGl was modulated by previous trial outcomes during learning. Single-unit-based neural dynamics during task learning are described in detail in the dissertation. Furthermore, neural modulation was compared between fast- and slow-learning rats. The results show that the level of neural modulation by previous trial outcome differs between fast- and slow-learning rats, which may in turn suggest an important role for previous trial outcome encoding in learning.
Contributors: Yuan, Yu'an (Author) / Si, Jennie (Thesis advisor) / Buneo, Christopher (Committee member) / Santello, Marco (Committee member) / Chae, Junseok (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
Animals learn to choose a proper action among alternatives according to the circumstances. Through trial and error, animals improve their odds by making correct associations between their behavioral choices and external stimuli. While there is an extensive literature on the theory of learning, it is still unclear how individual neurons and a neural network adapt as learning progresses. In this dissertation, single units in the medial and lateral agranular (AGm and AGl) cortices were recorded as rats learned a directional choice task. The task required the rat to press the left or right lever when a light cue appeared on the corresponding side of the interface panel. Behavior analysis showed that the rats' movement parameters during performance of directional choices became stereotyped very quickly (2-3 days), while learning to solve the directional choice problem took weeks. The entire learning process was further divided into three stages, each with a similar number of recording sessions (days). Single-unit-based firing rate analysis revealed that 1) directional rate modulation was observed in both cortices; 2) the averaged mean rate between left and right trials in the neural ensemble each day did not change significantly among the three learning stages; 3) the rate difference between left and right trials of the ensemble did not change significantly either. In addition, for either left or right trials, the trial-to-trial firing variability of single neurons did not change significantly over the three stages. To explore the spatiotemporal neural pattern of the recorded ensemble, support vector machines (SVMs) were constructed each day to decode the direction of choice in single trials. Improved classification accuracy indicated enhanced discriminability between neural patterns of left and right choices as learning progressed. 
When a restricted Boltzmann machine (RBM) model was used to extract features from neural activity patterns, the results further supported the idea that neural firing patterns adapted during the three learning stages to facilitate the neural codes of directional choices. Taken together, these findings suggest a spatiotemporal neural coding scheme in a rat AGl and AGm neural ensemble that may be responsible for and contribute to learning the directional choice task.
Contributors: Mao, Hongwei (Author) / Si, Jennie (Thesis advisor) / Buneo, Christopher (Committee member) / Cao, Yu (Committee member) / Santello, Marco (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
Peripheral Vascular Disease (PVD) is a debilitating chronic disease of the lower extremities, particularly affecting older adults and diabetics. It reduces blood flow to peripheral tissue and sometimes causes tissue damage, such that PVD patients suffer from pain in the lower legs, thighs, and buttocks after activities. Electrical neurostimulation based on the "Gate Theory of Pain" is a known way to reduce pain, but current devices for this purpose are bulky and not well suited to implantation in peripheral tissues. There is also an increased risk associated with surgery, which limits the use of these devices. This research has designed and constructed wireless ultrasound-powered microstimulators that are much smaller and injectable and so involve less implantation trauma. These devices are small enough to fit through an 18-gauge syringe needle, increasing their potential for clinical use. These piezoelectric microdevices convert mechanical energy into electrical energy that is then used to block pain. The design and performance of these miniaturized devices were modeled by computer, while constructed devices were evaluated in animal experiments. The devices are capable of producing 500 ms pulses with an intensity of 2 mA into a 2 kilo-ohm load. Using the rat as an animal model, a series of experiments was conducted to evaluate the in vivo performance of the devices.
Contributors: Zong, Xi (Author) / Towe, Bruce (Thesis advisor) / Kleim, Jeffrey (Committee member) / Santello, Marco (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
Robotic lower limb prostheses provide new opportunities to help transfemoral amputees regain mobility. However, their application is impeded by the fact that the impedance control parameters need to be tuned and optimized manually by prosthetists for each individual user in different task environments. Reinforcement learning (RL) is capable of learning automatically from interaction with the environment, which makes it a natural candidate to replace human prosthetists in customizing the control parameters. However, neither traditional RL approaches nor the popular deep RL approaches are readily suitable for learning from a limited number of samples with large variations. This dissertation aims to explore new RL-based adaptive solutions that are data-efficient for controlling robotic prostheses.

This dissertation begins by proposing a new flexible policy iteration (FPI) framework. To improve sample efficiency, FPI can use either an on-policy or an off-policy learning strategy, can learn from either online or offline data, and can even adopt existing knowledge of an external critic. Approximate convergence to Bellman optimal solutions is guaranteed under mild conditions. Simulation studies validated that FPI was data-efficient compared to several established RL methods. Furthermore, a simplified version of FPI was implemented to learn from offline data, and the learned policy was then successfully tested for tuning the control parameters online on a human subject.

Next, the dissertation discusses RL control with information transfer (RL-IT), or knowledge-guided RL (KG-RL), which is motivated by the benefit of transferring knowledge acquired from one subject to another. To explore its feasibility, knowledge was extracted from data measurements of able-bodied (AB) subjects and transferred to guide Q-learning control for an amputee in OpenSim simulations. This result again demonstrated that data and time efficiency were improved by using previous knowledge.

While the present study is new and promising, there are still many open questions to be addressed in future research. To account for human adaptation, the learning control objective function may be designed to incorporate human-prosthesis performance feedback such as symmetry, user comfort level and satisfaction, and user energy consumption. To make RL-based control parameter tuning practical in real life, it should be further developed and tested in different use environments, such as from level-ground walking to stair ascent or descent, and from walking to running.
Contributors: Gao, Xiang (Author) / Si, Jennie (Thesis advisor) / Huang, He Helen (Committee member) / Santello, Marco (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)
Created: 2020
Description
According to a Centers for Disease Control and Prevention report, approximately 29,668 United States residents older than 65 years died as a result of a fall in 2016. Other injuries, such as wrist fractures, hip fractures, and head injuries, also occur as a result of falls. Certain groups of people are more prone to falls than others, one such group being individuals with stroke. The two most common issues among individuals with stroke are ankle weakness and foot drop, both of which contribute to falls. To mitigate this issue, the most popular clinical remedy given to these users is a thermoplastic ankle-foot orthosis (AFO). These AFOs help improve gait velocity, stride length, and cadence. However, studies have shown that a continuous restraint on the ankle harms the compensatory stepping response and forward propulsion. Previous studies have shown that compensatory stepping and forward propulsion are crucial for the user's ability to recover from postural perturbations. Hence, there is a need for active devices that can supply plantarflexion during push-off and dorsiflexion during the swing phase of gait. Although advancements in orthotic research have shown major improvements in supporting the ankle joint for rehabilitation, there is a lack of available active devices that can help impaired users in daily activities. In this study, our primary focus is to build an unobtrusive, cost-effective, and easy-to-wear active device for gait rehabilitation and fall prevention in individuals who are at risk. The device will use a double-acting cylinder that can be easily incorporated into the user's footwear using a novel custom-designed powered ankle brace. The device will use Inertial Measurement Units to measure kinematic parameters of the lower body and a custom control algorithm to actuate the device based on the measurements. 
The study can be used to advance the field of gait assistance, rehabilitation, and potentially fall prevention for individuals with lower-limb impairments through the use of an active ankle-foot orthosis.
Contributors: Ray, Sambarta (Author) / Honeycutt, Claire (Thesis advisor) / Dasarathy, Gautam (Thesis advisor) / Redkar, Sangram (Committee member) / Jayasuriya, Suren (Committee member) / Arizona State University (Publisher)
Created: 2020
Description
This dissertation focuses on reinforcement learning (RL) controller design aimed at real-life applications in continuous state and control problems. It involves three major research investigations covering design, analysis, implementation, and evaluation. The application case addresses automatically configuring robotic prosthesis impedance parameters. Major contributions of the dissertation include the following. 1) An “echo control” using the intact knee profile as the target is designed to overcome the limitation of a designer-prescribed robotic knee profile. 2) Collaborative multiagent reinforcement learning (cMARL) is proposed to directly take into account human influence in the robot control design. 3) A phased actor in actor-critic (PAAC) reinforcement learning method is developed to reduce learning variance in RL. The design of the “echo control” is based on a new formulation of direct heuristic dynamic programming (dHDP) for tracking control of a robotic knee prosthesis to mimic the intact knee profile. A systematic simulation of the proposed control is provided using a human-robot system simulation in OpenSim. The tracking controller is then tested on able-bodied and amputee subjects. This is the first real-time human testing of RL tracking control of a robotic knee to mirror the profile of an intact knee. The cMARL is a new solution framework for the human-prosthesis collaboration (HPC) problem. This is the first attempt at considering human influence on human-robot walking in the presence of a reinforcement-learning-controlled lower limb prosthesis. Results show that treating the human and robot as coupled and collaborating agents and using an estimated human adaptation in robot control design help improve human walking performance. The above studies have demonstrated the great potential of RL control in solving continuous problems. 
In more complex real-life tasks with multiple control inputs and high-dimensional state spaces, high variance, low data efficiency, slow learning, or even instability are major roadblocks to be addressed. A novel PAAC method is proposed to improve learning performance in policy gradient RL by accounting for both the Q value and the TD error in actor updates. Systematic and comprehensive demonstrations in the DeepMind Control Suite show its effectiveness through qualitative analysis and quantitative evaluation.
Contributors: Wu, Ruofan (Author) / Si, Jennie (Thesis advisor) / Huang, He (Committee member) / Santello, Marco (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)
Created: 2023