Design of Reinforcement Learning Controllers with Application to Robotic Knee Tuning with Human in the Loop

Description
This dissertation focuses on reinforcement learning (RL) controller design for real-life applications with continuous state and control spaces. It comprises three major research investigations spanning design, analysis, implementation, and evaluation. The application case is the automatic configuration of robotic prosthesis impedance parameters. The major contributions are as follows. 1) An “echo control” that uses the intact knee profile as its target is designed to overcome the limitation of a designer-prescribed robotic knee profile. 2) Collaborative multiagent reinforcement learning (cMARL) is proposed to take human influence directly into account in the robot control design. 3) A phased actor in actor-critic (PAAC) reinforcement learning method is developed to reduce learning variance in RL. The “echo control” design is based on a new formulation of direct heuristic dynamic programming (dHDP) for tracking control, in which a robotic knee prosthesis mimics the intact knee profile. The proposed control is simulated systematically using a human-robot system simulation in OpenSim, and the tracking controller is then tested on able-bodied and amputee subjects. This is the first real-time human testing of RL tracking control of a robotic knee to mirror the profile of an intact knee. The cMARL is a new solution framework for the human-prosthesis collaboration (HPC) problem. It is the first attempt to consider human influence on human-robot walking in the presence of a lower limb prosthesis controlled by reinforcement learning. Results show that treating the human and the robot as coupled, collaborating agents, and using an estimate of human adaptation in the robot control design, helps improve human walking performance. These studies demonstrate the potential of RL control for solving continuous problems.
For more complex real-life tasks with multiple control inputs and high-dimensional state spaces, high variance, low data efficiency, slow learning, and even instability remain major roadblocks. A novel PAAC method is proposed to improve learning performance in policy gradient RL by accounting for both the Q value and the TD error in actor updates. Systematic and comprehensive demonstrations on the DeepMind Control Suite show its effectiveness through qualitative analysis and quantitative evaluation.
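The idea of using both the Q value and the TD error in the actor update can be illustrated with a small sketch. The abstract does not give the PAAC update rule, so everything below is an assumption made for illustration: the function names, the `phase` parameter, and the linear blend are hypothetical and are not the dissertation's actual method. Only the one-step TD error formula is standard.

```python
import numpy as np


def td_error(reward, q_now, q_next, gamma=0.99):
    """Standard one-step TD error: r + gamma * Q(s', a') - Q(s, a)."""
    return reward + gamma * q_next - q_now


def blended_actor_signal(q_now, delta, phase):
    """Hypothetical actor weighting signal (an assumption, not PAAC itself).

    phase = 0 uses only the Q value, phase = 1 uses only the TD error,
    and values in between mix the two linearly.
    """
    phase = float(np.clip(phase, 0.0, 1.0))
    return (1.0 - phase) * q_now + phase * delta
```

In a policy gradient setting, a signal of this kind would scale the gradient of the log-probability of the chosen action; how the two terms are actually combined and scheduled is specified in the dissertation, not here.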
Date Created
2023

Automatic segmentation of single neurons recorded by wide-field imaging using frequency domain features and clustering tree

Description
Recent experiments have shown that wide-field imaging at millimeter scale can record hundreds of neurons in the brains of behaving mice. Monitoring hundreds of individual neurons at a high frame rate provides a promising tool for discovering spatiotemporal features of large neural networks. However, processing such massive data sets is impossible without automated procedures. This thesis therefore aims to develop a new tool that automatically segments and tracks individual neurons. The method rests on two main ideas: feature extraction based on the power spectral density (PSD) of single-neuron temporal activity, and a clustering tree to separate overlapping cells. To address issues associated with high-resolution imaging of a large recording area, in-focus and out-of-focus areas are analyzed separately. A static segmentation with a fixed PSD threshold is applied to the in-focus field of view. A dynamic segmentation that compares each pixel's maximum PSD with that of its surrounding pixels is applied to the out-of-focus area. Both approaches help remove irrelevant background pixels. After potential single cells are detected, some of which appear in groups because cells overlap in the image, a hierarchical clustering algorithm is applied to separate them. The hierarchical clustering uses the correlation coefficient as a distance measure to group similar pixels into single cells, so that overlapping cells can be separated. The entire algorithm was tested on two real recordings, with the ground truth for each carefully determined by manual inspection. The results show high accuracy on the tested datasets, with the false positive error kept within an acceptable range. The results also indicate that the algorithm is robust when applied to different image sequences.
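The static PSD thresholding and the correlation-based hierarchical clustering described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names, the use of a plain periodogram, the average-linkage method, and the threshold values are all hypothetical choices. The dynamic segmentation for out-of-focus areas is not covered.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.signal import periodogram


def peak_psd_per_pixel(traces, fs):
    """Peak power spectral density of each pixel's time series.

    traces: array of shape (n_pixels, n_frames); fs: frame rate in Hz.
    The zero-frequency bin is skipped so a constant offset is ignored.
    """
    _, psd = periodogram(traces, fs=fs, axis=-1)
    return psd[:, 1:].max(axis=1)


def static_segmentation(traces, fs, psd_threshold):
    """Boolean mask of pixels whose peak PSD exceeds a fixed threshold
    (the threshold value is an assumption the user must choose)."""
    return peak_psd_per_pixel(traces, fs) > psd_threshold


def split_overlapping_cells(traces, max_distance=0.5):
    """Group pixels into cells with a clustering tree.

    Distance between two pixels is 1 - Pearson correlation of their
    traces; average linkage and max_distance are assumed settings.
    Returns one integer cluster label per pixel.
    """
    tree = linkage(traces, method="average", metric="correlation")
    return fcluster(tree, t=max_distance, criterion="distance")
```

A typical use would first call `static_segmentation` to discard background pixels, then pass only the traces of the retained pixels to `split_overlapping_cells`, so that pixels sharing the same temporal activity receive the same label.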
Date Created
2016