Motor learning is the process of improving task execution according to some measure of performance. It can be divided into skill learning, a model-free process, and adaptation, a model-based process. Prior studies have indicated that adaptation results from two complementary learning systems with parallel organization. This report asks whether a similar interaction leads to savings, a model-free process described as faster relearning when experiencing something familiar. This was tested in a two-week reaching task conducted on a robotic arm capable of perturbing movements. The task was designed so that the two sessions differed in their history of errors. Savings were quantified at various points by measuring the change in the learning rate. The results showed that the history of errors successfully modulated savings, which supports the notion that the two complementary systems interact to develop savings. Additionally, this report was part of a larger study that will explore the organizational structure of the complementary systems as well as the neural basis of this motor learning.
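As a rough illustration of how a change in learning rate could be turned into a savings estimate, the sketch below fits an exponential decay to the movement errors of each session and takes the difference of the fitted rates. The exponential error model, the function names `learning_rate` and `savings`, and the synthetic error values are all assumptions made for this example; the abstract does not state how the learning rate was actually computed.

```python
import math

def learning_rate(errors):
    """Estimate k in error(t) ~ a * exp(-k * t) with a log-linear least-squares fit.

    Assumes strictly positive errors, one value per trial (hypothetical model).
    """
    n = len(errors)
    trials = range(n)
    logs = [math.log(e) for e in errors]
    mean_t = sum(trials) / n
    mean_y = sum(logs) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(trials, logs))
    var = sum((t - mean_t) ** 2 for t in trials)
    return -cov / var  # the slope of log(error) against trial is -k

def savings(initial_errors, relearning_errors):
    """Savings as the increase in fitted learning rate from first exposure to re-exposure."""
    return learning_rate(relearning_errors) - learning_rate(initial_errors)

# Synthetic data for illustration only: errors decay faster in the second session.
first_session = [30.0 * math.exp(-0.10 * t) for t in range(20)]
second_session = [30.0 * math.exp(-0.25 * t) for t in range(20)]

print(savings(first_session, second_session))
```

A positive value indicates faster relearning in the second session. Real reaching data are noisy and may contain errors near zero, so a nonlinear fit or a different rate measure would likely be more robust than this log-linear shortcut.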
This dissertation begins by proposing a new flexible policy iteration (FPI) framework. To improve sample efficiency, FPI can use either an on-policy or an off-policy learning strategy, can learn from either online or offline data, and can even adopt the existing knowledge of an external critic. Approximate convergence to Bellman optimal solutions is guaranteed under mild conditions. Simulation studies validated that FPI was data efficient compared to several established RL methods. Furthermore, a simplified version of FPI was implemented to learn from offline data, and the learned policy was then successfully tested for tuning the control parameters online on a human subject.
Next, the dissertation discusses RL control with information transfer (RL-IT), or knowledge-guided RL (KG-RL), which aims to benefit from transferring knowledge acquired from one subject to another. To explore its feasibility, knowledge was extracted from data measurements of able-bodied (AB) subjects and transferred to guide Q-learning control for an amputee in OpenSim simulations. This result again demonstrated that data and time efficiency were improved using previous knowledge.
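One simple way to picture knowledge transfer in Q-learning is to warm-start a learner with a Q-table obtained elsewhere. The sketch below does this on a toy five-state chain. It is not the dissertation's RL-IT method: the environment `chain_step`, the `prior_q` argument, and all hyperparameters are hypothetical choices for illustration, and the source and target tasks are identical here purely to keep the example short.

```python
import random

def q_learning(n_states, n_actions, step, episodes, prior_q=None,
               alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=50, seed=0):
    """Tabular Q-learning; if prior_q is given, start from a copy of it instead of zeros."""
    rng = random.Random(seed)
    if prior_q is None:
        q = [[0.0] * n_actions for _ in range(n_states)]
    else:
        q = [row[:] for row in prior_q]  # copy so the source table is not modified
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)  # explore
            else:
                best = max(q[state])
                # break ties at random so an all-zero table still explores
                action = rng.choice([a for a in range(n_actions) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

def chain_step(state, action):
    """Toy chain: action 1 moves right, action 0 moves left; reaching state 4 pays reward 1."""
    next_state = min(state + 1, 4) if action == 1 else max(state - 1, 0)
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done

# "Source" learner trained from scratch, standing in for previously acquired knowledge.
source_q = q_learning(5, 2, chain_step, episodes=200)
# "Target" learner starts from the source table and needs only a few more episodes.
target_q = q_learning(5, 2, chain_step, episodes=5, prior_q=source_q, seed=1)
```

In a realistic setting the source and target dynamics would differ, so the transferred table would serve as an initial guess that further learning refines rather than as a finished policy.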
While the present study is new and promising, there are still many open questions to be addressed in future research. To account for human adaptation, the learning control objective function may be designed to incorporate human-prosthesis performance feedback such as symmetry, user comfort level and satisfaction, and user energy consumption. To make RL-based control parameter tuning practical in real life, it should be further developed and tested in different use environments, such as from level-ground walking to stair ascent or descent, and from walking to running.