Matching Items (4)
Filtering by

Clear all filters

149913-Thumbnail Image.png
Description
One necessary condition for the two-pass risk premium estimator to be consistent and asymptotically normal is that the rank of the beta matrix in a proposed linear asset pricing model is full column. I first investigate the asymptotic properties of the risk premium estimators and the related t-test and

One necessary condition for the two-pass risk premium estimator to be consistent and asymptotically normal is that the rank of the beta matrix in a proposed linear asset pricing model is full column. I first investigate the asymptotic properties of the risk premium estimators and the related t-test and Wald test statistics when the full rank condition fails. I show that the beta risk of useless factors or multiple proxy factors for a true factor are priced more often than they should be at the nominal size in the asset pricing models omitting some true factors. While under the null hypothesis that the risk premiums of the true factors are equal to zero, the beta risk of the true factors are priced less often than the nominal size. The simulation results are consistent with the theoretical findings. Hence, the factor selection in a proposed factor model should not be made solely based on their estimated risk premiums. In response to this problem, I propose an alternative estimation of the underlying factor structure. Specifically, I propose to use the linear combination of factors weighted by the eigenvectors of the inner product of estimated beta matrix. I further propose a new method to estimate the rank of the beta matrix in a factor model. For this method, the idiosyncratic components of asset returns are allowed to be correlated both over different cross-sectional units and over different time periods. The estimator I propose is easy to use because it is computed with the eigenvalues of the inner product of an estimated beta matrix. Simulation results show that the proposed method works well even in small samples. The analysis of US individual stock returns suggests that there are six common risk factors in US individual stock returns among the thirteen factor candidates used. The analysis of portfolio returns reveals that the estimated number of common factors changes depending on how the portfolios are constructed. The number of risk sources found from the analysis of portfolio returns is generally smaller than the number found in individual stock returns.
ContributorsWang, Na (Author) / Ahn, Seung C. (Thesis advisor) / Kallberg, Jarl G. (Committee member) / Liu, Crocker H. (Committee member) / Arizona State University (Publisher)
Created2011
149506-Thumbnail Image.png
Description
A systematic top down approach to minimize risk and maximize the profits of an investment over a given period of time is proposed. Macroeconomic factors such as Gross Domestic Product (GDP), Consumer Price Index (CPI), Outstanding Consumer Credit, Industrial Production Index, Money Supply (MS), Unemployment Rate, and Ten-Year Treasury are

A systematic top down approach to minimize risk and maximize the profits of an investment over a given period of time is proposed. Macroeconomic factors such as Gross Domestic Product (GDP), Consumer Price Index (CPI), Outstanding Consumer Credit, Industrial Production Index, Money Supply (MS), Unemployment Rate, and Ten-Year Treasury are used to predict/estimate asset (sector ETF`s) returns. Fundamental ratios of individual stocks are used to predict the stock returns. An a priori known cash-flow sequence is assumed available for investment. Given the importance of sector performance on stock performance, sector based Exchange Traded Funds (ETFs) for the S&P; and Dow Jones are considered and wealth is allocated. Mean variance optimization with risk and return constraints are used to distribute the wealth in individual sectors among the selected stocks. The results presented should be viewed as providing an outer control/decision loop generating sector target allocations that will ultimately drive an inner control/decision loop focusing on stock selection. Receding horizon control (RHC) ideas are exploited to pose and solve two relevant constrained optimization problems. First, the classic problem of wealth maximization subject to risk constraints (as measured by a metric on the covariance matrices) is considered. Special consideration is given to an optimization problem that attempts to minimize the peak risk over the prediction horizon, while trying to track a wealth objective. It is concluded that this approach may be particularly beneficial during downturns - appreciably limiting downside during downturns while providing most of the upside during upturns. Investment in stocks during upturns and in sector ETF`s during downturns is profitable.
ContributorsChitturi, Divakar (Author) / Rodriguez, Armando (Thesis advisor) / Tsakalis, Konstantinos S (Committee member) / Si, Jennie (Committee member) / Arizona State University (Publisher)
Created2010
191018-Thumbnail Image.png
Description
This dissertation focuses on reinforcement learning (RL) controller design aiming for real-life applications in continuous state and control problems. It involves three major research investigations in the aspect of design, analysis, implementation, and evaluation. The application case addresses automatically configuring robotic prosthesis impedance parameters. Major contributions of the dissertation include

This dissertation focuses on reinforcement learning (RL) controller design aiming for real-life applications in continuous state and control problems. It involves three major research investigations in the aspect of design, analysis, implementation, and evaluation. The application case addresses automatically configuring robotic prosthesis impedance parameters. Major contributions of the dissertation include the following. 1) An “echo control” using the intact knee profile as target is designed to overcome the limitation of a designer prescribed robotic knee profile. 2) Collaborative multiagent reinforcement learning (cMARL) is proposed to directly take into account human influence in the robot control design. 3) A phased actor in actor-critic (PAAC) reinforcement learning method is developed to reduce learning variance in RL. The design of an “echo control” is based on a new formulation of direct heuristic dynamic programming (dHDP) for tracking control of a robotic knee prosthesis to mimic the intact knee profile. A systematic simulation of the proposed control is provided using a human-robot system simulation in OpenSim. The tracking controller is then tested on able-bodied and amputee subjects. This is the first real-time human testing of RL tracking control of a robotic knee to mirror the profile of an intact knee. The cMARL is a new solution framework for the human-prosthesis collaboration (HPC) problem. This is the first attempt at considering human influence on human-robot walking with the presence of a reinforcement learning controlled lower limb prosthesis. Results show that treating the human and robot as coupled and collaborating agents and using an estimated human adaptation in robot control design help improve human walking performance. The above studies have demonstrated great potential of RL control in solving continuous problems. To solve more complex real-life tasks with multiple control inputs and high dimensional state space, high variance, low data efficiency, slow learning or even instability are major roadblocks to be addressed. A novel PAAC method is proposed to improve learning performance in policy gradient RL by accounting for both Q value and TD error in actor updates. Systematical and comprehensive demonstrations show its effectiveness by qualitative analysis and quantitative evaluation in DeepMind Control Suite.
ContributorsWu, Ruofan (Author) / Si, Jennie (Thesis advisor) / Huang, He (Committee member) / Santello, Marco (Committee member) / Papandreou- Suppappola, Antonia (Committee member) / Arizona State University (Publisher)
Created2023
158010-Thumbnail Image.png
Description
Robotic lower limb prostheses provide new opportunities to help transfemoral amputees regain mobility. However, their application is impeded by that the impedance control parameters need to be tuned and optimized manually by prosthetists for each individual user in different task environments. Reinforcement learning (RL) is capable of automatically learning from

Robotic lower limb prostheses provide new opportunities to help transfemoral amputees regain mobility. However, their application is impeded by that the impedance control parameters need to be tuned and optimized manually by prosthetists for each individual user in different task environments. Reinforcement learning (RL) is capable of automatically learning from interacting with the environment. It becomes a natural candidate to replace human prosthetists to customize the control parameters. However, neither traditional RL approaches nor the popular deep RL approaches are readily suitable for learning with limited number of samples and samples with large variations. This dissertation aims to explore new RL based adaptive solutions that are data-efficient for controlling robotic prostheses.

This dissertation begins by proposing a new flexible policy iteration (FPI) framework. To improve sample efficiency, FPI can utilize either on-policy or off-policy learning strategy, can learn from either online or offline data, and can even adopt exiting knowledge of an external critic. Approximate convergence to Bellman optimal solutions are guaranteed under mild conditions. Simulation studies validated that FPI was data efficient compared to several established RL methods. Furthermore, a simplified version of FPI was implemented to learn from offline data, and then the learned policy was successfully tested for tuning the control parameters online on a human subject.

Next, the dissertation discusses RL control with information transfer (RL-IT), or knowledge-guided RL (KG-RL), which is motivated to benefit from transferring knowledge acquired from one subject to another. To explore its feasibility, knowledge was extracted from data measurements of able-bodied (AB) subjects, and transferred to guide Q-learning control for an amputee in OpenSim simulations. This result again demonstrated that data and time efficiency were improved using previous knowledge.

While the present study is new and promising, there are still many open questions to be addressed in future research. To account for human adaption, the learning control objective function may be designed to incorporate human-prosthesis performance feedback such as symmetry, user comfort level and satisfaction, and user energy consumption. To make the RL based control parameter tuning practical in real life, it should be further developed and tested in different use environments, such as from level ground walking to stair ascending or descending, and from walking to running.
ContributorsGao, Xiang (Author) / Si, Jennie (Thesis advisor) / Huang, He Helen (Committee member) / Santello, Marco (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)
Created2020