Matching Items (2)

Efficiency Based Flight Analysis for a Novel Quadcopter System

Description

For a conventional quadcopter with four planar rotors, flight times range from 10 to 20 minutes, depending on the weight of the quadcopter and the size of the battery used. To increase the flight time, either the weight of the quadcopter should be reduced or the battery size should be increased. Another way is to increase the efficiency of the propellers. Previous research shows that ducting a propeller can increase the thrust produced by the rotor-duct system by up to 94%. This research focused on developing and testing a quadcopter with a central ducted rotor, which produces 60% of the total system thrust, and three peripheral rotors. This quadcopter will provide longer flight times while having the same maneuvering flexibility in planar movements.
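As a rough illustration of the thrust split described above, the sketch below computes the thrust each rotor would need to supply in hover. Only the 60% central share comes from the description; the 1.5 kg mass, the equal split among the three peripheral rotors, and the function name `hover_thrust_split` are assumptions made for the example.

```python
G = 9.81  # gravitational acceleration in m/s^2


def hover_thrust_split(mass_kg, central_share=0.60, n_peripheral=3):
    """Return (central, per_peripheral) hover thrust in newtons.

    Assumes the peripheral rotors share the remaining thrust equally,
    which the source description does not state.
    """
    total = mass_kg * G                      # thrust needed to balance weight
    central = central_share * total          # share carried by the ducted rotor
    per_peripheral = (total - central) / n_peripheral
    return central, per_peripheral


# Hypothetical 1.5 kg vehicle, chosen only for illustration.
central, peripheral = hover_thrust_split(1.5)
print(f"central rotor: {central:.2f} N, each peripheral rotor: {peripheral:.2f} N")
```

Under these assumptions each peripheral rotor carries about 13.3% of the total thrust, which is why the central rotor dominates the vehicle's overall efficiency.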

Date Created
2019

Combining learning with knowledge-rich planning allows for efficient multi-agent solutions to the problem of perpetual sparse rewards

Description

This work has improved the quality of the solution to the sparse rewards problem by combining reinforcement learning (RL) with knowledge-rich planning. Classical
methods for coping with sparse rewards during reinforcement learning modify the
reward landscape so as to better guide the learner. In contrast, this work combines
RL with a planner in order to utilize other information about the environment. As
the scope for representing environmental information is limited in RL, this work has
integrated a model-free learning algorithm, temporal difference (TD) learning, with
a Hierarchical Task Network (HTN) planner to accommodate rich environmental
information in the algorithm. In the perpetual sparse rewards problem, rewards
reemerge within a fixed interval of time after being collected, so the problem has no
well-defined goal state to serve as an exit condition. Incorporating planning in
the learning algorithm not only improves the quality of the solution, but the algorithm
also avoids the ambiguity of incorporating a goal of maximizing profit while using
only a planning algorithm to solve this problem. Consulting the HTN planner
occasionally nudges the algorithm toward the optimal solution. In
this work, I have demonstrated an on-policy algorithm that has improved the quality
of the solution over vanilla reinforcement learning. The objective of this work has
been to observe how well the synthesized algorithm finds optimal policies that
maximize rewards while remaining aware of the environment and of the presence
of other agents in the vicinity.
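The idea of an on-policy TD learner that occasionally defers to a planner can be sketched roughly as follows. This is not the author's algorithm: the toy corridor, the respawn interval, the single agent, the probability `p_plan`, and the function `planner_advice` (a trivial stand-in for an HTN planner) are all assumptions made for illustration. The update rule is standard SARSA, and the loop has no terminal state because rewards keep reappearing.

```python
import random
from collections import defaultdict

N = 7            # hypothetical corridor with cells 0..N-1; rewards sit at both ends
RESPAWN = 10     # hypothetical number of steps before a collected reward reappears
ACTIONS = (-1, 1)


def step(pos, timers, action):
    """Advance the toy environment; timers[i] == 0 means reward i is available."""
    pos = min(N - 1, max(0, pos + action))
    timers = [max(0, t - 1) for t in timers]
    reward = 0.0
    for i, cell in enumerate((0, N - 1)):
        if pos == cell and timers[i] == 0:
            reward, timers[i] = 1.0, RESPAWN   # collect and start the respawn clock
    return pos, timers, reward


def observe(pos, timers):
    """State seen by the learner: position plus which rewards are currently available."""
    return (pos, timers[0] == 0, timers[1] == 0)


def planner_advice(state, rng):
    """Stand-in for an HTN planner (assumption): head for the nearest available reward."""
    pos, left_ok, right_ok = state
    targets = [c for c, ok in ((0, left_ok), (N - 1, right_ok)) if ok]
    if not targets:
        return rng.choice(ACTIONS)
    goal = min(targets, key=lambda c: abs(c - pos))
    return -1 if goal < pos else 1


def choose(q, state, eps, p_plan, rng):
    """Occasionally follow the planner; otherwise act epsilon-greedily on Q."""
    if rng.random() < p_plan:
        return planner_advice(state, rng)
    if rng.random() < eps:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(steps=5000, alpha=0.1, gamma=0.95, eps=0.1, p_plan=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    pos, timers = N // 2, [0, 0]
    state = observe(pos, timers)
    action = choose(q, state, eps, p_plan, rng)
    total = 0.0
    for _ in range(steps):  # continuing task: there is no goal state to exit on
        pos, timers, reward = step(pos, timers, action)
        next_state = observe(pos, timers)
        next_action = choose(q, next_state, eps, p_plan, rng)
        # SARSA update: bootstrap from the action that will actually be taken next
        td_target = reward + gamma * q[(next_state, next_action)]
        q[(state, action)] += alpha * (td_target - q[(state, action)])
        state, action = next_state, next_action
        total += reward
    return q, total


q_table, collected = train()
print("reward collected with occasional planner advice:", collected)
```

Setting `p_plan=0.0` gives plain SARSA for comparison, and `p_plan=1.0` follows the stand-in planner at every step.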

Date Created
2022