Distributed Learning in Virtual Realistic Environment
Kaiyue (Karin) Wu, Weichao Qiu, Yuan Jing Vincent Yan
Reinforcement learning in a virtual realistic environment can be slow when the learning algorithm and the environment interactions all run on a single machine, because both are limited by that machine's computing power. This project develops a distributed system that speeds up learning by dividing the task and running its parts in parallel on multiple machines.
Our system has a two-stage speed-up architecture. In the first stage, we separate the environment interactions from the learning algorithm and run several interactions at the same time on different machines, instead of running a single interaction on the same machine as the learner. In a virtual realistic environment the interactions are much slower than the learning algorithm, so running them in parallel lets data collection keep up with the speed of learning. In the second stage, we parallelize the learning process itself by running the learning algorithm on multiple machines at once, so that each machine handles part of the work and the overall process is faster. We built two versions of this stage: a centralized version, in which a parameter server collects data from all the learning processes (following the distributed TensorFlow design), and a P2P version, in which the learning processes communicate with each other directly through multicasting.
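The two stages described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: threads stand in for separate machines, the environment and the learner are toy placeholders, and every name (environment_worker, collect_experience, ParameterServer, learner, run_demo) is hypothetical. It uses only the Python standard library and does not call the TensorFlow API; it shows only the centralized variant of the second stage.

```python
import queue
import random
import threading


def environment_worker(worker_id, num_steps, out_queue):
    """Toy stand-in for one slow environment instance (hypothetical)."""
    rng = random.Random(worker_id)
    for step in range(num_steps):
        # A toy transition: (worker id, step index, reward in [0, 1)).
        out_queue.put((worker_id, step, rng.random()))
    out_queue.put(None)  # sentinel: this worker has finished


def collect_experience(num_workers, steps_per_worker):
    """Stage one: run several environment workers concurrently."""
    transitions = queue.Queue()
    workers = [
        threading.Thread(target=environment_worker,
                         args=(i, steps_per_worker, transitions))
        for i in range(num_workers)
    ]
    for worker in workers:
        worker.start()
    collected, finished = [], 0
    while finished < num_workers:
        item = transitions.get()
        if item is None:
            finished += 1
        else:
            collected.append(item)
    for worker in workers:
        worker.join()
    return collected


class ParameterServer:
    """Stage two (centralized variant): holds the shared weights."""

    def __init__(self, num_params, learning_rate=0.1):
        self.weights = [0.0] * num_params
        self.learning_rate = learning_rate
        self._lock = threading.Lock()

    def pull_weights(self):
        with self._lock:
            return list(self.weights)

    def push_gradient(self, gradient):
        # Apply one gradient-descent step to the shared weights.
        with self._lock:
            self.weights = [w - self.learning_rate * g
                            for w, g in zip(self.weights, gradient)]


def learner(server, batch):
    """Toy learner: pulls weights, computes a gradient on its shard, pushes it."""
    weights = server.pull_weights()
    mean_reward = sum(reward for _, _, reward in batch) / len(batch)
    # Gradient of (w - mean_reward)^2 with respect to the single weight w.
    server.push_gradient([2.0 * (weights[0] - mean_reward)])


def run_demo(num_workers=4, steps_per_worker=50, num_learners=2):
    experience = collect_experience(num_workers, steps_per_worker)
    server = ParameterServer(num_params=1)
    # Give each learner a disjoint shard of the collected experience.
    shards = [experience[i::num_learners] for i in range(num_learners)]
    learners = [threading.Thread(target=learner, args=(server, shard))
                for shard in shards]
    for thread in learners:
        thread.start()
    for thread in learners:
        thread.join()
    return experience, server.pull_weights()


if __name__ == "__main__":
    experience, weights = run_demo()
    print(len(experience), weights)
```

In this sketch the queue plays the role of the network link between the interaction machines and the learner, and the lock-protected weight list plays the role of the parameter server. A P2P variant would drop the ParameterServer and have each learner send its updates to the other learners directly.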
We then investigate how to apply our system to physical environments, using an easy-to-obtain robotic arm. The real arm can be virtualized into an arm model. By analyzing this model with computer vision techniques, we can gather data from the physical environment through the virtual model, and our system can then process the data and perform learning tasks. This shows that our system has the potential to be applied to real environments; actually achieving this is left to future work.
For more details, see our presentation slides (PDF) here.
The code and documentation for this project can be found here: code documentation