Dr. Bo Xiao - Medical Robotics Shanghai Jiao Tong University

Dr. Bo Xiao

2019-11-14

Biography

Bo Xiao received his Ph.D. degree from the Department of Informatics, King's College London, U.K., in 2018. He is currently a research associate with the Hamlyn Centre for Robotic Surgery and the Department of Computing, Imperial College London, U.K. From 2017 to 2018, he worked as a research fellow at the Advanced Robotics Centre and the Department of Biomedical Engineering, National University of Singapore, Singapore.

His current research interests include medical robotics, learning from demonstration, robotic suturing, fuzzy-model-based control systems, interval type-2 fuzzy logic, polynomial control systems, machine learning and reinforcement learning.

He has served as a guest editor for IEEE Transactions on Fuzzy Systems and IET Control Theory and Applications. He is also an active reviewer for a number of peer-reviewed journals, including IEEE Transactions on Fuzzy Systems, IEEE Transactions on Automatic Control, IEEE Transactions on Cybernetics, Fuzzy Sets and Systems, and IET Control Theory and Applications.

Abstract

Learning from Demonstration is increasingly used for transferring operator manipulation skills to robots. In practice, it is important to cater for limited data and imperfect human demonstrations, as well as underlying safety constraints. In this talk, a constrained-space optimization and reinforcement learning scheme for managing complex tasks is presented. Through interactions within the constrained task space, the reinforcement learning agent is trained to optimize the manipulation skills according to a defined reward function. After learning, the optimal policy is derived from the trained reinforcement learning agent and is then used to guide the robot in performing tasks similar to the experts' demonstrations. The effectiveness of the proposed method is verified with a robotic suturing task, in which the learned policy outperformed the experts' demonstrations in terms of the smoothness of the joint motion and end-effector trajectories, as well as the overall task completion time.

Links