Carnegie Mellon University
April 22, 2022

Robots That Can Learn to Safely Navigate Warehouses

Advances in chips, sensors and AI algorithms are enabling robots to continuously learn how to plan routes, avoid obstructions and operate safely in large, dynamic warehouse environments

Robots have been working in factories for years. But given the complex and diverse tasks they perform and related safety concerns, most of them operate inside cages or behind safety glass to limit or prevent interaction with humans.

In warehouse operations, where goods are continuously sorted and moved, robots can be neither caged nor stationary.

“For robots to be most useful in a warehouse, they will need to be smart enough to deploy in any facility easily and quickly; able to train themselves to navigate in new dynamic environments; and most importantly, be able to safely work with humans as well as sizeable fleets of other robots,” says Ding Zhao, principal investigator and assistant professor of mechanical engineering in the College of Engineering.

At Carnegie Mellon University, a team of engineers and computer scientists has applied its expertise in advanced manufacturing, robotics and artificial intelligence to developing the warehouse robots of the future.

The collaboration was formed at the university’s Manufacturing Futures Institute (MFI), which funds such research with grants from the Richard King Mellon Foundation. The foundation made a lead $20 million grant in 2016 and gave an additional $30 million in May 2021 to support advanced manufacturing research and development at MFI.

Zhao and Martial Hebert, dean of CMU’s School of Computer Science, are leading the warehouse robot project. They have investigated multiple reinforcement learning techniques that have shown measurable improvements over previous methods in simulated motion planning experiments. The software used in their test robot has also performed well in path-planning experiments at Mill 19, MFI’s collaborative workspace for advanced manufacturing.

“Thanks to advances in chips, sensors and AI algorithms, we are at the cusp of revolutionizing manufacturing robots,” Zhao says. The team leverages previous work on self-driving cars to develop warehouse robots that can learn multitask path planning via safe reinforcement learning, training robots to quickly adapt to new environments and operate safely alongside workers and human-operated vehicles.

Zuxin Liu, a College of Engineering doctoral student at CMU’s Safe AI Lab, operates the intelligent manufacturing logistics robot.

MAPPER: Learning to plan their own pathways

The group first developed a method that enables robots to continuously learn to plan routes in large, dynamic environments. The Multi-Agent Path Planning with Evolutionary Reinforcement (MAPPER) learning method allows the robots to explore by themselves and learn by trial and error, much as human babies accumulate experience over time to handle various situations.

The robots make independent decisions based on their own local observations rather than being directed by a central command computer. Operating under partial observability, their onboard sensors detect dynamic obstacles within a 30-meter range, and through reinforcement learning the robots continually train themselves to handle unknown, dynamic obstacles.

Such smart robots make it easier and quicker for warehouses to deploy large fleets. Because computation is done with each robot’s onboard resources, computational complexity grows only mildly as the number of robots increases, making it simple to add, remove or replace robots.
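In sketch form, this decentralized scheme means every robot runs its own small decision routine over its own local observation, so adding a robot adds one more independent computation rather than enlarging a central solve. The sketch below is illustrative only, not the MAPPER implementation: the greedy scoring function stands in for a learned policy, and all names and values other than the 30-meter sensing range are assumptions.

```python
import math

SENSOR_RANGE = 30.0  # meters; each robot only sees obstacles this close (per the article)

def local_observation(robot_pos, obstacles):
    """Each robot builds its own observation from nearby obstacles only."""
    return [o for o in obstacles if math.dist(robot_pos, o) <= SENSOR_RANGE]

def choose_step(robot_pos, goal, obstacles):
    """Greedy stand-in for a learned policy: step toward the goal,
    preferring candidate moves that keep clear of observed obstacles."""
    seen = local_observation(robot_pos, obstacles)
    candidates = [(robot_pos[0] + dx, robot_pos[1] + dy)
                  for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    def score(p):
        clearance = min((math.dist(p, o) for o in seen), default=SENSOR_RANGE)
        return math.dist(p, goal) - 0.5 * min(clearance, 5.0)
    return min(candidates, key=score)

# Adding robots adds independent per-robot computations, not a central solve:
robots = {"r1": (0.0, 0.0), "r2": (10.0, 5.0)}
goals = {"r1": (20.0, 0.0), "r2": (0.0, 5.0)}
obstacles = [(5.0, 0.5), (12.0, 4.0)]
next_positions = {name: choose_step(pos, goals[name], obstacles)
                  for name, pos in robots.items()}
```

Because each call to `choose_step` uses only that robot’s own observation, the fleet-wide cost scales linearly with the number of robots, matching the scaling property described above.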

Energy consumption could also be reduced, since robots that independently plan their own efficient paths travel shorter distances.


RCE: Prioritizing safety in pursuit of programmed goals

Another successful study applied constrained model-based reinforcement learning using the Robust Cross-Entropy (RCE) method.

Researchers must explicitly build safety constraints into a learning robot so that it does not sacrifice safety to finish its tasks. For example, to reach its goal the robot must avoid colliding with other robots, damaging goods or interfering with equipment.

“Although reinforcement learning methods have achieved great success in virtual applications like computer games, there are still a number of difficulties in applying them to real-world, robotic applications. Among them, safety is premium,” Zhao says.

Creating safety constraints that are enforced at all times and under all conditions goes beyond traditional reinforcement learning methods into the increasingly important area of safe reinforcement learning.

The team evaluated the RCE method in Safety Gym, a set of virtual environments and tools for measuring progress in safe reinforcement learning. The results showed that their approach enabled the robot to learn to complete its tasks with far fewer constraint violations than state-of-the-art baselines.
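To make the cross-entropy idea concrete: a cross-entropy-method planner samples candidate action sequences, scores them under learned models of dynamics, reward and cost, discards candidates predicted to violate the safety constraint, and refits its sampling distribution to the best remaining ones. The sketch below is a generic constrained cross-entropy planner under those assumptions, not the authors’ RCE implementation; the toy dynamics, reward and cost models at the bottom are invented for illustration.

```python
import random
import statistics

def plan_with_cem(dynamics, reward_fn, cost_fn, horizon=5, iters=20,
                  samples=200, elites=20, cost_limit=1.0):
    """Cross-entropy-method planning with a safety constraint:
    sample action sequences, keep only those whose predicted cumulative
    cost stays under the limit, then refit the sampling distribution
    to the highest-reward safe sequences."""
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(iters):
        seqs = [[random.gauss(mean[t], std[t]) for t in range(horizon)]
                for _ in range(samples)]
        scored = []
        for seq in seqs:
            state, total_r, total_c = 0.0, 0.0, 0.0
            for a in seq:
                state = dynamics(state, a)
                total_r += reward_fn(state)
                total_c += cost_fn(state)
            if total_c <= cost_limit:  # discard candidates predicted unsafe
                scored.append((total_r, seq))
        if not scored:
            continue  # no safe samples this round; resample
        scored.sort(key=lambda x: x[0], reverse=True)
        best = [seq for _, seq in scored[:elites]]
        mean = [statistics.mean(s[t] for s in best) for t in range(horizon)]
        std = [statistics.pstdev([s[t] for s in best]) + 1e-3 for t in range(horizon)]
    return mean

# Toy problem: move the state toward 1.0 for reward, but cost accrues past 1.5.
random.seed(0)
plan = plan_with_cem(
    dynamics=lambda s, a: s + 0.2 * a,
    reward_fn=lambda s: -abs(s - 1.0),
    cost_fn=lambda s: 1.0 if s > 1.5 else 0.0,
)
```

The key difference from unconstrained planning is the `total_c <= cost_limit` filter: elites are chosen only from sequences the models predict to be safe, which is what keeps constraint violations low during learning.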

Assistant Professor of Mechanical Engineering in the College of Engineering Ding Zhao (right) and his students demonstrate their warehouse robot’s capabilities to Pennsylvania legislators Joe Pittman (far left), Josh Kail, Natalie Mihalek, Ryan Aument and Patrick Stefano.

CASRL: Adapting to current conditions by learning

To further address how robots can navigate safely in typical warehouse environments, where people and other robots move freely, the group employed the Context-Aware Safe Reinforcement Learning (CASRL) method to handle what they call non-stationary disturbances.

Beyond workers and other robots moving around a warehouse, CASRL also enables the robots to learn to navigate safely around inaccurate sensor measurements, broken parts and obstructions such as trash. The team also applies CASRL to tool manipulation and interaction with humans, which can be directly applied to assembly in manufacturing.

“Non-stationary disturbances are everywhere in real-world applications, providing infinite variations of scenarios,” says Zuxin Liu, a doctoral student at CMU’s Safe AI Lab. “An intelligent robot should be able to generalize to unseen cases rather than just memorize the examples provided by humans. This is one of the ultimate challenges in trustworthy AI.”

Zhao explains that the robot must learn to determine whether its previously trained planning policies still suit the current situation. The robot then updates its policy online, based on recent local observations, so that it can adapt easily to new situations with unseen disturbances while guaranteeing safety with a high degree of probability.

Based on its past sensing data, the robot can automatically infer and model potential disturbances and update its planning policy. Zhao’s team further extends the method with online reinforcement learning that continuously learns to solve unseen tasks: it not only adapts to unseen but similar tasks, it also identifies and learns to solve distinct ones.
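A minimal illustration of this infer-then-adapt loop, assuming nothing about the actual CASRL algorithm: the controller below estimates an unknown, possibly drifting gain (for instance, wheel slip changing how far each command actually moves the robot) from a sliding window of recent sensing data, then conditions its commands on that estimate. The class, window size and gain model are all invented for the example.

```python
from collections import deque

class ContextAwareController:
    """Toy sketch: infer a non-stationary disturbance from recent
    observations, then adapt commands to compensate."""

    def __init__(self, window=10):
        self.history = deque(maxlen=window)  # recent (command, observed_displacement)
        self.gain_estimate = 1.0

    def observe(self, command, observed_displacement):
        """Update the disturbance estimate from the sliding window
        via least squares on displacement = gain * command."""
        self.history.append((command, observed_displacement))
        num = sum(c * d for c, d in self.history)
        den = sum(c * c for c, _ in self.history)
        if den > 0:
            self.gain_estimate = num / den

    def command_for(self, desired_displacement):
        """Condition the action on the inferred context."""
        return desired_displacement / self.gain_estimate

ctrl = ContextAwareController()
true_gain = 0.5  # unknown disturbance: commands move the robot only half as far
for _ in range(10):
    cmd = ctrl.command_for(1.0)
    ctrl.observe(cmd, true_gain * cmd)
```

Because the estimate is rebuilt from a short window of recent data, it tracks the disturbance if it drifts, which is the essence of handling a non-stationary environment online.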

“The future of the next generation of manufacturing is now,” Zhao says.