| Summary | Real-world applications such as robot vacuums, tour-guide robots, and delivery robots rely on autonomous navigation as their core function. However, building such a system can be costly, because it typically requires expensive sensors such as depth cameras or LIDARs. We present an effective, easy-to-implement, and low-cost modular framework for completing complex autonomous navigation tasks. 
 In this project, the framework uses only a single monocular camera to handle the challenging problem of robot navigation in the real world. We achieve this by introducing a virtual guidance scheme, in which a virtual guide steers the robot’s policy toward its destination. We performed extensive experiments on diverse indoor and outdoor maps and verified that our method is robust to various environmental conditions and generalizes to unfamiliar maps in both virtual and real-world tasks.
 | 
            
            
                        
                | Scientific Breakthrough | 1. A straightforward, easy-to-implement, and effective modular learning-based framework for dealing with the challenging robot navigation problem in the real world.
 2. An architecture that separates the vision-based robotic learning model into a perception module, a local controller module, a planner module, and a localization module.
 3. A concept of altering a robot's behavior policy by adjusting its meta-state representation.
 4. A virtual guidance scheme that transforms the instruction provided by the planner module into a virtual guide for navigating the robot to its goal.
 5. An approach for balancing obstacle avoidance and path following so as to accomplish complex navigation tasks.
 | 
                                    
                | Industrial Applicability | We propose a new modular framework that addresses the reality gap in the vision domain and navigates a robot via virtual signals. Our robot navigates with a single monocular camera, without relying on LIDAR, a stereo camera, or odometry information from the robot. The proposed framework consists of four modules: a localization module, a planner module, a perception module, and a local controller module. In this framework, the virtual guide acts like a carrot (i.e., a lure) that entices the robot to move in a specific direction. Compared with conventional navigation approaches, our methodology is not only highly adaptable to diverse environments but also generalizes to complex scenarios. |
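
The virtual guidance idea described above can be illustrated with a small sketch. This is not the project's implementation. It assumes that the meta-state is a 2-D label map (for example, a semantic segmentation), that the planner's instruction has been reduced to a heading offset in radians (positive meaning "goal is to the right"), and that the guide is a disk painted with a reserved label. The names `draw_virtual_guide`, `GUIDE_LABEL`, `fov_rad`, and `radius` are hypothetical.

```python
import numpy as np

GUIDE_LABEL = 255  # hypothetical class id reserved for the virtual guide


def draw_virtual_guide(seg_map, heading_offset_rad, fov_rad=np.pi / 2, radius=6):
    """Return a copy of seg_map with a disk-shaped virtual guide painted in.

    Assumption: the horizontal position of the guide encodes the direction
    the planner wants the robot to move in.
    """
    height, width = seg_map.shape
    half_fov = fov_rad / 2
    # Clamp so that a goal outside the field of view pins the guide to an edge.
    offset = float(np.clip(heading_offset_rad, -half_fov, half_fov))
    # Map [-half_fov, +half_fov] linearly onto columns [0, width - 1].
    col = int(round((offset + half_fov) / fov_rad * (width - 1)))
    row = height // 2
    rows, cols = np.ogrid[:height, :width]
    mask = (rows - row) ** 2 + (cols - col) ** 2 <= radius ** 2
    out = seg_map.copy()  # leave the perception output untouched
    out[mask] = GUIDE_LABEL
    return out


# Usage: an empty 48x64 label map with the goal straight ahead.
seg = np.zeros((48, 64), dtype=np.uint8)
guided = draw_virtual_guide(seg, heading_offset_rad=0.0)
```

Under these assumptions, the local controller would receive `guided` instead of `seg`, so changing the planner's instruction changes the controller's input without retraining the perception module.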