Technical Name |
Generative AI Home Service Robots with Integrated Cognitive Capabilities for Enhanced Understanding and Decision-Making |
Project Operator |
National Cheng Kung University |
Project Host |
李祖聖 |
Summary |
This system integrates large language models (LLMs), vision-language models (VLMs), object recognition, and skeleton recognition to achieve environmental perception, dynamic task planning, change monitoring, object rearrangement, and human–robot interaction. It executes natural language commands accurately and iterates continuously, improving the adaptability and reliability of home-service robots in complex environments. Real-time experiments validate the potential and practical value of generative AI for semantic understanding, interaction, and planning. |
Scientific Breakthrough |
This system integrates zero-shot reasoning, deep multimodal fusion, and feedback. In real-world settings it achieves a 76.6% success rate on dynamic tasks, 70% on tableware rearrangement, and 74% on human–robot interaction (HRI), surpassing the 50–60% baseline. These results demonstrate strong semantic understanding, adaptive decision-making, and interactive performance, and indicate both innovation and industrial potential. |
Industrial Applicability |
Building on mature generative AI and multimodal technologies, the system comprises three components. The Dynamic Task Planner adapts workflows as the environment changes, giving greater autonomy in human–robot collaboration and unstructured settings. The Object Rearrangement System automatically arranges tableware to international standards. The Multimodal HRI System makes interactions more natural and diverse. The system is applicable to manufacturing, warehousing, home service, and medical care. |