News Express|Google Robotics Team won the ICRA 2023 Robot Learning Direction Bes
1. Google Robotics Team Wins Best Paper Award in Robot Learning at ICRA 2023 for "Code as Policies: Language Model Programs for Embodied Control"
This paper primarily discusses the application of Large Language Models (LLMs) in robotic control. The authors note that while LLMs excel at understanding and generating natural language, their application in practical fields, such as robotic control, remains limited. Therefore, they propose a new method of using LLMs to write code that controls the behavior of robots.
They found that code-writing LLMs perform well at planning, policy logic, and control. Given natural language commands (formatted as comments), these models can be repurposed to write robot policy code. Policy code can express functions or feedback loops that process perception outputs (e.g., from open-vocabulary object detectors) and parameterize control primitive API calls. When provided with several example language commands and the corresponding policy code (via few-shot prompting), LLMs can take new commands and autonomously recombine API calls to generate new policy code. Code-writing models can also express arithmetic operations and language-based feedback loops. Beyond generalizing to new instructions, they can, having been trained on billions of lines of code and comments, assign precise values to vague descriptions (e.g., "faster" and "left") based on context, which elicits behavioral common sense.
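A few-shot prompt of this kind might be assembled roughly as follows. This is a minimal sketch, not the authors' code: the module name `robot_api`, the functions `get_pos` and `put_first_on_second`, and the two example pairs are hypothetical placeholders chosen for illustration.

```python
# Minimal sketch of few-shot prompt assembly. All API names below are
# hypothetical; they only illustrate the comment-then-code layout.

# Hint line telling the model which (hypothetical) APIs it may call.
API_HINT = "from robot_api import get_pos, put_first_on_second"

# Instruction-to-code pairs: each command becomes a comment, followed by code.
EXAMPLES = [
    (
        "move the red block 10 cm to the right.",
        "x, y = get_pos('red block')\n"
        "put_first_on_second('red block', (x + 10, y))",
    ),
    (
        "put the blue block on the blue bowl.",
        "put_first_on_second('blue block', 'blue bowl')",
    ),
]


def build_prompt(instruction, examples=EXAMPLES, api_hint=API_HINT):
    """Return a prompt that ends with the new command formatted as a comment."""
    parts = [api_hint, ""]
    for command, code in examples:
        parts.append(f"# {command}")
        parts.append(code)
        parts.append("")
    # The model is expected to continue the text with code for this command.
    parts.append(f"# {instruction}")
    return "\n".join(parts)


prompt = build_prompt("move the blue block a bit to the left.")
print(prompt)
```

The prompt ends with the new command as a comment, so a code-completion model would be expected to continue with policy code that reuses the APIs shown in the examples.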
In the methodology section, the authors detail how to use Large Language Models (LLMs) to generate code as policies. Their approach mainly comprises the following steps:
1. Define Language Model Programs (LMPs): The authors first define the concept of Language Model Programs (LMPs): any program generated by a language model and executed on a system. Their work focuses on a category of LMPs called "Code as Policies," which maps language instructions to code snippets that can (i) respond to perceptual inputs (i.e., from sensors or from modules built on top of sensors), (ii) parameterize control primitive APIs, and (iii) be compiled and executed directly on the robot.
2. Generate LMPs: The authors demonstrate how to use LLMs to generate LMPs. They provide examples of how to transform natural language instructions (formatted as comments) into code. For instance, they show how to use LLMs to write code to control robot behaviors, such as moving objects, recognizing objects, and performing more complex tasks.
3. Execute LMPs: To execute an LMP, they first check that the code is safe to run, ensuring there are no import statements, no special variables starting with __, and no calls to exec or eval. Then they pass the code as an input string to Python's exec function, with two dictionaries forming the scope of execution: (i) global variables, containing all APIs that the generated code might call, and (ii) local variables, an empty dictionary that is populated with the variables and new functions defined during exec. If the LMP is expected to return a value, they retrieve it from the local variables after exec finishes.
4. Prompts for Generating LMPs: Prompts for generating LMPs contain two elements: (i) hints, such as import statements, which inform the LLM which APIs are available and how to use them; and (ii) examples, which are instruction-to-code pairs showing how to transform natural language instructions into code. The examples may include arithmetic operations, calls to other APIs, and other features of the programming language.
5. Advanced LMPs: The authors also demonstrate how to use LLMs to generate more complex code, such as using control flow (e.g., if-else and loop statements) and nested function calls. They also show how to use LLMs to generate functions for future use and how to follow good abstraction practices with LLMs, avoiding "flattening" all code logic.
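The execution step described above (safety check, then exec with a globals dictionary of APIs and an empty locals dictionary) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the AST-based check is one possible way to enforce the three rules, and `is_safe`, `run_lmp`, `get_pos`, and `move_to` are hypothetical names with stubbed behavior.

```python
import ast


def is_safe(code):
    """Reject code with imports, names or attributes starting with __, or exec/eval."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            return False
        if isinstance(node, ast.Name) and (
            node.id.startswith("__") or node.id in ("exec", "eval")
        ):
            return False
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            return False
    return True


def run_lmp(code, apis, return_name=None):
    """Execute generated code with `apis` as globals and an empty locals dict."""
    if not is_safe(code):
        raise ValueError("generated code failed the safety check")
    global_vars = dict(apis)  # (i) every API the generated code may call
    local_vars = {}           # (ii) filled with names defined during exec
    exec(code, global_vars, local_vars)
    return local_vars.get(return_name) if return_name else None


# Stub APIs standing in for real perception and control primitives (assumed).
move_log = []
apis = {
    "get_pos": lambda name: {"red block": (10, 20)}[name],
    "move_to": lambda name, pos: move_log.append((name, pos)),
}

# A string standing in for LLM output: shift the red block 5 units to the right.
generated = (
    "x, y = get_pos('red block')\n"
    "new_pos = (x + 5, y)\n"
    "move_to('red block', new_pos)\n"
)

result = run_lmp(generated, apis, return_name="new_pos")
print(result, move_log)
```

A check like this only narrows what generated code can do; it is not a sandbox, so a real deployment would still need further isolation.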
In the experimental section, the authors show how LLMs can write code that controls robot behavior. Their examples include writing Python scripts, performing complex operations with third-party libraries, and working with first-party libraries. They also demonstrate more complex generated code that uses control flow (e.g., if-else and loop statements) and nested function calls.
They use specific tasks to demonstrate the effectiveness of the approach: moving objects, recognizing objects, and performing more complex tasks. They also show how to generate functions for future use and how to follow good abstraction practices so that the code logic is not "flattened."
Here are some concrete experimental examples:
1. Moving objects: the LLM writes code to move an object named "red block." The code first obtains the object's position and then moves the object to the right by a certain distance.
2. Recognizing objects: the LLM writes code to identify an object named "blue block," using an open-vocabulary object detector to accomplish the task.
3. Performing more complex tasks: the LLM writes code to place an object named "blue block" on another object named "blue bowl."
4. Generating functions for future use: the LLM writes code that defines a function named "get_total," which takes a parameter named "xs" and returns its sum.
5. Following good abstraction practices to avoid "flattening" all code logic: the LLM writes code that defines a function named "get_objs_bigger_than_area_th," which takes two parameters (a list of object names, "obj_names," and a threshold, "bbox_area_th") and returns the names of all objects larger than that threshold.
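The two function-generation examples above might look like the following sketch. Only the names `get_total`, `xs`, `get_objs_bigger_than_area_th`, `obj_names`, and `bbox_area_th` come from the text; the helper `get_obj_bbox_area` and the bounding-box values are hypothetical stand-ins for a real perception module.

```python
def get_total(xs):
    """Return the sum of the values in xs."""
    return sum(xs)


# Made-up bounding boxes for illustration only:
# object name -> (x_min, y_min, x_max, y_max) in pixels.
_BBOXES = {
    "red block": (0, 0, 10, 10),
    "blue block": (0, 0, 20, 20),
    "blue bowl": (0, 0, 40, 30),
}


def get_obj_bbox_area(obj_name):
    """Hypothetical perception helper: area of an object's bounding box."""
    x_min, y_min, x_max, y_max = _BBOXES[obj_name]
    return (x_max - x_min) * (y_max - y_min)


def get_objs_bigger_than_area_th(obj_names, bbox_area_th):
    """Return the names of objects whose bounding-box area exceeds the threshold."""
    return [name for name in obj_names if get_obj_bbox_area(name) > bbox_area_th]


total = get_total([1, 2, 3])
big_objs = get_objs_bigger_than_area_th(list(_BBOXES), 300)
print(total, big_objs)
```

Keeping the area computation in its own helper, instead of inlining it into the list comprehension, is the kind of abstraction the text contrasts with "flattened" code.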
These experimental results indicate that LLMs can be used effectively to write code that controls robot behavior, and that the approach is practical and broadly applicable.
In the conclusion, the authors summarize their findings and outline future research directions. They believe that the code-writing capabilities of Large Language Models (LLMs) offer new possibilities for robot control: LLMs can transform natural language commands into robot policy code and thereby achieve more complex robot behaviors. They also point out that LLMs can go beyond understanding and generating natural language to take part in human-robot dialogue and Q&A by using "say(text)" as an available action primitive API. Their results show that LLMs can effectively write Python programs and handle a range of tasks, such as moving objects, recognizing objects, and performing more complex operations. They consider the approach applicable not only to robot control but also to other fields that require code writing.
2. Estun: 450 million yuan capital increase for its subsidiary Estun Robot
On August 1, Estun announced that the company plans to use 450 million yuan of its own funds to increase its capital in its wholly-owned subsidiary, Nanjing Estun Robot Engineering Co., Ltd., with 300 million yuan added to the registered capital and 150 million yuan to the capital reserve.
Separately, Estun announced that the company intends to transfer 10% of its shares in Estun Medical to its controlling shareholder, Pailaist, for 24 million yuan. After the transfer, the company will still hold 16.68% of the shares in Estun Medical. The transaction aims to optimize the company's asset structure and focus on the development of its main business. It is expected to have a positive impact on the company's performance in 2024, with a pre-tax profit impact of approximately 15 million yuan.
Of the 450 million yuan, which comes from Estun's own funds, 300 million yuan will be added to Estun Robot's registered capital and 150 million yuan to its capital reserve. After the capital increase, the registered capital of Estun Robot will rise from 150 million yuan to 450 million yuan, and Estun will still hold 100% of its shares.
This capital increase aims to enhance the capital strength and operational capabilities of the subsidiary, in line with the company's strategic development plan. The capital increase falls within the approval authority of the company's board of directors and does not need to be submitted to the shareholders' meeting for review. It does not constitute a related-party transaction or a major asset restructuring.
The announcement shows that in 2023 Estun Robot recorded operating revenue of 1.657 billion yuan and a net profit of 29 million yuan; in the first quarter of 2024 (unaudited), it recorded operating revenue of 400 million yuan and a net profit of 11 million yuan.
Estun stated that the capital increase is intended to further meet the operational development needs of the company and its subsidiary, enhance the subsidiary's capital strength, reduce its asset-liability ratio, and strengthen the company's robotics business, in line with the company's strategic development plan.
Estun Robot was established in 2011, with Wu Kan as its legal representative. Its business scope includes the development, production, and sales of products, equipment, and engineering integration projects based mainly on robots and complete industrial robot systems (including FTL flexible production line manufacturing, vertical multi-joint industrial robots, and welding robots and their welding equipment), as well as related supporting services; and the import and export of various goods and technologies, on its own behalf and as an agent (except for goods and technologies that the state restricts the company from operating or prohibits from import and export), etc.