Google Co-releases SayCan Model to Help Robots Give Sensible Answers

What makes large language models (LLMs) so capable is the sheer volume of information these models absorb from the large corpora of text scraped from the web during training.
Given that power of language understanding, does it follow that a robot driven directly by an LLM could communicate with humans and carry out tasks just as well?
The answer is no, because an LLM is not grounded in the physical world: it neither observes nor acts on its physical surroundings. As a result, some of the answers an LLM gives are incompatible with, or impractical in, the environment at hand.
Figure | Different responses given by several large language models and by the new SayCan model (right) to the same user request (Source: arXiv)
For example, in the figure above, a human asks a kitchen robot that can only perform basic operations such as "pick up a kitchen utensil" and "move to a location": "I spilled my drink, can you help?"
In response, three well-known large language models gave answers that did not fit the scenario: GPT-3 responded with "You need a vacuum cleaner," LaMDA responded with "Can I help you find a vacuum cleaner?" and FLAN replied, "Sorry, I didn't mean to spill my drink."
As these replies show, none of the LLMs could give the robot a directly usable response, because their answers were not grounded in the surrounding environment.
To make robots and other language-driven systems more attuned to their physical surroundings, and thus more effective at helping humans, Google's robotics team, together with Everyday Robots, has developed a new language processing model called SayCan.
The model is trained not only to understand language commands and propose answers, but also to evaluate how likely each proposed action is to succeed in the current physical environment, so that the robot can "do what it says."
The accompanying paper, "Do As I Can, Not As I Say: Grounding Language in Robotic Affordances," was recently published on arXiv.
In brief, the SayCan model grounds the output of a large language model in the physical environment of the task at hand, and it consists of two main components.
First, the Say component, a large language model, interprets the meaning of the instruction and proposes answers that could help solve the problem.
Then, the Can component evaluates those answers against the robot's affordances, i.e., the skills actually available to it, to determine which behaviors are feasible to perform in the current physical environment.
Here, the researchers use reinforcement learning (RL) to train language-conditioned value functions that estimate the feasibility of each behavior in the current environment.
Specifically, the SayCan model abstracts the problem as follows: the system receives a natural language instruction from the user that specifies the task the robot should perform; this instruction can be long, abstract, or even ambiguous.
The system also has a predefined set of skills Π that the robot possesses, where each skill π ∈ Π is a short, decomposed task, such as picking up a particular object. Each skill has its own short language description lπ, e.g., "find a knife and fork," and its own affordance function p(cπ | s, lπ), which represents the probability of successfully completing the skill described by lπ from state s.
In layman's terms, the affordance function p(cπ | s, lπ) is the probability that the skill π described by lπ completes successfully in state s, where cπ is a Bernoulli random variable indicating completion. In RL terms, p(cπ | s, lπ) is also the value function of the skill when the reward is 1 for successful completion and 0 otherwise.
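To make that 0/1 reward concrete, here is a minimal Python sketch that estimates an affordance value as a running average of completion outcomes. The TabularAffordance class and its discretized-state lookup are hypothetical stand-ins for the language-conditioned value functions the paper trains with RL and function approximation.

```python
from collections import defaultdict

class TabularAffordance:
    """Toy stand-in for a learned affordance/value function.

    Estimates p(c_pi | s, l_pi) as the running average of the 0/1
    completion reward observed when the skill described by l_pi is
    attempted from (a discretized) state s. The real SayCan system
    learns this with RL and function approximation, not a table.
    """

    def __init__(self):
        self.successes = defaultdict(int)
        self.attempts = defaultdict(int)

    def update(self, state, skill_description, completed: bool):
        # Reward is 1 on successful completion, 0 otherwise.
        key = (state, skill_description)
        self.attempts[key] += 1
        self.successes[key] += int(completed)

    def value(self, state, skill_description) -> float:
        key = (state, skill_description)
        if self.attempts[key] == 0:
            return 0.0  # unseen (state, skill) pairs treated as infeasible
        return self.successes[key] / self.attempts[key]
```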
The algorithm and overall idea of the SayCan model are shown below.
Figure | Algorithm of SayCan model (Source: arXiv)
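In code, the core of this algorithm might look like the minimal sketch below: each skill description is scored by the product of the language model's likelihood that it is a useful next step and the skill's affordance value; the best-scoring skill is executed and the loop repeats. Here llm_likelihood, affordance, and execute_skill are hypothetical placeholders for the paper's LLM scoring, learned value functions, and robot control.

```python
def saycan_plan(instruction, skills, state, llm_likelihood, affordance,
                execute_skill, max_steps=10):
    """Minimal sketch of SayCan's planning loop (not the authors' code).

    At each step, every skill description l is scored by
        llm_likelihood(instruction, history, l) * affordance(state, l)
    i.e., "is this a useful next step?" times "can it succeed here?".
    The best-scoring skill is executed and appended to the history,
    and the loop repeats until the special "done" skill wins.
    """
    history = []
    for _ in range(max_steps):
        scores = {
            l: llm_likelihood(instruction, history, l) * affordance(state, l)
            for l in skills
        }
        best = max(scores, key=scores.get)
        if best == "done":
            break
        history.append(best)
        state = execute_skill(state, best)  # robot acts; state changes
    return history
```

Multiplying the two scores is what keeps the plan both relevant ("should I do this?") and feasible ("can I do this here?").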
To validate SayCan's performance, the researchers proposed two main evaluation metrics. The first is the planning success rate, which measures whether the answers given by the model match the instruction; the feasibility of the chosen skills in the current environment is not considered here.
The second is the execution success rate, which measures whether the system is actually able to execute the plan and complete the task required by the instruction.
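Assuming each evaluation episode records two booleans, whether the generated plan matched the instruction and whether the robot actually completed the task, the two metrics reduce to simple success ratios. This is an illustrative sketch, not the paper's evaluation code.

```python
def success_rates(episodes):
    """episodes: list of (plan_correct: bool, executed_ok: bool) pairs."""
    n = len(episodes)
    plan_rate = sum(p for p, _ in episodes) / n   # planning success rate
    exec_rate = sum(e for _, e in episodes) / n   # execution success rate
    return plan_rate, exec_rate

# e.g., three episodes: two correct plans, one fully executed
print(success_rates([(True, True), (True, False), (False, False)]))
# -> (0.666..., 0.333...)
```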
Figure | Evaluation results (Source: arXiv)
The researchers evaluated the model on 101 tasks: in the simulated kitchen, the SayCan model achieved an 84 percent planning success rate and a 74 percent execution success rate. In an evaluation conducted in a real kitchen, the planning success rate dropped by 3 percentage points and the execution success rate by 14 percentage points compared with the simulated kitchen, to 81 percent and 60 percent, respectively.
Figure | SayCan performing other tasks (Source: arXiv)
Returning to the example above: faced with the user's request "I spilled my drink, can you help?", SayCan, unlike the other LLMs, responds with a concrete plan: "1. find a rag, 2. pick up the rag, 3. bring it to the user, 4. done." This grounding lets the robot help the user better than the other models.
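Tying this back to the saycan_plan sketch above, a toy end-to-end run with stub scoring functions (all hypothetical, contrived so the intended plan wins) could look like this:

```python
# Toy stubs so the sketch runs end to end (not real models).
skills = ["find a rag", "pick up the rag", "bring it to the user", "done"]

def llm_likelihood(instruction, history, l):
    order = {s: i for i, s in enumerate(skills)}
    # Prefer whichever skill comes next in the intended order.
    return 1.0 if order[l] == len(history) else 0.1

def affordance(state, l):
    return 1.0  # assume every skill is feasible in this toy state

def execute_skill(state, l):
    return state  # no-op stand-in for actual robot execution

print(saycan_plan("I spilled my drink, can you help?", skills, state=None,
                  llm_likelihood=llm_likelihood, affordance=affordance,
                  execute_skill=execute_skill))
# -> ['find a rag', 'pick up the rag', 'bring it to the user']
```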
Reference:
https://arxiv.org/abs/2204.01691