Google Co-releases SayCan Model to Help Robots Give Sensible Answers

What makes large language models (LLMs) so capable is the sheer volume of information these models absorb from the large corpora of text scraped from the web during training.
Given that power of language understanding, does it follow that a robot driven directly by an LLM could communicate with humans and carry out tasks just as well?
The answer is no, because an LLM is not grounded in the physical world: it neither observes nor acts on its physical surroundings. As a result, some of the answers an LLM gives are incompatible with, or impractical in, the environment at hand.
Figure | Different responses given by several large language models and by the new SayCan model (right) to the same user request (Source: arXiv)
For example, in the figure above, a human asks a kitchen robot that can only perform basic operations such as "pick up a kitchen utensil" and "move to a location": "I spilled my drink, can you help?"
In response, three well-known large language models gave answers that did not fit the scenario: GPT-3 responded with "You need a vacuum cleaner," LaMDA responded with "Can I help you find a vacuum cleaner?" and FLAN replied, "Sorry, I didn't mean to spill my drink."
As these replies show, none of the LLMs could give the robot a directly usable response, because their answers were not grounded in the surrounding environment.
To make robots and other language-driven systems more attuned to their physical surroundings, and thus more effective at helping humans, Google's robotics team, together with Everyday Robots, has developed a new language processing model called SayCan.
The model is trained not only to understand language commands and propose answers, but also to evaluate how likely each proposed action is to succeed in the current physical environment, so that the robot can "do what it says."
The accompanying paper, "Do As I Can, Not As I Say: Grounding Language in Robotic Affordances," was recently published on arXiv.
In brief, the SayCan model grounds the output of a large language model in the physical environment of the task at hand, and it consists of two main components.
First, the Say component, a large language model, interprets the meaning of the instruction and proposes answers that could help solve the problem.
Then, the Can component evaluates those answers against the robot's affordances, i.e., the skills actually available to it, to determine which behaviors are feasible to perform in the current physical environment.
Here, the researchers use reinforcement learning (RL) to train language-conditioned value functions that estimate the feasibility of each behavior in the current environment.
Specifically, the SayCan model abstracts the problem as follows: the system receives a natural language instruction from the user that specifies the task the robot should perform; this instruction can be long, abstract, or even ambiguous.
The system also has a predefined set of skills Π that the robot possesses, where each skill π ∈ Π is a short, decomposed task, such as picking up a particular object. Each skill has its own short language description lπ, e.g., "find a knife and fork," and its own affordance function p(cπ | s, lπ), which represents the probability of successfully completing the skill described by lπ from state s.
In layman's terms, the affordance function p(cπ | s, lπ) is the probability that the skill π described by lπ completes successfully in state s, where cπ is a Bernoulli random variable indicating completion. In RL terms, p(cπ | s, lπ) is also the value function of the skill when the reward is 1 for successful completion and 0 otherwise.
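To make that 0/1 reward concrete, here is a minimal Python sketch that estimates an affordance value as a running average of completion outcomes. The TabularAffordance class and its discretized-state lookup are hypothetical stand-ins for the language-conditioned value functions the paper trains with RL and function approximation.

```python
from collections import defaultdict

class TabularAffordance:
    """Toy stand-in for a learned affordance/value function.

    Estimates p(c_pi | s, l_pi) as the running average of the 0/1
    completion reward observed when the skill described by l_pi is
    attempted from (a discretized) state s. The real SayCan system
    learns this with RL and function approximation, not a table.
    """

    def __init__(self):
        self.successes = defaultdict(int)
        self.attempts = defaultdict(int)

    def update(self, state, skill_description, completed: bool):
        # Reward is 1 on successful completion, 0 otherwise.
        key = (state, skill_description)
        self.attempts[key] += 1
        self.successes[key] += int(completed)

    def value(self, state, skill_description) -> float:
        key = (state, skill_description)
        if self.attempts[key] == 0:
            return 0.0  # unseen (state, skill) pairs treated as infeasible
        return self.successes[key] / self.attempts[key]
```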
The algorithm and overall idea of the SayCan model are shown below.
Figure | Algorithm of SayCan model (Source: arXiv)
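In code, the core of this algorithm might look like the minimal sketch below: each skill description is scored by the product of the language model's likelihood that it is a useful next step and the skill's affordance value; the best-scoring skill is executed and the loop repeats. Here llm_likelihood, affordance, and execute_skill are hypothetical placeholders for the paper's LLM scoring, learned value functions, and robot control.

```python
def saycan_plan(instruction, skills, state, llm_likelihood, affordance,
                execute_skill, max_steps=10):
    """Minimal sketch of SayCan's planning loop (not the authors' code).

    At each step, every skill description l is scored by
        llm_likelihood(instruction, history, l) * affordance(state, l)
    i.e., "is this a useful next step?" times "can it succeed here?".
    The best-scoring skill is executed and appended to the history,
    and the loop repeats until the special "done" skill wins.
    """
    history = []
    for _ in range(max_steps):
        scores = {
            l: llm_likelihood(instruction, history, l) * affordance(state, l)
            for l in skills
        }
        best = max(scores, key=scores.get)
        if best == "done":
            break
        history.append(best)
        state = execute_skill(state, best)  # robot acts; state changes
    return history
```

Multiplying the two scores is what keeps the plan both relevant ("should I do this?") and feasible ("can I do this here?").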
To validate SayCan's performance, the researchers proposed two main evaluation metrics. The first is the planning success rate, which measures whether the answers given by the model match the instruction; the feasibility of the chosen skills in the current environment is not considered here.
The second is the execution success rate, which measures whether the system is actually able to execute the plan and complete the task required by the instruction.
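Assuming each evaluation episode records two booleans, whether the generated plan matched the instruction and whether the robot actually completed the task, the two metrics reduce to simple success ratios. This is an illustrative sketch, not the paper's evaluation code.

```python
def success_rates(episodes):
    """episodes: list of (plan_correct: bool, executed_ok: bool) pairs."""
    n = len(episodes)
    plan_rate = sum(p for p, _ in episodes) / n   # planning success rate
    exec_rate = sum(e for _, e in episodes) / n   # execution success rate
    return plan_rate, exec_rate

# e.g., three episodes: two correct plans, one fully executed
print(success_rates([(True, True), (True, False), (False, False)]))
# -> (0.666..., 0.333...)
```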
Figure | Evaluation results (Source: arXiv)
The researchers evaluated the model on 101 tasks: in the simulated kitchen, the SayCan model achieved an 84 percent planning success rate and a 74 percent execution success rate. In an evaluation conducted in a real kitchen, the planning success rate dropped by 3 percentage points and the execution success rate by 14 percentage points compared with the simulated kitchen, to 81 percent and 60 percent, respectively.
Figure | SayCan performing other tasks (Source: arXiv)
Returning to the example above: faced with the user's request "I spilled my drink, can you help?", SayCan, unlike the other LLMs, responds with a concrete plan: "1. find a rag, 2. pick up the rag, 3. bring it to the user, 4. done." This grounding lets the robot help the user better than the other models.
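Tying this back to the saycan_plan sketch above, a toy end-to-end run with stub scoring functions (all hypothetical, contrived so the intended plan wins) could look like this:

```python
# Toy stubs so the sketch runs end to end (not real models).
skills = ["find a rag", "pick up the rag", "bring it to the user", "done"]

def llm_likelihood(instruction, history, l):
    order = {s: i for i, s in enumerate(skills)}
    # Prefer whichever skill comes next in the intended order.
    return 1.0 if order[l] == len(history) else 0.1

def affordance(state, l):
    return 1.0  # assume every skill is feasible in this toy state

def execute_skill(state, l):
    return state  # no-op stand-in for actual robot execution

print(saycan_plan("I spilled my drink, can you help?", skills, state=None,
                  llm_likelihood=llm_likelihood, affordance=affordance,
                  execute_skill=execute_skill))
# -> ['find a rag', 'pick up the rag', 'bring it to the user']
```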
Reference:
https://arxiv.org/abs/2204.01691