Look, Google's latest paper - Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models.

2025/05/02 04:06:34

Baijiao from Aofei Temple

qubit | Official account QbitAI

One AI paper, 442 authors.

It even has a chapter dedicated to author contributions. More than half of its 100 pages are references...

Wait, is this the kind of paper that is popular now?

This is Google's latest paper: Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models.

So the author list ended up looking like this...

Researchers from 132 institutions spent two years developing BIG-bench, a new benchmark for large language models.

On this basis, they evaluated OpenAI's GPT models, Google-internal dense transformer architectures, and others, with model sizes spanning six orders of magnitude.

The final results show that although model performance improves as scale increases, it is still far from human performance.

Jeff Dean retweeted and liked the work: Great Work.

A new benchmark for large language models

Let's take a look at what exactly the paper says.

As models scale up, their performance and quality improve to some extent. This may have transformative effects, but these capabilities have not been well characterized before.

Some existing benchmarks have limitations: their evaluation scope is relatively narrow, and performance scores saturate quickly.

On SuperGLUE, for example, models reached "above human-level" performance within 18 months of the benchmark's launch.

BIG-bench was created against this background.

It currently consists of 204 tasks, covering topics such as linguistics, child development, mathematics, common-sense reasoning, biology, physics, social bias, and software development.
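For a rough sense of what a single task in a benchmark like this could look like, here is a minimal sketch in Python. It assumes a task is just a list of prompt/answer pairs scored by exact string match. The names `Example`, `exact_match_score`, `ARITHMETIC_TASK` and `toy_model` are hypothetical and are not taken from the BIG-bench code.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One benchmark item: a prompt and the expected answer."""
    input: str
    target: str


def exact_match_score(examples, model_fn):
    """Fraction of examples where the model output equals the target exactly."""
    if not examples:
        return 0.0
    correct = sum(
        1 for ex in examples if model_fn(ex.input).strip() == ex.target
    )
    return correct / len(examples)


# Hypothetical toy task: simple arithmetic prompts (illustration only).
ARITHMETIC_TASK = [
    Example("1 + 1 = ", "2"),
    Example("2 + 3 = ", "5"),
    Example("7 + 8 = ", "15"),
    Example("10 + 5 = ", "15"),
]


def toy_model(prompt: str) -> str:
    """Stand-in for a language model: always answers '15'."""
    return "15"


score = exact_match_score(ARITHMETIC_TASK, toy_model)
print(score)  # 0.5, since two of the four targets are "15"
```

A real evaluation would replace `toy_model` with a call to an actual language model and would likely use more forgiving metrics than exact match.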

In addition, a team of human expert raters performed all the tasks to provide a baseline.

To make the benchmark accessible to more institutions, the researchers also released BIG-bench Lite, a small but representative subset of tasks that allows faster evaluation.

The team also open-sourced code implementing the benchmark API, which supports evaluating tasks on publicly available models and lightweight creation of new tasks.

The final evaluation results show that, across model scales spanning six orders of magnitude, overall performance on BIG-bench increases with model size and with the number of training samples.

Compared with the human baseline, however, the models still perform poorly.

Specifically, on some tasks model performance improves steadily as scale increases. On others, breakthroughs appear suddenly at a specific scale.
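One crude way to tell these two patterns apart is to ask how much of the total gain arrives in a single step between adjacent model sizes. The sketch below is only an illustration of that idea, not the metric used in the paper. The function name `largest_jump_share` and the score lists are made up for the example.

```python
def largest_jump_share(scores):
    """Share of the total improvement that comes from the single biggest step.

    `scores` is a list of task scores ordered from smallest to largest model.
    The result lies in [0, 1] when scores never decrease; values near 1 mean
    almost all of the gain arrived in one step. This is a crude indicator,
    not the metric used in the paper.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    total_gain = scores[-1] - scores[0]
    if total_gain <= 0:
        return 0.0
    steps = [b - a for a, b in zip(scores, scores[1:])]
    return max(steps) / total_gain


# Made-up scores at six increasing model sizes (illustration only).
steady = [10, 20, 30, 40, 50, 60]
sudden = [10, 10, 10, 10, 10, 60]

print(largest_jump_share(steady))  # 0.2
print(largest_jump_share(sudden))  # 1.0
```

A task that improves steadily spreads its gain evenly across steps, while a "breakthrough" task concentrates it in one.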

The benchmark can also be used to evaluate social biases present in the models.

The researchers also unexpectedly found that the models can pick up some hidden skills, for example how to make moves in chess that follow the rules.

Author contributions take up 14 pages

It is worth mentioning that, perhaps because there are so many authors, a chapter on author contributions was included at the end of the paper.

It runs to a sprawling 14 pages, covering core contributors, reviewers, task providers...

There are also 50 pages of references.

That's all. Interested readers can follow the links below to take a look at the paper.

Paper link:
https://arxiv.org/abs/2206.04615
GitHub link:
https://github.com/google/BIG-bench
Reference link:
https://twitter.com/jaschasd/status/1535055886913220608
