245.7 billion parameters! The world's largest AI model debuts: just how strong is Inspur's "Source 1.0"?

On September 28, the Inspur Artificial Intelligence Research Institute released its large-scale AI model "Source 1.0" in Beijing, by far the largest artificial intelligence model in the world. Its parameter count is reported to be 245.7 billion, and it was trained on a 5,000 GB Chinese corpus. Compared with the 175 billion parameters and 570 GB training set of the US GPT-3 model, Source 1.0 leads by roughly 40% in parameter scale and by nearly 10 times in training data, ranking first in the world.
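The two comparisons above can be checked with a quick back-of-the-envelope calculation (a sketch using only the figures quoted in this article; the variable names are illustrative):

```python
# Figures as quoted in the article (parameter counts in billions, corpus sizes in GB).
yuan_params_b = 245.7   # "Source 1.0" parameters
gpt3_params_b = 175.0   # GPT-3 parameters
yuan_data_gb = 5000     # "Source 1.0" Chinese training corpus
gpt3_data_gb = 570      # GPT-3 training corpus

param_ratio = yuan_params_b / gpt3_params_b  # ~1.40, i.e. about 40% more parameters
data_ratio = yuan_data_gb / gpt3_data_gb     # ~8.8, i.e. nearly 10x the training data
print(f"parameters: {param_ratio:.2f}x, data: {data_ratio:.1f}x")
```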

Algorithms, data, and computing power all reach ultra-large scale

First of all, in terms of algorithms, compared with the 175-billion-parameter English language model GPT-3, "Source 1.0" contains 245.7 billion parameters in total, about 1.4 times as many. Most importantly, "Source 1.0" is, like GPT-3, a single model rather than an ensemble of many smaller models. On this measure alone, "Source 1.0" ranks as the world's largest natural language understanding model.

Secondly, in terms of data, "Source 1.0" has read almost the entire contents of the Chinese Internet from the past five years. Using a self-developed text-classification model, the team distilled a 5 TB high-quality Chinese corpus, nearly 10 times the size of GPT-3's training set. In total, "Source 1.0" read about 200 billion words. What does that mean in practice? Suppose a person reads ten books a month, roughly a hundred books a year; over 50 years of reading, that comes to about 5,000 books in a lifetime. At 200,000 words per book, that adds up to only 1 billion words, so reading 200 billion words would take about 10,000 years. Backed by data on this scale, the "Source 1.0" corpus is naturally the world's largest high-quality Chinese dataset.
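The reading analogy works out as follows (a sketch based on the article's own round numbers):

```python
# A diligent reader: ~100 books a year for 50 years, 200,000 words per book.
words_per_book = 200_000
books_per_year = 100
reading_years = 50

lifetime_words = words_per_book * books_per_year * reading_years  # 1 billion words
corpus_words = 200 * 10**9  # ~200 billion words read by "Source 1.0"

# How long would a human need to read the whole corpus at that pace?
years_needed = corpus_words / lifetime_words * reading_years
print(f"{lifetime_words:,} words per lifetime; {years_needed:,.0f} years for the corpus")
```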

In addition, in terms of computing power, "Source 1.0" consumed a total of about 4,095 PD (PetaFlop/s-days) of training compute. Compared with the 3,640 PD that GPT-3 consumed to obtain 175 billion parameters, this represents a significant gain in computational efficiency. If "Source 1.0" were to "read" 24 hours a day, it would need only 16 days to get through almost the entire Chinese Internet of the past five years.
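One way to read the efficiency claim is parameters obtained per unit of training compute (a hedged sketch; "parameters per PetaFlop/s-day" is an illustrative metric of mine, not one the article defines):

```python
# Training compute in PetaFlop/s-days (PD) and parameter counts, as quoted above.
yuan_pd, yuan_params = 4095, 245.7e9
gpt3_pd, gpt3_params = 3640, 175.0e9

yuan_eff = yuan_params / yuan_pd  # parameters obtained per PD of compute
gpt3_eff = gpt3_params / gpt3_pd
efficiency_gain = yuan_eff / gpt3_eff  # ~1.25: ~25% more parameters per PD
print(f"{efficiency_gain:.2f}x parameters per PetaFlop/s-day")
```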

Not only the world's largest but among the world's strongest, setting a string of new records

CLUE is currently recognized as the most authoritative benchmark for Chinese language models, and "Source 1.0" tops both its zero-shot and few-shot learning leaderboards. On the zero-shot leaderboard, "Source 1.0" surpassed the previous industry best score by 18.3% and took first place in document classification, news classification, product classification, native Chinese inference, idiom cloze (fill-in-the-blank reading comprehension), and noun-pronoun coreference. In few-shot learning, it won four tasks: document classification, product classification, document summary identification, and noun-pronoun coreference. In the idiom cloze task, the performance of "Source 1.0" even surpassed the human score.

At the same time, in a "Turing test" of "Source 1.0", dialogues, novel continuations, news articles, poems, and couplets generated by the model were mixed with similar works created by humans and shown to a panel of judges to tell apart. The results show that the judges accurately distinguished human works from "Source 1.0" works less than 50% of the time.

At present, Inspur's "Source 1.0" large model is just the beginning: it provides fertile ground and a unified, powerful algorithmic foundation for generalizing across many application tasks. In the future, "Source 1.0" will help innovative companies and individual developers build higher-level intelligent scenarios on top of the large model, empower the intelligent upgrading of the real economy, and promote the high-quality development of the digital economy.