Meta AI's OPT-175B has 175 billion parameters, comparable to commercial language models such as OpenAI's GPT-3. Meta AI recently announced that it will open OPT-175B up fully, a sign that large-scale language models are becoming broadly accessible.

Over the past few years, large-scale language models, that is, natural language processing (NLP) systems with more than 100 billion parameters, have reshaped NLP and AI research as a whole. Trained on vast bodies of text, these models show surprising abilities in generating creative text, solving basic math problems, and answering reading comprehension questions.

Although the public can interact with some of these models through paid APIs, full research access remains limited to a handful of resource-rich laboratories. This restriction not only hinders researchers' efforts to understand how these large language models work, but also raises the barrier to participating in known lines of work such as improving model robustness and mitigating bias and toxicity.

In keeping with Meta AI's commitment to open science, we are sharing the Open Pretrained Transformer (OPT-175B), a language model with 175 billion parameters trained on publicly available datasets, in the hope that more of the community can participate in and understand this foundational technology.

This is also the first time that a language technology system of this scale has been released without reservation, with the pretrained models, training code, and usage code all made available to the public.

To preserve the model's integrity and prevent misuse, we are releasing it under a noncommercial license: OPT-175B is intended for research purposes only. Specifically, access will be granted to academic researchers; to people affiliated with government, civil society, and academic organizations; and to industrial research laboratories around the world.

We firmly believe that the whole AI community, spanning academic researchers, civil society, policymakers, and industry, must work together to build responsible AI. That principle should also guide large-scale language models and, in turn, constrain the many downstream applications built on top of them.

Members of the AI community need access to these models to conduct reproducible research and to push the field forward together. With the release of OPT-175B and its smaller-scale baselines, we hope to bring a greater diversity of perspectives to this ethical puzzle.

Releasing OPT-175B responsibly

Following the release guidelines that the Partnership on AI developed for researchers, together with the governance guidance NIST outlined in March 2022 (Section 3.4), we are publishing all of our notes and records from the development of OPT-175B, including a complete logbook detailing the day-to-day training process.

With these details in hand, other researchers can more easily build on our work and extend it in meaningful directions. The records also show the total compute used to train OPT-175B, and how much human effort was required to adjust course whenever the underlying infrastructure or the training process itself became unstable at scale.

Beyond the OPT-175B model itself, we are also releasing a codebase for training and deploying the model that runs on just 16 NVIDIA V100 GPUs, to improve the accessibility of these models. To support sound research practice, we also propose a set of common metrics for quantifying potential harms.

In addition, we are fully releasing a suite of smaller baseline models, trained on the same dataset and with the same settings as OPT-175B, so that researchers can study the effects of model scale in isolation.

These smaller models come in sizes of 125 million, 350 million, 1.3 billion, 2.7 billion, 6.7 billion, 13 billion, and 30 billion parameters; a 66-billion-parameter model will be released later.
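For readers who want to experiment with the smaller baselines, the sketch below loads the 125-million-parameter model and generates text with it. It assumes the checkpoints as later mirrored on the Hugging Face Hub under the facebook/opt-125m identifier; the release described here shipped through Meta's metaseq codebase, so treat this as one convenient way in rather than the official workflow.

```python
# A minimal sketch: generating text with the smallest OPT baseline.
# Assumes the checkpoints as mirrored on the Hugging Face Hub
# (facebook/opt-125m); the original release shipped via metaseq.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30,
                         do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same snippet works for the larger baselines by swapping in the corresponding model identifier, memory permitting.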

Computing responsibly

The latest advances in AI research consume enormous amounts of computing power. Although industry laboratories have begun reporting the carbon footprints of individual models, most reports leave out the compute spent during the experimental R&D stage, which in some cases consumes an order of magnitude more resources than training the final model.

In developing OPT-175B, we made energy efficiency a central consideration and ultimately completed training with only one-seventh of GPT-3's carbon footprint. We achieved this by combining Meta's open source Fully Sharded Data Parallel (FSDP) API with NVIDIA's tensor parallelism abstraction from Megatron-LM.
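As a rough illustration of the data-parallel half of that recipe, the sketch below wraps a stand-in Transformer in PyTorch's FullyShardedDataParallel (the version of Meta's FSDP later upstreamed into PyTorch; OPT itself was trained with the metaseq/fairscale implementation). The Megatron-LM tensor-parallel piece, which splits each layer's weight matrices across GPUs, is not shown, and the model dimensions here are placeholders.

```python
# A minimal sketch of sharded data parallelism, assuming the FSDP API
# upstreamed into PyTorch (torch>=1.11). OPT combined this idea with
# Megatron-LM tensor parallelism, which this sketch omits.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def build_sharded_model(device: torch.device) -> FSDP:
    # Stand-in network; OPT is a decoder-only Transformer, and these
    # dimensions are placeholders, not OPT's actual configuration.
    layer = torch.nn.TransformerEncoderLayer(d_model=1024, nhead=16)
    model = torch.nn.TransformerEncoder(layer, num_layers=24).to(device)
    # FSDP shards parameters, gradients, and optimizer state across all
    # ranks, gathering full parameters only around each forward/backward.
    return FSDP(model)

if __name__ == "__main__":
    # Launch with: torchrun --nproc_per_node=<num_gpus> this_script.py
    dist.init_process_group("nccl")
    local_rank = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)
    sharded = build_sharded_model(torch.device("cuda", local_rank))
```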

We sustained roughly 147 TFLOP/s per GPU on NVIDIA's 80 GB A100s, about 17% better than the figures NVIDIA researchers had published on equivalent hardware.
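To put that number in context, here is a back-of-the-envelope check (our arithmetic, not figures from the original post): the A100's dense FP16/BF16 peak is 312 TFLOP/s, so 147 TFLOP/s is roughly 47% hardware utilization, and the standard estimate of about 6 FLOPs per parameter per token converts throughput into GPU-hours for an assumed token budget.

```python
# Back-of-the-envelope utilization and cost estimate (our own arithmetic;
# the token budget below is an assumption, not a figure from the post).
PEAK_TFLOPS = 312        # NVIDIA A100 80 GB, dense FP16/BF16 peak
ACHIEVED_TFLOPS = 147    # sustained throughput per GPU reported above

print(f"hardware utilization: {ACHIEVED_TFLOPS / PEAK_TFLOPS:.0%}")  # ~47%

params = 175e9           # OPT-175B parameter count
tokens = 180e9           # assumed training-token budget, for illustration
total_flops = 6 * params * tokens          # ~6 FLOPs / parameter / token
gpu_hours = total_flops / (ACHIEVED_TFLOPS * 1e12) / 3600
print(f"estimated training cost: ~{gpu_hours:,.0f} GPU-hours")
```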

By sharing these 175B-scale training baselines through the codebase, we hope to help more researchers reduce their collective carbon footprint, and to establish a common yardstick for measuring the latest advances at the frontier of AI.

Advancing research and development through open collaboration

To advance AI research, the broader academic community needs to be able to work with cutting-edge models, probing their weaknesses while rapidly exploring their potential. As with our earlier open science programs, such as the Image Similarity Challenge, the Deepfake Detection Challenge, and the Hateful Memes Challenge, Meta AI believes that this kind of cross-organizational collaboration is what moves us, step by step, toward genuinely responsible AI development.

Although large language models have delivered a string of exciting results, their limitations and risks are still not well understood. Without direct access to these models, researchers cannot design practical strategies for detecting and mitigating their harms; those capabilities would instead remain concentrated among researchers with sufficient financial resources.

We hope that opening up OPT-175B will bring more perspectives to the frontier of large language model research, help the community design responsible release strategies, and ultimately bring unprecedented transparency and openness to the development of large-scale language models.

Click here to access the open source code and small pretrained models;

Click here to apply for access to the OPT-175B model;

Click here to read the original paper.

Each pretrained model is released under the OPT-175B license agreement.

Original link:

https://ai.facebook.com/blog/democratizing-access-to-large-scale-language-models-with-opt-175b/
