2025/07/09 02:28:36

Last week, Meta announced an AI-powered audio compression method called "EnCodec", which reportedly compresses audio to one-tenth the size of a 64 kbps MP3 with no loss in quality. Meta says the technology can greatly improve voice quality on low-bandwidth connections, such as phone calls in areas with spotty service. It also works for music.

Meta first introduced the technology on October 25 in a paper titled "High-fidelity neural audio compression", authored by Meta AI researchers Alexandre Défossez, Jade Copet, Gabriel Synnaeve and Yossi Adi. Meta also summarized the research in a blog post dedicated to EnCodec.

Meta describes its method as a three-part system trained to compress audio to a desired target size. First, the encoder converts the uncompressed data into a "latent space" representation at a lower frame rate. The "quantizer" then compresses that representation to the target size while keeping track of the most important information, which is later used to reconstruct the original signal. (This compressed signal is what gets sent over the network or saved to disk.) Finally, the decoder uses a neural network to convert the compressed data back into audio in real time on a single CPU.
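The encoder, quantizer and decoder stages can be pictured with a deliberately simple toy. This is only a sketch of the data flow: EnCodec's stages are trained neural networks, whereas the block averaging, uniform rounding and sample repetition below are stand-ins chosen for illustration, and every function name and parameter (`encode`, `quantize`, `decode`, `hop`, `levels`) is hypothetical.

```python
# Toy stand-in for the encoder -> quantizer -> decoder flow.
# Nothing here is learned; EnCodec itself uses trained neural networks.

def encode(samples, hop):
    """Average each block of `hop` samples: one latent value per frame,
    so the latent sequence runs at a lower frame rate than the input."""
    frames = [samples[i:i + hop] for i in range(0, len(samples), hop)]
    return [sum(frame) / len(frame) for frame in frames]

def quantize(latents, levels):
    """Map each latent in [-1, 1] to the index of the nearest of
    `levels` evenly spaced values. The indices are the compressed signal."""
    step = 2.0 / (levels - 1)
    return [min(levels - 1, max(0, round((x + 1.0) / step))) for x in latents]

def decode(codes, levels, hop):
    """Turn each index back into a value and repeat it `hop` times
    to return to the original sample rate."""
    step = 2.0 / (levels - 1)
    out = []
    for code in codes:
        out.extend([code * step - 1.0] * hop)
    return out

signal = [0.0, 0.1, 0.2, 0.3, 0.9, 0.8, 0.7, 0.6]
codes = quantize(encode(signal, hop=4), levels=16)   # what would be sent or stored
restored = decode(codes, levels=16, hop=4)           # lossy reconstruction
```

Eight samples become two small integers, and the reconstruction is only an approximation of the input. That is the trade-off a lossy codec makes; the learned version aims to spend its limited bits on what listeners actually notice.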

A block diagram illustrating how Meta's EnCodec compression works.

Meta's use of discriminators proves key to compressing the audio as much as possible without losing the elements of a signal that make it distinctive and recognizable. Meta explains:

"The key to lossy compression is to identify changes that cannot be perceived by humans, as perfect reconstruction is not possible at low bit rates. To do this, we use a discriminator to improve the perceived quality of the generated samples. This creates a cat and mouse game where the discriminator's job is to distinguish between real samples and reconstructed samples. The compression model attempts to generate samples to deceive the discriminator by pushing the reconstructed samples to be more perceptually similar to the original samples."
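The cat-and-mouse game in the quote can be made concrete with a small numeric sketch. The article does not say which loss Meta uses, so the hinge formulation below is an assumption (it is one common choice for adversarial training), and the function names and score values are hypothetical. A discriminator score above zero means "looks real"; below zero means "looks reconstructed".

```python
def discriminator_loss(real_scores, fake_scores):
    """Hinge loss for the discriminator (assumed formulation):
    low when real samples score >= 1 and reconstructions score <= -1."""
    real_term = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    fake_term = sum(max(0.0, 1.0 + s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term

def generator_loss(fake_scores):
    """Hinge loss for the compression model (assumed formulation):
    low when its reconstructions are scored as real (>= 1)."""
    return sum(max(0.0, 1.0 - s) for s in fake_scores) / len(fake_scores)

real = [2.0, 1.5]

# Early in training: reconstructions are easy to spot.
early_fake = [-2.0, -1.0]
early_d = discriminator_loss(real, early_fake)   # discriminator is winning
early_g = generator_loss(early_fake)             # compression model is losing

# Later: reconstructions fool the discriminator.
late_fake = [1.0, 2.0]
late_d = discriminator_loss(real, late_fake)
late_g = generator_loss(late_fake)
```

Each side's loss falls only when the other's rises, which is what pushes the reconstructed audio toward being perceptually indistinguishable from the original.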

It is worth noting that using neural networks for audio compression and decompression is far from new, especially for voice, but Meta's researchers claim they are the first group to apply the technology to 48 kHz stereo audio (slightly better than CD's 44.1 kHz sampling rate), which is typical of music files distributed on the internet.
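To put the numbers in perspective, the arithmetic below compares uncompressed 48 kHz stereo audio, a 64 kbps MP3, and a stream one-tenth that size. The 16-bit sample depth and the three-minute track length are assumptions made for illustration; the article gives neither, and the helper name `size_kilobytes` is hypothetical.

```python
SAMPLE_RATE_HZ = 48_000
CHANNELS = 2
BIT_DEPTH = 16        # assumption: the article does not state the sample depth
TRACK_SECONDS = 180   # assumption: a three-minute track

# Bit rate of the uncompressed signal, in kilobits per second.
raw_kbps = SAMPLE_RATE_HZ * CHANNELS * BIT_DEPTH / 1000

mp3_kbps = 64.0
target_kbps = mp3_kbps / 10   # "10 times smaller" than the 64 kbps MP3

def size_kilobytes(kbps, seconds):
    """File size in kilobytes (1 kB = 1000 bytes) for a constant bit rate."""
    return kbps * 1000 * seconds / 8 / 1000

raw_kb = size_kilobytes(raw_kbps, TRACK_SECONDS)
mp3_kb = size_kilobytes(mp3_kbps, TRACK_SECONDS)
target_kb = size_kilobytes(target_kbps, TRACK_SECONDS)

print(f"raw:    {raw_kbps:7.1f} kbps, {raw_kb:8.1f} kB")
print(f"MP3:    {mp3_kbps:7.1f} kbps, {mp3_kb:8.1f} kB")
print(f"target: {target_kbps:7.1f} kbps, {target_kb:8.1f} kB")
```

Under these assumptions the raw signal runs at 1,536 kbps, so a stream at roughly 6.4 kbps would be about 240 times smaller than the uncompressed audio.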

As for applications, Meta says this AI-powered "hypercompression of audio" could support "faster, better-quality calls" under poor network conditions. Of course, this being Meta, the researchers also mention EnCodec's implications for the metaverse, saying the technology could eventually deliver "rich metaverse experiences without requiring major bandwidth improvements."

Beyond that, the technique may one day give us smaller music files. Meta's new technology is still in the research stage, but it points to a future where high-quality audio uses less bandwidth, which is good news for mobile broadband providers whose networks are overburdened by streaming.
