[By Tieliu, columnist for Technical and Industrial Power]
Loongson recently released its own independent instruction system, LoongArch. At a time when domestic CPU companies are rushing to bring in x86, ARM, Power, SPARC, RISC-V and other instruction sets, Loongson's launch of an independent instruction system architecture stands out.
In the past few years, domestic CPU companies have brought in CPU technology from Intel, AMD, IBM, ARM, VIA, Qualcomm and other companies, but have never been able to build a Wintel of their own. The reason is that they have always kept a follower's mentality and lacked the determination and perseverance to be independent. As the international environment has changed, and especially after the lessons taught by Trump and Biden, building an independent and controllable information technology system and industrial ecosystem has become a consensus. Loongson's release of an independent instruction system architecture is timely.
Loongson's independent instruction system is a brand new instruction set
The CPU instruction system is the interface between a computer's hardware and software, and is the specification of the binary encoding format for the software instructions that the CPU executes.
The instruction sets that have had some international influence include x86, MIPS, ARM, Power, Alpha, SPARC, RISC-V and others. All of these are imported. Only LoongArch and SW64 are truly independently developed in China. SW64 is the instruction set of the Shenwei (Sunway) CPU, developed independently by domestic institutions. The SW26010 chip in the Sunway TaihuLight supercomputer is designed on SW64. Loongson's earlier LoongISA was developed by adding instructions to the MIPS instruction set. This time, LoongArch has nothing to do with MIPS and is a completely independent new instruction set.
MIPS is the world's first commercial RISC instruction set. Because of its long history, the instruction system contains some outdated content that no longer suits current trends in software and hardware design. Loongson discards the widely criticized parts of traditional instruction systems and absorbs many of the advances made in instruction system design in recent years. For example, the immediate supported by a single instruction is widened from a maximum of 16 bits in MIPS to a maximum of 24 bits, the branch offset range is expanded from 64K to 1M bytes, and the address space changes from fixed segments to a single flat space. These changes reduce the number of instructions and memory accesses in compiled code and improve performance.
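To give a sense of what a wider immediate field means in practice, the sketch below computes the value range of a signed immediate of a given width. It assumes two's-complement sign extension and uses only the 16-bit and 24-bit widths quoted above. The helper name `signed_immediate_range` is illustrative, and the code does not model the actual MIPS or LoongArch instruction encodings.

```python
from typing import Tuple


def signed_immediate_range(bits: int) -> Tuple[int, int]:
    """Return (min, max) for a two's-complement immediate of `bits` bits."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    half = 1 << (bits - 1)  # magnitude of the most negative value
    return -half, half - 1


# Widths taken from the text: 16 bits (MIPS maximum) and 24 bits (LoongArch maximum).
for bits in (16, 24):
    low, high = signed_immediate_range(bits)
    print(f"{bits}-bit signed immediate: {low} .. {high}")
```

A constant that fits in the wider field can be encoded in one instruction instead of being assembled from several, which is one way a compiler ends up emitting fewer instructions.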
Because the LoongArch instruction design is better optimized, it even has a slight advantage over x86 in the number of instructions produced when source code is compiled into a target program. In the CoreMark test, the total number of instructions executed during the run was 83% of the MIPS figure, which is equivalent to roughly a 20% gain in execution efficiency. Across a more diverse set of tests taken together, LoongArch is on average 12% faster than MIPS, which indicates that the newly designed LoongArch is a success and can bring a significant performance improvement to the CPU.
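The step from "83% of the instructions" to "about 20% more efficient" can be checked with a short sketch. It assumes that cycles per instruction and clock frequency are unchanged, so that run time scales with the number of instructions executed. That is a simplification, and the function name is illustrative rather than taken from Loongson's material.

```python
def speedup_from_instruction_ratio(ratio: float) -> float:
    """Speedup when the executed instruction count shrinks to `ratio` of the baseline.

    Assumes identical cycles per instruction and clock frequency,
    so run time is proportional to instruction count.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / ratio


# 0.83 is the CoreMark instruction-count ratio quoted in the text.
gain = speedup_from_instruction_ratio(0.83) - 1.0
print(f"Efficiency gain: {gain:.1%}")  # prints "Efficiency gain: 20.5%"
```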
In addition, LoongArch was designed with ecosystem compatibility fully in mind and integrates the main functional features of the various mainstream international instruction systems. Besides running native LoongArch programs, it can also run Linux programs built for MIPS, x86, ARM, and RISC-V through translation. According to the official slides, efficiency when translating x86 can reach 80%.
In 2020, Loongson commissioned a well-known domestic third-party intellectual property evaluation agency to carry out an in-depth and meticulous intellectual property evaluation of LoongArch, comparing and analyzing it against Alpha, ARM, MIPS, Power, RISC-V, x86 and other major international instruction systems and tens of thousands of patents. In January 2021, the agency concluded that:
(1) LoongArch's instruction system design, instruction formats, instruction encoding, addressing modes and so on were designed independently.
(2) The LoongArch instruction system manual differs clearly from those of the major international instruction systems above in chapter structure, in the structure of its instruction descriptions, and in how the instruction content is expressed.
(3) No risk was found of the LoongArch base architecture infringing on the major international instruction systems above.
Two dimensions of an independent CPU: an independent instruction system, and independent completion of front-end and back-end design
For a long time, domestic CPUs produced through joint ventures with foreign investors, domestic CPUs built on licensed foreign technology, and domestic CPUs that simply repackage foreign CPUs and their ASICs have all claimed to qualify as independent, and have used the banner of independent research and development to ask the government for policy support and markets. Local governments, acting on local and short-term interests, have given these so-called domestic CPUs the green light and favorable policies. Yet despite the huge sums invested, the CPUs built on imported technology not only failed to bear fruit but in many cases went astray or failed outright. The root cause is that the definition of an independent CPU is vague, and that local governments spend money carelessly and import technology blindly.
Tieliu believes that an independent CPU must be based on an independent instruction system, and that the CPU design must be completed independently.
First, consider the independent instruction system. In the past, some domestic manufacturers claimed to have obtained an ARM v8 license, and some supporters of ARM CPUs therefore said that domestic ARM CPUs met the requirement of independence. Half a month ago, ARM released its next-generation architecture, ARM v9, describing it as its most important innovation in 10 years and the basis for the next 300 billion ARM chips. With ARM v9 released, where ARM CPU manufacturers that bought ARM v8 licenses go next is an open question.
It is true that reports saying some domestic manufacturers can continue to purchase ARM v9 instruction set licenses are flooding the Internet, but checking the source of the news shows that it is a foreign media report from 2019. Domestic media are recycling and stitching together reports from 2019. On the ARM v9 page of ARM's official website, the partners listed at the end of the article include Google, Nvidia, NXP, Fujitsu, Red Hat and other foreign companies, as well as TSMC, MediaTek, OPPO, VIVO, Xiaomi and other Chinese companies. However, the domestic manufacturers in question are not on the list of partners.
Taking a step back, even if they are lucky enough to buy an ARM v9 license this time, ARM will go on to release v10, v11, v12 in the future. Will domestic ARM CPU companies then have to keep buying v10, v11, and v12 licenses? If the buying never ends, where does the "independence" claimed for domestic ARM CPUs even begin?
Therefore, an independent CPU must be based on an independent instruction system. A CPU developed under ARM authorization has no solid foundation; it is a house built on the beach, and independence is out of reach.
Next is independent design. Independent design here means completing both the front-end design and the back-end design independently. Designing a chip through to the die is basically divided into two parts: the front end and the back end.
The front end is RTL design. Following the design specification, engineers write the design as Verilog code, then run functional verification with EDA tools, iterating on modifications until the tests pass. The back-end design is divided into two parts, logic design and physical design. Logic design takes the front-end Verilog files, uses a synthesis tool to generate a gate-level netlist, then uses EDA tools to run a logic equivalence check, iterating until it passes. Physical design takes the gate-level netlist and generates the physical layout with place-and-route software, then uses tools to perform physical verification on the layout, including RC extraction and post-layout verification, iterating until it passes. Once it passes, the GDSII file is generated and sent to the foundry, a step called tape-out.
At present, it is normal in the industry to purchase various IP from abroad and to outsource design. For example, the CPU cores and GPU cores of Huawei Kirin chips and Unigroup Tiger chips are basically purchased from foreign companies such as ARM and Imagination. Another example is Feiteng outsourcing its back-end design to Shixin.
The reason companies buy IP and outsource design is that their foundation is weak, their technology is not solid, and they are unwilling to raise their technical level step by step, wanting results as quickly as possible. Technical improvement has to be gradual. From one product generation to the next, generally no more than 25% of the front-end design source code is replaced, so a design must evolve generation by generation. Custom modules in back-end design are usually taped out on their own first to verify their function; only once no problems are found are they integrated into the chip as a module or IP, so that errors are unlikely. The back end therefore requires experienced people, and that experience is built up with money, tape-outs and time. To seize the dividends of national policy, some domestic CPU companies naturally choose to purchase IP or outsource design to get results as soon as possible.
Using purchased IP or outsourcing the design brings two problems. One is a weak technical foundation, whose most obvious symptom is a lack of staying power and insufficient development potential. For example, in improving CPU IPC, CPUs built on imported technology are clearly not as good as Loongson, so they can only rely on more advanced TSMC processes to improve performance. The other is that it inevitably brings huge political risk: once sanctions hit, the consequences are unimaginable. Recently, Feiteng was placed on the Entity List by the United States, and Feiteng's back-end design was outsourced to Shixin. Because Shixin is listed on the island of Taiwan, China, and Feiteng is Shixin's largest customer, accounting for about 39% of its revenue last year, Shixin promptly held an online briefing. According to Shixin's official statement, with Feiteng on the Entity List, Shixin's subsequent shipments to the company, for which it provides the final stage of design, will be hindered. TSMC has suspended orders for the leading-edge process chips that Feiteng commissioned Shixin to design and mass-produce, pending the outcome of a follow-up investigation.
Conclusion
An instruction system carries a software ecosystem, such as the Wintel system formed by the x86 instruction system and the Windows operating system, and the AA system formed by the ARM instruction system and the Android operating system. To connect with the x86 and ARM ecosystems, Zhaoxin gained the ability to sell x86 chips in the mainland Chinese market through a joint venture, while Huawei, Feiteng, and Huaxintong obtained ARM authorization by purchasing instruction set licenses. The facts have shown that the path taken by Huawei, Feiteng, and Huaxintong does not work. A tall building with an independent technical system cannot be built on the foundation of ARM. Foreign CPU manufacturers use instruction systems as a means of controlling the ecosystem, and anyone developing a compatible CPU must obtain their "authorization". Products can be developed with a licensed instruction system, but an independent industrial ecosystem cannot be formed that way. Chinese people can write novels in English, but a Chinese national culture cannot be built on English.
The instruction system is the starting point of the software ecosystem. Only by achieving independence at the root, in the instruction system, can we break the chains that leave the development of the software ecosystem under others' control. The launch of Loongson's independent instruction system is the result of long planning by Loongson, and is by no means a "mortgage open source" product of the kind some manufacturers use for crisis public relations. The 3A5000, designed on LoongArch, already exists as samples and will reach the market in 2021, and it takes a long cycle to go from defining a new instruction set to designing a CPU on it and then completing tape-out. From first adding instructions to MIPS, to developing LoongISA on the basis of MIPS, to the latest LoongArch, Loongson's purpose has been clear throughout: to do everything possible to gain the initiative and to follow the path of independence unswervingly.
It must be noted that Loongson, like Huawei and Feiteng, would find it hard to withstand a US ban at the moment.
It is true that the main reason the ARM chips of Huawei and Feiteng have gone out of production is the loss of TSMC as a tape-out channel, but both also rely on overseas technical input in design. The design of Huawei's Kunpeng chip owes a great deal to Huawei's research institute in the United States, while Feiteng outsources its back-end design to Shixin. This is very dangerous, because the United States can easily cut off the technical input. It can be said that Huawei and Feiteng are constrained in all three areas: ARM authorization, CPU design, and CPU manufacturing. Loongson has achieved an independent instruction system and independent CPU design, and needs neither technology from foreign research institutes nor overseas contractors like Shixin for back-end design, but it too is fragile when it comes to the tape-out channel. Because China still depends on others for semiconductor equipment, materials, EDA and more, the full industrial chain is not yet available domestically, and against the US Entity List the situation is exactly "100-1=0".
At the moment, the most harmful thing is the self-congratulation of paid online commenters, with their "how amazing" and "blood-boiling" styles of writing. Tieliu used to oppose this kind of overheated hype, and was accused of "not standing up" for doing so.
What is funny is that when Tieliu advocated building the entire industrial chain quietly and gradually, the same people who accused Tieliu of "not standing up" also accused Tieliu of being unrealistic and said that China should "integrate into the international mainstream."
Tieliu cannot help but ask: who is it that cannot stand up?
Some people must overcome their "slave" mentality. When evaluating domestic CPUs, they should look at whether the domestic CPU itself is good, rather than comparing whose "foreign father" is more advanced.
Practice has shown that "integrating into the international mainstream" is not a good idea. Trump and Biden have already made that clear through their actions. Loongson has now designed the 3A5000 on LoongArch, and the 3A5000, built on a 12nm process, surpasses the 7nm Kunpeng CPU in single-core performance, the most critical measure. This is a milestone: independent technology on a relatively backward process surpassing imported technology.
Now that Loongson has achieved independence in design, it is to be hoped that domestic raw material suppliers, equipment makers, chip manufacturers, and EDA vendors will rise to the task as well, and that software vendors will actively port and adapt application software, so that the entire chain of chip design, manufacturing, packaging, testing, raw materials and EDA, together with the OS, middleware, databases, and application software, is ultimately completed.