AI Server Industry Analysis: Training Products in Short Supply
After the explosive popularity of ChatGPT, major tech companies have intensified their focus on AI large models, causing a shortage of general-purpose computing chips.
In the era of large models, the AI server market has surged, with training products in high demand.
The latest report from International Data Corporation (IDC), titled China Semi-Annual Accelerated Computing Market (First Half of 2023) Tracker, reveals that the accelerated server market reached $3.1 billion in the first half of 2023, up 54% year over year. GPU servers remain dominant, accounting for 88% of the market ($3 billion). Meanwhile, non-GPU accelerated servers, such as NPU-, ASIC-, and FPGA-based systems, grew 17% year over year and captured 8% of the market ($200 million).
By vendor revenue, the report shows that Inspur, H3C, and Ningchang ranked in the top three in the first half of 2023, together holding over 70% of the market. By server shipments, Inspur, Kunqian, and Ningchang took the top three spots, accounting for nearly 60% of the market. By industry, the internet sector remains the largest buyer, representing over half of the total accelerated server market, while the finance, telecommunications, and government sectors each grew by more than 100%.
Demand for AI servers has skyrocketed. Analysts note that as the AI era accelerates, massive data volumes are driving enormous computing power requirements and steadily rising demand for AI servers. This has in turn spurred growth in the interface chips that handle data transmission within and between servers.
ZTE: Latest AI Server Supporting Large Model Training to Be Released This Year
ZTE stated on an interactive platform that, as a leading domestic server manufacturer, the company has actively responded to the demands of the AI field by promptly launching server products tailored for various AI application scenarios. For example: (1) To meet the training and inference needs of small and medium-sized models, the company introduced the G5 series servers in January this year, including heterogeneous computing servers for intelligent computing with 10-20 built-in heterogeneous computing acceleration engines. (2) For large model training and inference, the company will focus on next-generation intelligent computing center infrastructure products, supporting large model training and inference, including high-performance AI servers and DPUs. Among these, the latest AI server supporting large model training will be released within the year.
An AI server is a data server that provides artificial intelligence (AI) services. It can support local applications and web pages, as well as deliver complex AI models and services to cloud and on-premises environments, providing real-time computing for a wide range of AI applications.
AI Server Industry Analysis
AI servers primarily adopt two architectures: a local architecture that stores data on the server itself, and a cloud-based architecture that relies on remote storage technologies and hybrid cloud storage (a combination of local and cloud storage).
The cost of an AI server comes mainly from chips such as CPUs and GPUs, which account for over 50% of the total. A server comprises power supplies, CPUs, memory, hard drives, fans, optical drives, and other components; depending on the configuration, chip costs (CPU, GPU, etc.) make up anywhere from 25% to 70% of the total.
AI servers use heterogeneous configurations, which can be categorized into combinations like CPU+GPU, CPU+FPGA, and CPU+ASIC. Currently, GPUs remain the preferred choice for data center acceleration, while the use of other non-GPU chips is gradually increasing. IDC predicts that non-GPU chips will account for over 20% of the market by 2025. Generally, ASICs offer the best performance but lack programmability and flexibility. For training or general purposes, GPUs are the better choice.
China's AI server capabilities rank among the world's best. AI servers built on CPU-plus-accelerator-chip architectures offer efficiency advantages in model training and inference, and unlike the AI chip market, which foreign manufacturers largely monopolize, China's AI server industry is globally competitive.
According to IDC data, in 2022, Inspur led China's AI server market (by sales) with a 46.6% share, followed by H3C and Ningchang at 11% and 9%, respectively.
Taking the Inspur NF5688M6 AI server as an example: it is listed at approximately 1.05 million RMB on JD.com and contains 2 Intel Ice Lake processors (about 53,000 RMB each, per cnBeta) and 8 NVIDIA A800 GPUs (about 104,000 RMB each, per ZOL). On those figures, CPUs and GPUs account for roughly 10.10% and 79.24% of the server's value, respectively.
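The value proportions above can be sanity-checked with a quick calculation. The prices below are the article's approximate figures (JD.com list price, cnBeta CPU price, ZOL GPU price), not official quotes:

```python
# Rough check of the CPU/GPU value share in the Inspur NF5688M6 example.
server_price = 1_050_000            # RMB, approx. JD.com listing
cpu_price, cpu_count = 53_000, 2    # Intel Ice Lake, per cnBeta
gpu_price, gpu_count = 104_000, 8   # NVIDIA A800, per ZOL

cpu_share = cpu_count * cpu_price / server_price
gpu_share = gpu_count * gpu_price / server_price
print(f"CPU share: {cpu_share:.2%}")  # CPU share: 10.10%
print(f"GPU share: {gpu_share:.2%}")  # GPU share: 79.24%
```

Both results match the article's stated proportions, with the remaining ~10.7% of the price covering memory, storage, networking, chassis, and vendor margin.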
China's AI server industry chain consists of upstream core components and software supply, midstream AI server manufacturers, and downstream application markets including internet companies, cloud computing enterprises, data center service providers, government departments, financial institutions, healthcare sectors, and telecom operators.
According to Du Yunlong, an AI infrastructure analyst at IDC China, domestic chip manufacturers still lag behind international leaders in technical capability, and their software ecosystems remain relatively immature. However, the supply-demand relationship in the chip sector is gradually shifting: many enterprises are moving from international procurement to local procurement or in-house development for in-house use, creating favorable conditions for the growth of China's chip industry.