Domestic AI Computing Power Demand to Maintain Growth Momentum; AI Server Shipments Expected to Grow 38.4% in 2023
AI servers are data servers that deliver artificial intelligence (AI) services. They can support local applications and web pages, and can also provide complex AI models and services for cloud and on-premises deployments, supplying real-time computing for a wide range of AI applications. AI servers come in two main architectures: a hybrid architecture that stores data locally, and a cloud-based architecture that relies on remote storage technologies and hybrid cloud storage (a technology combining local and cloud storage) to hold data.
AI Server Shipments Expected to Increase by 38.4% in 2023
The AI industry chain typically consists of upstream data and computing power layers, midstream algorithm layers, and downstream application layers. Recently, the market has focused more on the upstream industry chain, particularly the computing power sector. Many new investment opportunities have emerged in AI hardware, as AI software applications rely on the computing power provided by hardware.
With ChatGPT acting as a sustained catalyst, domestic demand for AI computing power will maintain its growth momentum, benefiting computing power server manufacturers. By one estimate, supporting ChatGPT's total computing load would require 7 to 8 data centers, each with an investment of around 3 billion yuan and 500P of computing power. In the digital economy era, global data volumes and computing power will continue to grow rapidly.
Printed circuit boards (PCBs), known as the 'mother of electronic products,' are important components of servers. With the rapid development of the artificial intelligence (AI) industry, the market demand for high-value PCB products used in AI servers has significantly increased.
With demand for AI servers and AI chips surging in tandem, TrendForce expects AI server shipments (including servers equipped with GPUs, FPGAs, ASICs, and other main chips) to reach nearly 1.2 million units in 2023, a year-on-year increase of 38.4%, or nearly 9% of total server shipments; by 2026 that share is expected to rise to 15%. TrendForce has also revised its forecast compound annual growth rate (CAGR) for AI server shipments from 2022 to 2026 upward to 22%, and expects AI chip shipments to grow 46% in 2023.
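To make the arithmetic behind these figures concrete, the short Python sketch below back-solves the implied 2022 base from the 38.4% year-on-year growth and projects it forward at the cited 22% CAGR; the calculation is our illustration of the cited numbers, not TrendForce's own model.

```python
# Back-of-the-envelope check of the cited shipment figures:
# ~1.2M AI servers in 2023 (+38.4% YoY), 22% CAGR over 2022-2026.
s2023 = 1_200_000
s2022 = s2023 / 1.384        # implied 2022 base: ~867k units
s2026 = s2022 * 1.22 ** 4    # four years at 22% CAGR: ~1.92M units

# With 38.4% growth front-loaded in 2023, hitting the 2026 endpoint
# implies roughly 17% annual growth over 2024-2026:
residual = (s2026 / s2023) ** (1 / 3) - 1

print(f"implied 2022 base:       {s2022:,.0f}")
print(f"projected 2026 volume:   {s2026:,.0f}")
print(f"implied 2024-26 growth:  {residual:.1%}")
```

Read this way, the 22% CAGR and the 38.4% jump are consistent: growth is simply concentrated in 2023 and tapers thereafter.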
Compared with general-purpose servers, AI servers use multiple accelerator cards and adopt high-layer-count HDI PCBs, which significantly raises their value. The motherboards in AI servers also carry far more layers than those of general-purpose servers, making AI server PCBs 5 to 6 times more valuable than standard server PCBs.
At Computex 2023, NVIDIA founder and CEO Jensen Huang announced that the generative-AI engine DGX GH200 had entered mass production. Demonstrations revealed significant architectural changes in the new GH200 server relative to the DGX H100: the GH200 drops one UBB and one CPU motherboard while adding three NVLink module boards, alongside substantial performance gains in the accelerator cards. This points to higher per-unit PCB value, indicating that AI advances will continue to drive value growth in the PCB sector.
2023 AI Server Industry Chain Demand Outlook
PCBs serve as the foundation for server chips, handling data transmission and interconnecting the various components. Because AI servers carry upgraded chips, their PCBs must be improved in step, to process more signals, reduce interference, enhance heat dissipation, and improve power management, which requires further optimization of materials and processes. As noted above, the multi-accelerator design and high-layer-count HDI boards of AI servers make their PCBs 5 to 6 times more valuable than those of general-purpose servers. According to Prismark, the global server PCB market was worth $7.804 billion in 2021 and is projected to reach $13.294 billion by 2026, a compound annual growth rate (CAGR) of 11.2%, with per-server PCB value rising from $576 in 2021 to $705 in 2026. With large AI models and their applications now being deployed, demand for AI servers is climbing, signaling imminent market expansion.
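As a quick sanity check on these projections (our arithmetic, not Prismark's), the implied growth rates can be recovered directly from the endpoint values:

```python
# CAGR implied by Prismark's 2021 and 2026 server-PCB market values.
v2021, v2026, years = 7.804, 13.294, 5   # billions of USD, 5-year span
market_cagr = (v2026 / v2021) ** (1 / years) - 1
print(f"market CAGR:   {market_cagr:.1%}")   # ~11.2%, matching the cited rate

# The same check applied to per-server PCB value ($576 -> $705):
unit_cagr = (705 / 576) ** (1 / years) - 1
print(f"per-unit CAGR: {unit_cagr:.1%}")     # ~4.1% per year
```

The per-unit value grows far more slowly than the overall market, implying that most of the projected expansion comes from shipment volume and the richer PCB content of AI servers.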
The rise in AI server shipments is expected to directly benefit related chip, memory, and component suppliers.
On the chip side, TrendForce predicts a 46% increase in AI server chip shipments in 2023. NVIDIA GPUs dominate as the mainstream AI server chips with a market share of roughly 60%-70%, followed by ASICs developed in-house by cloud service providers, which account for over 20% of the market.
AI servers will also drive synchronized upgrades in server DRAM, SSDs, and HBM (High Bandwidth Memory). TrendForce is particularly optimistic about HBM's growth potential: with rising demand for high-end GPUs such as NVIDIA's A100 and H100 and AMD's MI200 and MI300, as well as Google's self-developed TPUs, HBM demand is projected to grow 58% year-on-year in 2023, with an estimated further 30% growth in 2024.
TrendForce notes that with more customers adopting HBM3 this year, SK Hynix, currently the sole supplier of new-generation HBM3 products, is expected to raise its overall HBM market share to 53%, while Samsung and Micron are projected to hold 38% and 9%, respectively, between late this year and early next year.
As global tech giants engage in an arms race over large models, with model parameters growing rapidly, architectures upgrading from single-modal to multi-modal, vertical industry models steadily proliferating, downstream applications emerging, and user bases expanding, demand for AI computing power is expected to stay on its growth trajectory.
The continuous emergence and performance upgrades of large models are driving strong demand for AI computing power. According to OpenAI's calculations, the computing power used in the largest AI training runs has grown exponentially since 2012, doubling every 3 to 4 months; from 2012 to 2018 it increased more than 300,000-fold, versus only about a 7-fold increase under Moore's Law over the same period. OpenAI's 2020 figures likewise show that a single training run of the 174.6-billion-parameter GPT-3 requires roughly 3,640 PFlop/s-days; that is, a machine sustaining one quadrillion floating-point operations per second would need 3,640 days to complete it.
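The 3,640 PFlop/s-day figure can be roughly reproduced with the common C ≈ 6ND rule of thumb for transformer training compute; note that the ~300-billion-token training set size comes from the GPT-3 paper rather than this article, so the sketch below is illustrative only:

```python
# Reproducing the ~3,640 PFlop/s-day estimate with the rule of thumb
# C ~ 6 * N * D (training compute ~ 6 * parameters * training tokens).
N = 174.6e9    # GPT-3 parameters, as cited above
D = 300e9      # training tokens (from the GPT-3 paper; an assumption here)

total_flop = 6 * N * D          # ~3.1e23 floating-point operations
pflop_s_day = 1e15 * 86_400     # one PFlop/s sustained for one day

print(f"total compute: {total_flop:.2e} FLOP")
print(f"= {total_flop / pflop_s_day:,.0f} PFlop/s-days")   # ~3,640
```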
In 2022, China's total computing power reached 180 EFLOPS, with total storage capacity exceeding 1,000 EB; one-way network latency between national hub nodes was reduced to within 20 milliseconds, and the core computing power industry reached a scale of 1.8 trillion yuan.

On April 19, the Shanghai Municipal Commission of Economy and Informatization issued the "Shanghai Guidelines for Promoting Unified Scheduling of Computing Resources." By the end of 2023, the plan aims to connect and schedule more than four computing infrastructure facilities, achieving schedulable intelligent computing power of over 1,000 PFLOPS (FP16). By 2025, the goal is cross-regional intelligent scheduling of computing power, using efficient scheduling to balance supply and demand and significantly advance industrial development: Shanghai's data center computing power is to exceed 18,000 PFLOPS (FP32), green computing power is to account for over 10% of newly built data centers, and the overall PUE of newly built large data centers in cluster areas is to fall below 1.25, with a green low-carbon rating of 4A or higher.
From an industry perspective, global tech giants are accelerating their computing power strategies: software companies are developing their own chips, while hardware makers are building computing platforms. On one hand, software and internet giants such as Microsoft, Amazon, Google, and Facebook are increasing investment in self-developed AI chips, and leading Chinese internet companies such as Alibaba, Tencent, and Baidu have likewise disclosed in-house AI chip plans. On the other hand, chip makers such as Intel and NVIDIA are beginning to build computing platforms centered on software and cloud services. At this year's GTC (NVIDIA's developer conference), NVIDIA officially launched its AI cloud service, DGX Cloud, giving enterprises immediate access to the infrastructure and software needed to train breakthrough applications such as generative AI. DGX Cloud provides dedicated NVIDIA DGX AI supercomputing clusters paired with NVIDIA AI software, allowing any enterprise to reach NVIDIA's AI supercomputers through a web browser, without the complexity of purchasing, deploying, and managing on-premises infrastructure. DGX Cloud subscriptions start at $37,000 per month.
As a crucial foundation of the digital economy, computing power is finding increasingly diverse application scenarios driven by new digital technologies, business models, and innovation, and demand keeps climbing as the overall computing power scale expands. According to the Ministry of Industry and Information Technology, China's operational data center racks exceeded 6.5 million standard units in 2022, and computing power scale has grown at an average annual rate of over 25% over the past five years.
As computing power is applied across various industries, different precision levels of computing need to 'adapt' to diverse application scenarios. Particularly with the rapid advancement of artificial intelligence technology, the structure of computing power is evolving, and the demand for intelligent computing is increasing daily.
From a policy perspective, China places high importance on the development of the AI industry, gradually solidifying the foundation for intelligent computing growth. In February 2022, four ministries jointly issued a notice approving the construction of national computing hub nodes in eight regions and planning ten national data center clusters. This marks the completion of the overall layout design for the national integrated data center system. With the full implementation of the 'East Data West Computing' project, the construction of intelligent computing centers has entered a new phase of accelerated development.
As the hub and carrier of data and applications, data centers form the foundation for AI development. In the long term, the demand for data centers is expected to recover. It is projected that the IDC market will reach 612.3 billion yuan by 2024, with a compound annual growth rate of 15.9% from 2022 to 2024, signaling a new upward cycle for data centers.