Large Models Drive Surge in AI Server Demand, Impact Spreads to Hardware Market

    baoshi.rao wrote:

    The hype around large models has gradually subsided after major players unveiled their products, but the high-frequency resonance they triggered in the industrial chain has now reached the computing power layer.

    The most intense manifestation is seen in the AI server market. The computing power demand driven by large models has directly sparked a wave of AI server purchases and price surges.

    According to Securities Times, a testing company revealed that the eight AI servers it purchased in June last year had risen to 1.3 million yuan per unit by March this year, and the price has since climbed to 1.6 million yuan per unit, a further increase of more than 20% in just a few months.
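    Based on the unit prices quoted above, that most recent jump can be checked with trivial arithmetic (Python; the two prices are those reported above, while the original June purchase price is not given in the report and so is left out):

    ```python
    # Unit prices for the testing company's AI servers, per the report quoted above.
    price_march = 1_300_000   # yuan per unit, March this year
    price_now = 1_600_000     # yuan per unit, current price

    increase = (price_now - price_march) / price_march
    print(f"Increase since March: {increase:.1%}")   # ~23.1%
    ```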

    Additionally, the surge in AI server demand has directly triggered a rush for the upstream material PPO (polyphenylene oxide, used as a reinforcement material for high-speed copper-clad laminates). Industry insiders acknowledge that, since there is only one major global PPO manufacturer, PPO is likely to become one of the bottlenecks in the supply chain as AI server production scales up.

    Against this backdrop, news of AI server manufacturers expanding production has also been rampant.

    For example, Foxconn's AI server subsidiary, Ingrasys Technology, was reportedly planning to add five to six production lines to meet the demands of AI server clients.

    The market fervor is evident, and it has directly ignited the capital markets.

    Since January, AI server-related stocks led by Inspur Information, InnoLight Technology, and Foxconn Industrial Internet have soared, with multiple limit-up rallies. Even long-term loss-making companies like Cambricon have seen their stock prices surge.

    The Booming 'AI Server'

    What is an AI Server?

    An AI server is a high-performance server specifically designed for compute-intensive tasks such as artificial intelligence (AI), machine learning (ML), and deep learning (DL).

    AI servers are typically equipped with high-performance central processing units (CPUs), graphics processing units (GPUs), tensor processing units (TPUs), or dedicated AI accelerators, along with ample memory and storage.

    In terms of heterogeneous configurations, AI servers can combine CPUs with GPUs, FPGAs, TPUs, ASICs, or multiple accelerator cards.

    The specific design and configuration can be adjusted based on the requirements of the parallel processing tasks at hand.

    Currently, the most widely used AI servers employ a CPU+GPU architecture, which distinguishes them from traditional servers.

    Traditional servers rely primarily on CPUs for computing power. CPU workloads, however, involve frequent branch jumps and interrupt handling, so CPU cores are built around complex control logic rather than raw parallel throughput, and they cannot keep up with the computational demands of the AI era.

    In contrast, AI servers utilizing GPU parallel computing feature thousands of cores per card, excelling at processing compute-intensive applications such as graphics rendering, computer vision, and machine learning.
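    To make the contrast concrete, here is a minimal sketch (assuming a machine with PyTorch installed and at least one CUDA-capable GPU; the exact speed-up depends entirely on the hardware) that times the same matrix multiplication on the CPU and on a GPU:

    ```python
    import time
    import torch

    def time_matmul(device: str, n: int = 4096) -> float:
        """Time one n x n matrix multiplication on the given device."""
        a = torch.randn(n, n, device=device)
        b = torch.randn(n, n, device=device)
        if device == "cuda":
            torch.cuda.synchronize()          # wait until setup work has finished
        start = time.perf_counter()
        _ = a @ b
        if device == "cuda":
            torch.cuda.synchronize()          # wait for the GPU kernel to complete
        return time.perf_counter() - start

    print(f"CPU: {time_matmul('cpu'):.3f} s")
    if torch.cuda.is_available():
        print(f"GPU: {time_matmul('cuda'):.3f} s")   # typically orders of magnitude faster
    ```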

    The AI servers purchased by the testing company mentioned earlier come with a basic configuration of eight NVIDIA A100 GPUs and 80GB of memory.

    AI servers are particularly useful for compute-intensive tasks in AI, ML, and DL. Their main functions include:

    Big Data Processing: AI servers can handle and analyze massive datasets, which is crucial for training AI and ML models.

    Parallel Computing: Since AI and ML algorithms require complex computations on large datasets, AI servers typically use hardware like GPUs that can process massive data in parallel (a minimal multi-GPU sketch follows this list).

    Storage and Memory: AI servers are equipped with substantial storage space and memory to store and process large volumes of data.

    Network Capability: AI servers require high-speed, low-latency network connections to quickly transfer large amounts of data.
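    As a rough illustration of the parallel-computing point above, the following sketch (assuming PyTorch on a multi-GPU machine such as the 8-GPU configuration mentioned earlier; the model and batch sizes are placeholders) replicates a small model across all visible GPUs and splits each batch among them:

    ```python
    import torch
    import torch.nn as nn

    # A toy model standing in for a real network; layer sizes are placeholders.
    model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10))

    if torch.cuda.device_count() > 1:
        # nn.DataParallel replicates the model on every visible GPU and
        # scatters each input batch across them (single-machine only).
        model = nn.DataParallel(model)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)

    batch = torch.randn(256, 1024, device=device)   # one mini-batch of dummy data
    output = model(batch)                           # forward pass is split across the GPUs
    print(output.shape)                             # torch.Size([256, 10])
    ```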

    This explains why the AI server market has seen a buying frenzy following the rise of large language models. These models contain an enormous number of parameters, and both training and inference require far more computing resources, which in turn requires higher-performance AI servers.
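    To see why, a commonly cited rule of thumb from the scaling-law literature puts the training compute of a dense transformer at roughly 6 × parameters × training tokens floating-point operations. The back-of-the-envelope estimate below uses purely illustrative numbers (the model size, token count, and per-GPU throughput are assumptions, not figures from this article):

    ```python
    # Back-of-the-envelope training compute estimate (illustrative assumptions only).
    params = 70e9                 # hypothetical model: 70 billion parameters
    tokens = 1e12                 # hypothetical training set: 1 trillion tokens
    flops = 6 * params * tokens   # ~6 * N * D rule of thumb for dense transformers

    # Assume ~150 TFLOP/s sustained per A100-class GPU (optimistic, mixed precision).
    gpu_flops_per_sec = 150e12
    gpu_days = flops / gpu_flops_per_sec / 86_400

    print(f"Total training compute: {flops:.2e} FLOPs")
    print(f"≈ {gpu_days:,.0f} GPU-days on a single A100-class card")
    # Even spread across an 8-GPU server this is years of wall-clock time,
    # which is why large training runs use many AI servers in parallel.
    ```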

    While the immediate cause of this surge in AI server demand is the advent of the large model era, the current explosion in AI server usage is actually related to advancements in both AI technology and big data.

    In summary, the popularity of AI servers can be attributed to several key factors:

    First, the rise of big data. Every corner of modern society, whether social media, e-commerce, or internet search, is generating massive amounts of data.

    These data require analysis and interpretation through complex algorithms to uncover useful patterns and information, and AI servers provide the necessary computational power to handle these tasks.

    Secondly, the widespread adoption of AI and ML has also fueled the demand for AI servers. AI and ML are now extensively applied across various industries, including healthcare, finance, retail, and transportation.

    Advancements in these fields require robust computing power to process and analyze data, as well as to train and run complex AI and ML models.

    Finally, the development of cloud computing and edge computing has also contributed to the surge in AI server popularity. Cloud computing enables enterprises and organizations to access powerful computing capabilities without purchasing and maintaining expensive hardware, while edge computing requires data processing and analysis on servers located close to where the data is generated.

    The Domestic AI Server Market Landscape

    The AI server market has been growing steadily over the past few years. Today, with the support of large models, the AI server market is expanding significantly.

    According to the latest data released by Beijing Research Intelligence, global AI server shipments reached 850,000 units in 2022, a year-on-year increase of approximately 11%. By mid-2023, AI server shipments approached 600,000 units, representing a year-on-year growth of about 39%.

    Looking ahead, with the development of large AI models for natural language processing, images, videos, and other applications, along with the continuous growth in computing power demand, the global AI server market size is expected to exceed $20 billion by the end of this year.

    By 2025, market shipments are projected to rise to approximately 1.9 million units, with an average annual growth rate of 41.2% during the 2022-2025 period.

    In terms of the specific industry chain, the upstream of the AI server industry chain includes core components such as CPUs, GPUs, memory, and hard drives, as well as software supplies like databases, operating systems, and basic management software. The downstream consists of application markets, including internet services, cloud computing, and data center providers.

    Currently, the market is dominated by major AI server manufacturers such as Huawei, Inspur, Lenovo, and Sugon, whose servers are widely used in AI and ML research and commercial applications.

    However, it is worth noting that Inspur Information recently released a semi-annual performance forecast indicating declines in both revenue and net profit.

    Inspur Information reported an 88%-99% year-on-year decline in non-GAAP net profit for the first half of 2023. The company attributed this to decreased revenue caused by global GPU and specialized chip supply shortages.

    Industry analysts note that, despite the AI server boom, Inspur's underperformance stems from the broader downturn in the traditional server market, as AI servers still account for only a small share of its business. The company has stated that its AI server business is growing as a share of revenue, and that the effect should become visible in its 2023 annual report.

    According to IDC's 2022 Q4 China Server Market Preliminary Report, Inspur maintains a leading 28.1% market share (down from 30.8% in 2021) across all server segments. This reflects how traditional CPU servers are being eclipsed by AI-driven heterogeneous servers.

    Globally, Microsoft led 2022 AI server procurement with nearly 20%, followed by Google (17%), Meta (15%), and AWS (14%). Domestically, ByteDance's AI server purchases surged to 6% market share amid China's accelerated AI infrastructure development.

    Challenges include rising energy consumption despite performance gains, and the need for continuous R&D to keep pace with AI/ML advancements. Future growth is expected from expanding AI/ML/DL applications and 5G/IoT-driven edge computing demands.

    Overall, although the market faces some challenges, the rapid development and widespread application of AI servers demonstrate that this is a vibrant and promising market.
