2024 AI Chips: Intensified Competition to Reshape the AI Chip Market

As a leader in the AI field, NVIDIA boasts strong technical capabilities and a wide range of applications. Over the past year, the surge in demand for generative AI has presented it with significant growth opportunities.

According to Wells Fargo's statistics, NVIDIA currently holds a 98% market share in the data center AI market, while AMD has only 1.2% and Intel less than 1%.

Due to the high cost of NVIDIA's AI chips and supply shortages, some customers are seeking alternative products. Amidst fierce competition, NVIDIA continues to drive product development and accelerate its update cycle. Recently, ServeTheHome disclosed NVIDIA's data center product roadmap, showcasing the company's plans for the AI market with upcoming GPUs such as the H200, B100, and X100.

NVIDIA is planning to diversify its product offerings for the data center market, introducing multiple products tailored for AI computing and HPC. This strategy aims to allow different customers to purchase products that best fit their needs, thereby simplifying the chip procurement process. From the architectural diagrams, it's evident that NVIDIA will separate its product lines based on Arm and x86 architectures in the future.

H200: Supply to Begin in Q2 2024

On November 13, 2023, NVIDIA announced the launch of the NVIDIA HGX H200, bringing powerful capabilities to the world's leading AI computing platform. Based on the NVIDIA Hopper architecture, this platform is equipped with the NVIDIA H200 Tensor Core GPU and advanced memory, capable of handling massive data for generative AI and high-performance computing workloads. The H200 will begin shipping to global system manufacturers and cloud service providers in the second quarter of 2024.

The NVIDIA H200 is the first GPU to feature HBM3e, which offers faster and larger memory to accelerate generative AI and large language models while advancing scientific computing for HPC workloads. With HBM3e, the NVIDIA H200 delivers 141GB of memory at 4.8 TB per second, nearly doubling the capacity and increasing bandwidth by 2.4 times compared to its predecessor, the NVIDIA A100.
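
As a quick sanity check on those ratios, here is a minimal sketch in Python; the A100 baseline figures (80GB at roughly 2.0TB/s) are assumptions taken from its public spec sheet rather than from this article:

```python
# Rough verification of the H200-vs-A100 memory comparison.
# Assumed A100 baseline: 80 GB capacity, ~2.0 TB/s bandwidth.
h200_gb, h200_tbs = 141, 4.8
a100_gb, a100_tbs = 80, 2.0

print(f"capacity:  {h200_gb / a100_gb:.2f}x")    # ~1.76x, i.e. "nearly double"
print(f"bandwidth: {h200_tbs / a100_tbs:.2f}x")  # 2.40x
```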

NVIDIA stated that the H200 can be deployed in various types of data centers, including on-premises, cloud, hybrid cloud, and edge environments.

L40S: Launching in Fall 2023

The L40S, launched in 2023, is one of NVIDIA's most powerful GPUs. It is designed to handle next-generation data center workloads such as generative AI, large language model (LLM) inference and training, 3D graphics rendering, and scientific simulation.

Compared to GPUs such as the A100 and H100, the L40S delivers up to a 5x improvement in inference performance and 2x faster real-time ray tracing (RT) performance. On the memory side, it comes equipped with 48GB of GDDR6 memory with ECC support, which is crucial for maintaining data integrity in high-performance computing environments. It also packs over 18,000 CUDA cores, the parallel processors that handle complex computational tasks. Where the two lines differ is in media processing: the L40S emphasizes both video encoding and decoding, whereas the H100 supports only decoding. Although the H100 is faster, it also comes with a higher price tag.

GH200/GH200NVL: Production to Begin in Q2 2024

In August 2023, NVIDIA announced the next-generation GH200 Grace Hopper Superchip, with production scheduled to begin in the second quarter of 2024. The GH200 combines a Hopper-architecture GPU with the Arm-architecture Grace CPU, connected via NVLink-C2C. Each Grace Hopper Superchip contains 624GB of memory: 144GB of HBM3e and 480GB of LPDDR5X.

GH200 and GH200NVL will pair Arm-based Grace CPUs with Hopper GPUs to address the training and inference challenges of large language models. The GH200NVL employs NVLink technology, offering superior data transfer speeds.

Additionally, the "B" series GPUs are expected to launch in the second half of 2024, replacing the current ninth-generation Hopper GPUs.

B100, B40, GB200, GB200NVL: Also Launching in 2024

NVIDIA plans to introduce the x86-based B100 to succeed the H200, and the Arm-based inference chip GB200 to replace the GH200. It has also planned the B40 to replace the L40S, offering better AI inference solutions for enterprise customers.

According to information released by NVIDIA, the company plans to launch the Blackwell architecture in 2024. The B100 GPU, based on this architecture, is expected to significantly enhance processing power: preliminary evaluations indicate a performance improvement of over 100% compared to the current Hopper-based H200 series. These gains are particularly evident in AI-related tasks, as demonstrated by the B100's showing on the GPT-3 175B inference benchmark.

X100: Planned for Release in 2025

NVIDIA has disclosed plans for the X100 chip, scheduled for release in 2025, which will expand the product range with an X40 for enterprise use and a GX200 that combines CPU and GPU functionality in a Superchip configuration. Similarly, the GB200 is expected to follow the B100, carrying the Superchip concept forward.

NVIDIA's product roadmap makes clear that the AI chip market is set to undergo another major transformation in the next one to two years. In this domain, where NVIDIA holds an absolute advantage, AMD is one of the few companies capable of producing high-end GPUs for training and deploying AI, and the industry views it as a reliable alternative for generative AI and large-scale AI systems. One of AMD's strategies for competing with NVIDIA is its powerful MI300 series of accelerator chips. AMD is now directly challenging the H100's dominance with more advanced GPUs and innovative CPU+GPU platforms.

AMD's newly released MI300 series consists of two major lines: the MI300X, a large GPU with leading memory bandwidth for generative AI and strong training and inference performance for large language models; and the MI300A, which integrates a CPU and GPU based on the latest CDNA 3 architecture and Zen 4 CPU cores, delivering breakthrough performance for HPC and AI workloads. Undoubtedly, the MI300 is not just a new generation of AI accelerators but also represents AMD's vision for the next generation of high-performance computing.

The MI300X accelerator launched in 2023. It features up to 8 XCDs (accelerator compute dies), 304 compute units, and eight HBM3 stacks, for a maximum memory capacity of 192GB, 2.4 times that of NVIDIA's H100 (80GB). It delivers exceptional HBM memory bandwidth of 5.3TB/s and Infinity Fabric bus bandwidth of 896GB/s. The advantage of such substantial onboard memory is that fewer GPUs are needed to run large in-memory models, saving the power consumption and hardware costs of multi-GPU configurations.
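
To make the "fewer GPUs" point concrete, the sketch below estimates how many GPUs are needed just to hold a model's weights in memory. This is a deliberate simplification: it assumes FP16 weights and ignores activations, KV cache, and parallelism overhead, and the 80GB H100 baseline is taken from its public spec rather than from this article:

```python
import math

def gpus_needed(params_billions: float, mem_per_gpu_gb: int, bytes_per_param: int = 2) -> int:
    """Minimum GPUs whose combined memory can hold the model weights (FP16 by default)."""
    weights_gb = params_billions * bytes_per_param  # 1B params at 2 bytes each = 2 GB
    return math.ceil(weights_gb / mem_per_gpu_gb)

# A 175B-parameter, GPT-3-scale model needs ~350 GB of FP16 weights:
print(gpus_needed(175, 80))   # H100 (80 GB)    -> 5 GPUs
print(gpus_needed(175, 192))  # MI300X (192 GB) -> 2 GPUs
```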

In December 2023, alongside the launch of its flagship MI300X accelerator, AMD announced that the Instinct MI300A APU had entered mass production. With deliveries estimated to begin in 2024, it is expected to become the world's fastest HPC solution upon release.

MI300A: Estimated to Begin Deliveries in 2024

The MI300A is the world's first data center APU for HPC and AI, combining CDNA 3 GPU cores, the latest AMD "Zen 4" x86-based CPU cores, and 128GB of HBM3 memory. Through 3D packaging and the fourth-generation AMD Infinity architecture, it delivers the performance required for HPC and AI workloads. Compared to the previous-generation AMD Instinct MI250X, it offers 1.9 times the FP32 performance per watt for HPC and AI workloads.

Energy efficiency is crucial in HPC and AI fields, as these applications are filled with data- and resource-intensive workloads. The MI300A APU integrates CPU and GPU cores in a single package, providing an efficient platform while delivering the training performance needed for the latest AI models. AMD's internal innovation goal for energy efficiency is "30×25," aiming to improve the energy efficiency of server processors and AI accelerators by 30 times from 2020 to 2025.
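
For a sense of how aggressive that target is, a 30x gain over the five years from 2020 to 2025 implies roughly doubling efficiency every year. A back-of-the-envelope check, assuming a constant annual rate of improvement:

```python
# "30x25": 30x efficiency improvement over the five years 2020 -> 2025.
annual_gain = 30 ** (1 / 5)
print(f"required annual efficiency gain: {annual_gain:.2f}x")  # ~1.97x, nearly 2x per year
```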

Globally, the AI boom that began in 2023 will remain an industry focus in 2024. Unlike in 2023, NVIDIA, which has so far dominated the AI HPC field, will face challenges this year from AMD's MI300 series. Major cloud service providers such as Microsoft and Meta began placing orders for AMD's MI300 series products one to two years ago, and they have also asked ODM manufacturers to design dedicated AI servers around the MI300 product line to diversify risk and reduce costs. Industry experts predict that demand for AMD's MI300 series chips will reach at least 400,000 units this year, with the potential to hit 600,000 if TSMC provides additional production capacity.

At the AMD Advancing AI event in San Jose, AMD CEO Lisa Su mentioned that data center acceleration chips, including GPUs and FPGAs, are expected to grow at an annual rate of over 50% in the next four years. The market size, which was $30 billion in 2023, is projected to exceed $150 billion by 2027. She noted that, in her years of experience, this pace of innovation is faster than any technology she has witnessed before.
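
Those two figures are mutually consistent: a growth rate just above 50% per year, compounded over the four years from 2023 to 2027, multiplies the market roughly fivefold. A quick check, treating the growth rate as a constant CAGR:

```python
market_2023_b = 30  # $30B in 2023
for cagr in (0.50, 0.55):
    market_2027_b = market_2023_b * (1 + cagr) ** 4
    print(f"{cagr:.0%} CAGR -> ${market_2027_b:.0f}B by 2027")
# 50% -> ~$152B, 55% -> ~$173B, both clearing the projected $150B
```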

According to Wells Fargo's forecast, while AMD's revenue from AI chips was only $461 million in 2023, it is expected to grow to $2.1 billion in 2024, potentially capturing a 4.2% market share, with Intel also securing nearly 2%. That would imply a slight decline in NVIDIA's market share, to around 94%. However, according to data Lisa Su released during the January 30 earnings call, AMD's AI chip revenue in Q4 2023 had already exceeded the previously forecast $400 million, and AMD now projects $3.5 billion in AI chip revenue for 2024, well above the earlier $2 billion prediction. If AMD's forecast proves accurate, its share of the AI chip market should rise further in 2024.
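
As a side note, the share figure implies the total market size the forecast assumes. This is my own inference from the quoted numbers, not a figure the article or Wells Fargo states directly:

```python
amd_revenue_b, amd_share = 2.1, 0.042  # Wells Fargo's 2024 forecast for AMD
implied_market_b = amd_revenue_b / amd_share
print(f"implied 2024 AI chip market: ${implied_market_b:.0f}B")  # $50B
```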

Of course, NVIDIA won't simply allow its monopoly in the AI market to be eroded. In the current AI accelerator chip market, NVIDIA's A100/H100 series GPUs, despite their high prices, remain the preferred choice. This year, NVIDIA will launch even more powerful models: the Hopper-based H200 and the Blackwell-based B100. According to some research institutions, NVIDIA plans to sell approximately 1.5 to 2 million AI GPUs this year, potentially tripling its 2023 sales volume, which suggests it expects to largely resolve its supply bottlenecks. Facing competition from AMD and Intel, NVIDIA may also adjust its pricing strategy accordingly.
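
Working backwards from those two claims gives a rough implied 2023 volume; this is my own arithmetic on the quoted figures, not a number from the research institutions themselves:

```python
for units_2024_m in (1.5, 2.0):
    units_2023_m = units_2024_m / 3  # "tripling 2023 sales volume"
    print(f"{units_2024_m:.1f}M GPUs in 2024 at 3x implies ~{units_2023_m:.2f}M sold in 2023")
# i.e. roughly 0.5-0.7 million AI GPUs shipped in 2023
```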

NVIDIA isn't competing only with AMD: tech giants are increasingly developing their own AI chips, adding to the competitive landscape. In February this year, Meta Platforms publicly confirmed its plan to deploy its latest self-developed custom chips in its data centers this year. These chips will work in coordination with other GPU chips to support the development of its large AI models.

Dylan Patel, founder of research firm SemiAnalysis, stated that considering Meta's operational scale, a successful large-scale deployment of these custom chips could potentially save hundreds of millions of dollars in annual energy costs and billions in chip procurement expenses.

Meanwhile, OpenAI has begun seeking billions in funding to build a network of AI chip factories, and media reports indicate the company is exploring developing its own AI chips. OpenAI's website now lists several open positions in hardware-software co-design, and in September last year the company recruited Andrew Tulloch, a renowned expert in AI compilers, further suggesting an investment in in-house chip development.

Beyond Meta and OpenAI, The Information reports that more than 18 startups worldwide are designing chips for AI model training and inference, including Cerebras, Graphcore, Biren Technology, Moore Threads, and d-Matrix. These companies have collectively raised over $6 billion, with a total valuation exceeding $25 billion (approximately RMB 179.3 billion).

The investors behind these companies include Sequoia Capital, OpenAI, Morningside Venture Capital, ByteDance, and others. Counting the chip initiatives of tech giants like Microsoft, Intel, and AMD as well, the number of AI chip companies aiming to compete with NVIDIA could exceed 20. From this perspective, although NVIDIA maintains compelling technological leadership in key growth areas such as data centers, gaming, and AI accelerators, it faces growing competitive threats and will inevitably encounter greater challenges in 2024.
