NVIDIA Launches GH200 AI Chip, Jensen Huang Praises: 'It Will Spark a New Revolution in Inference!'

baoshi.rao wrote:

    NVIDIA announced a new AI chip configuration at the SIGGRAPH conference in Los Angeles on Tuesday. Founder and CEO Jensen Huang stated that the new chip accelerates generative AI applications, reduces the operational costs of large models, and enables data center scalability.

    The newly released GH200 uses the same GPU as NVIDIA's current top-tier AI chip, the H100, but pairs it with 141GB of memory and a 72-core ARM-based CPU, compared with the H100's 80GB of memory.

    "This chip is designed for the horizontal scaling of global data centers," Huang said at the launch event.

    Huang also mentioned that the new chip will be available through NVIDIA's distributors starting in the second quarter of next year, with samples provided by the end of this year. However, the price of the chip has not yet been disclosed.

    NVIDIA Vice President Ian Buck stated at a media briefing that the new version of the chip increases the amount of high-bandwidth memory, a design that enables the operation of larger AI models. The GH200 is optimized to perform AI inference functions, effectively supporting generative AI applications like ChatGPT.

    The release of the new chip comes as the scale of AI models continues to expand. "As models grow in parameter count, they need more memory so they can run on a single chip system rather than being split across separate chips and interconnects. The additional memory also boosts GPU performance," Buck explained.
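
    To put those memory figures in perspective, here is a rough, purely illustrative back-of-the-envelope calculation (mine, not from the article): at 16-bit precision each parameter takes about two bytes, so weight storage alone quickly outgrows a single accelerator's memory, which is the constraint Buck describes.

```python
# Illustrative only: weight memory for a large language model at FP16
# (2 bytes per parameter), ignoring activations, optimizer state, and KV cache.

def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Gigabytes needed just to hold the model weights."""
    return num_params * bytes_per_param / 1e9

for params in (7e9, 70e9, 175e9):
    print(f"{params / 1e9:.0f}B params -> ~{weight_memory_gb(params):.0f} GB in FP16")

# A 70B-parameter model needs ~140 GB for its weights alone, which overflows an
# 80 GB H100 but fits within the GH200's 141 GB without splitting the model
# across multiple GPUs.
```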

    Currently, NVIDIA dominates the AI chip market, with an estimated market share of over 80%. For example, Google's Bard and OpenAI's ChatGPT both run on NVIDIA GPUs. As tech giants, cloud service providers, and startups worldwide compete for GPU resources to develop their own AI models, NVIDIA's chips are in high demand.

    Typically, working with AI models involves at least two stages: training and inference. First, a model is trained on vast amounts of data, a process that can take months and sometimes requires thousands of GPUs. Then the trained model is used for inference inside software to make predictions or generate content. Like training, inference is computationally expensive, demanding significant processing power every time the software runs; unlike training, which is needed only when the model must be updated, inference runs almost continuously.
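
    As a concrete, purely illustrative sketch of those two stages, here is a toy PyTorch example; the model, data, and sizes are invented and merely stand in for the months-long, multi-GPU workloads described above.

```python
# Minimal sketch of the two phases: training (run occasionally, gradient-heavy)
# and inference (run continuously, no gradients). Toy model and random data.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Phase 1: training -- repeated over huge datasets, often for weeks on many GPUs.
inputs = torch.randn(32, 16)           # toy batch of inputs
labels = torch.randint(0, 4, (32,))    # toy labels
for _ in range(100):                   # stands in for many epochs
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()                    # gradient computation is what makes training costly
    optimizer.step()

# Phase 2: inference -- no gradients, but it runs every time the deployed model is queried.
model.eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 16)).argmax(dim=-1)
print(prediction.item())
```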

    "You can run almost any large language model you want on the GH200, and it will perform inference like crazy," Huang said. "The cost of inference for large language models will drop significantly."

    NVIDIA also unveiled a system that combines two GH200 chips into a single computer, suitable for even larger models. Huang called it "the world's largest single GPU."

    Amid the shortage of AI chips, NVIDIA's main competitor AMD launched the MI300X AI chip last week, which supports 192GB of memory and has AI inference capabilities. Companies like Google and Amazon are also designing their own custom AI inference chips.

    Another highlight of NVIDIA's event was progress on OpenUSD. The Alliance for OpenUSD, recently established by five major companies in the U.S. 3D content industry (Apple, NVIDIA, Pixar, Adobe, and Autodesk), aims to make OpenUSD the 3D graphics standard for the "metaverse." The organization is promoting greater interoperability of 3D tools and data, enabling developers and content creators to describe, compose, and simulate large 3D projects and build an expanding range of 3D products and services.
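
    For readers unfamiliar with OpenUSD, the snippet below is a minimal sketch (my own example, not from NVIDIA's event) of describing a small scene with the open-source USD Python bindings; it illustrates the kind of tool-agnostic scene description the alliance is promoting.

```python
# Minimal OpenUSD example: build a tiny scene in memory and print its .usda text.
# Assumes the open-source USD Python bindings are installed (pip install usd-core).
from pxr import Usd, UsdGeom

stage = Usd.Stage.CreateInMemory()
UsdGeom.Xform.Define(stage, "/Scene")             # a transform prim to group things under
ball = UsdGeom.Sphere.Define(stage, "/Scene/Ball")
ball.GetRadiusAttr().Set(0.5)                     # author an attribute on the sphere

# The same textual description can be read by any USD-aware tool, which is the
# interoperability point made above.
print(stage.GetRootLayer().ExportToString())
```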

    At this year's SIGGRAPH, IBM Senior Vice President Darío Gil delivered a keynote addressing the future of quantum computing and its potential to solve real-world problems. Sony CTO Hiroaki Kitano also hosted a forum on the creative film industry at the conference.

    SIGGRAPH has consistently served as a platform for showcasing cutting-edge extended reality (XR) research, and this year was no exception. Meta demonstrated two VR and MR headset prototypes: Butterscotch Varifocal, which combines varifocal technology with retina-resolution VR displays, and Flamera, a computational camera built on light field technology. Both prototypes remain in Meta's R&D phase, but these technologies may inspire future consumer electronics products.
