Kai-Fu Lee: China's Large Model Competition is Exceptionally Fierce, Ultimately There Will Be a Few Big Winners

    December 28 news: venture capitalist and former Google China president Kai-Fu Lee predicts that China's generative AI startups are going through a 'preliminary round'. Earlier this year, he founded 01.AI, a Beijing-based startup focused on developing large language models (LLMs). Last month, the company completed a funding round at a valuation of up to $1 billion.

    China's largest internet companies, such as Alibaba, Tencent, Baidu, and ByteDance, along with numerous startups, are racing to develop their own large language models. Some media have dubbed this the 'Hundred Models War,' as these tech companies fiercely compete for dominance in the AI field. In an interview, Kai-Fu Lee stated that these companies are currently in the phase of proving they possess the technology to develop high-quality models. Those that pass this test will move on to the next stage: figuring out how to increase revenue and achieve profitability.

    Kai-Fu Lee predicted: 'In China, ultimately a few big winners will emerge, while some companies may exit the market gracefully. But most businesses will either drop out halfway or shift to more practical goals, such as building applications and solutions for specific industries.'

    01.AI was established in March this year and currently has over 100 employees, most of whom work in Beijing. Last month, the company released its first open-source large language model, Yi-34B, but its future revenue won't solely rely on this model. Instead, its business plan focuses on selling proprietary large language models, primarily targeting the Chinese market. According to Kai-Fu Lee, the company is currently developing a new proprietary model with over 100 billion parameters.

    However, after Yi-34B quickly topped Hugging Face's open-source large language model rankings, 01.AI sparked some controversy. Some developers discovered that the model appeared to use Meta's open-source AI model Llama without mentioning it in the relevant documentation. In response, 01.AI later renamed part of Yi-34B and publicly acknowledged Llama's contribution. Kai-Fu Lee also publicly apologized for their previous oversight.

    In an interview with The Information, Kai-Fu Lee discussed 01.AI's future and the trends in China's AI industry. He also talked about how to deal with U.S. chip export restrictions and how Chinese companies can seek business opportunities globally.

    The full interview is as follows:

    Q: Currently, dozens of companies in China are racing to develop large language models. What will happen next?

    Kai-Fu Lee: In my view, this situation is nothing new in China. We've seen it before with the group buying frenzy, the rise of bike-sharing apps, and even in deep tech areas like computer vision and speech recognition. When computer vision proved its value, countless Chinese companies rushed in, trying to get a piece of the action across various potential applications. However, most of these companies didn't survive.

    Today, China's AI sector is still in the qualifying round, with competition being exceptionally intense—perhaps even more so than in the U.S. The first test we need to face is: Among the hundred schools of thought contending, which company can develop a model that is truly high-quality and high-value? Only with solid technology and outstanding model performance can a company stand out in practical applications. Otherwise, the technology will remain a "toy" rather than a solution to real problems.

    After passing the technical challenges of the qualifying round, companies will enter the next phase: How to increase commercial value? What is your business model? How will you generate profits? Soon, investors will ask the same questions they pose to cloud providers, enterprise software companies, and consumer app developers. If companies can't provide clear answers, their growth will come to an end.

    Taking the United States as an example, OpenAI has demonstrated its technological leadership while also generating revenue. This value creation makes other companies willing to invest resources and build applications on top of it.

    In China, a few major winners will eventually emerge, while some companies may exit the market gracefully. However, most enterprises will either give up halfway or shift to more practical goals, such as developing applications and solutions for specific industries, rather than solely pursuing large-scale model development. Over time, the cost of developing large models will continue to rise.

    Q: Chinese AI startups and their investors claim that China will develop its own ecosystem for generative AI models and applications. What do you think?

    Kai-Fu Lee: We all understand that parallel universes are not what we want to see. We prefer to compete globally, allowing truly outstanding companies to stand out, as this is more efficient. However, the reality is that we cannot fully control our own destiny.

    Geopolitics in particular. Even though there are no rules saying we can't enter the US market, I don't think we would get much business there, because in my view there is currently an unfair bias against Chinese software in that market. This is a reality we have to face.

    Of course, we're open to business opportunities in other parts of the world, but we're well aware that some things just won't work. For example, trying to sell our proprietary models to US companies is almost impossible. They won't buy it, and we won't waste our effort.

    China clearly represents a huge opportunity, but I wouldn't rule out Chinese companies entering other regions of the world either. Overall, Silicon Valley's approach has been 'one-size-fits-all,' a model that played a key role in the rise of companies like Facebook and Google and helped the US gain dominance. But this time it's different, because large language models are trained on data, and data involves issues of bias, ideology, and values. American values aren't welcome everywhere; this goes beyond China, and in some countries they are completely unacceptable.

    I think the Middle East might be another region that wants to think differently about these issues. This is prompting countries to want more control over the models.

    I firmly believe it's possible to build specialized models for different countries. Silicon Valley companies won't do this, because they consider their values correct and want more people to accept and adopt them. Moreover, developing different large models for different markets requires significant engineering effort, so Silicon Valley companies naturally hesitate to make that investment. Companies from other regions (including China) may have an opportunity to explore this approach in depth. But obviously, they must earn the trust of users and governments.

    Q: Media reports suggest your company successfully reduced the AI training costs for Yi-34B. How did you achieve this?

    Kai-Fu Lee: We have an exceptionally strong infrastructure team, which happens to be our largest department. I've told our employees that every additional modeling hire adds to the GPU burden, while every additional infrastructure hire improves GPU efficiency. Of course, the modeling team is also important, but from the very beginning we have placed special emphasis on building our infrastructure team.

    These infrastructure engineers are the unsung heroes. They are responsible for hardware, software, and massive data transfers, while simultaneously managing GPUs, memory, and networking, any one of which can become a bottleneck. It's worth noting that scaling beyond a few thousand GPUs is extremely difficult. Expanding from 2,000 to 8,000 units is not simply a matter of software adjustments, because as models and data volumes grow, the demands on the network change dramatically.
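    To make the scaling problem concrete, here is a back-of-the-envelope sketch in Python. It is an illustration, not 01.AI's numbers: the 34B-parameter model size, 2-byte gradients, and naive data parallelism with a ring all-reduce are all assumptions. The point is that per-GPU traffic stays roughly flat as the cluster grows, but the total volume the fabric must carry grows linearly with the number of GPUs, which is one reason going from 2,000 to 8,000 GPUs is not just a software change.

    ```python
    # Back-of-the-envelope gradient traffic for naive data-parallel training
    # with a ring all-reduce. All numbers are illustrative assumptions.

    def ring_allreduce_traffic(param_count: float, bytes_per_param: int, num_gpus: int):
        grad_bytes = param_count * bytes_per_param
        per_gpu = 2 * (num_gpus - 1) / num_gpus * grad_bytes  # ~2x gradient size per GPU
        aggregate = per_gpu * num_gpus                        # grows linearly with cluster size
        return per_gpu, aggregate

    for gpus in (2_000, 8_000):
        per_gpu, total = ring_allreduce_traffic(34e9, 2, gpus)  # 34B params, 2-byte grads
        print(f"{gpus:>5} GPUs: ~{per_gpu / 1e9:.0f} GB per GPU, "
              f"~{total / 1e12:.0f} TB across the fabric, every step")
    ```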

    Our infrastructure team consists of dozens of engineers, making it the largest team at 01.AI. They are researching how to use FP8 (an 8-bit floating-point format supported by NVIDIA's H100 chips) to significantly reduce computational load, which is no easy task. They need to determine where FP8 can be used and where other formats are required, while ensuring seamless transitions between them. Beyond this, they must tackle a series of thorny issues, such as selecting network protocols, optimizing compilers, and handling GPU failures. In fact, GPU failures occur alarmingly often. If a GPU fails, can it be hot-swapped? We're still working on solving this problem. Consider this: in a cluster with thousands of GPUs, if training stops for an hour whenever one GPU fails, hot-swapping could save you an hour each day! These small time savings add up significantly.
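    A quick sanity check of that hot-swap arithmetic, as a sketch with assumed cluster size, failure rate, and swap time rather than figures from the interview:

    ```python
    # Rough cost of GPU failures with and without hot-swapping.
    # Cluster size, failure rate, and swap time are assumptions for illustration.

    CLUSTER_GPUS = 4_000       # hypothetical cluster size
    FAILURES_PER_DAY = 1       # assume roughly one GPU failure per day
    RESTART_HOURS = 1.0        # full stop, replace, reload checkpoint, restart
    HOTSWAP_HOURS = 0.05       # a few minutes to swap in a hot spare

    def idle_gpu_hours(stall_hours: float) -> float:
        # While training is stalled, every GPU in the cluster sits idle.
        return CLUSTER_GPUS * FAILURES_PER_DAY * stall_hours

    saved = idle_gpu_hours(RESTART_HOURS) - idle_gpu_hours(HOTSWAP_HOURS)
    print(f"Hot-swapping saves roughly {saved:,.0f} GPU-hours per day under these assumptions")
    ```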

    Another related topic is elastic training. Suppose you have a cluster of 2,000 H100 chips but only need 500 of them for a particular task. Can you release the rest between checkpoints and then add them back later? These are not problems AI researchers should be handling; they fall more under the purview of network engineers.
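    The idea can be sketched as a training loop that only changes its GPU allocation at checkpoint boundaries. This is a minimal toy illustration in Python; every function here is a stand-in I've made up, not 01.AI's system or any real framework's API:

    ```python
    # Toy sketch of elastic training: train in phases, checkpoint at phase
    # boundaries, and let the GPU allocation change between phases.

    from dataclasses import dataclass

    @dataclass
    class TrainState:
        step: int = 0
        world_size: int = 2_000   # hypothetical starting allocation

    def train_phase(state: TrainState, steps: int) -> TrainState:
        # Stand-in for a real training loop running on `state.world_size` GPUs.
        print(f"steps {state.step}..{state.step + steps} on {state.world_size} GPUs")
        state.step += steps
        return state

    def save_checkpoint(state: TrainState) -> None:
        # Stand-in for writing model/optimizer state to shared storage.
        print(f"checkpoint at step {state.step}")

    def elastic_training(total_steps: int, steps_per_phase: int) -> None:
        state = TrainState()
        while state.step < total_steps:
            state = train_phase(state, steps_per_phase)
            save_checkpoint(state)
            # Between checkpoints the allocation may shrink (GPUs lent to another
            # job) or grow again; the next phase resumes with the new world size.
            state.world_size = 500 if state.world_size == 2_000 else 2_000

    elastic_training(total_steps=300, steps_per_phase=100)
    ```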

    If we compare developing large language models to rocket science, a rocket would never take off without its engineers. SpaceX's success isn't just due to its large number of researchers; it's also because SpaceX has undertaken a tremendous amount of highly complex engineering work. Similarly, our infrastructure team serves as our 'engineers,' and their efforts are what enable our large language models to launch!

    Q: The US restricts the export of advanced semiconductor technology to China, including Nvidia's advanced chips. How is 01.AI responding to this?

    Kai-Fu Lee: I've publicly stated that our chip inventory is sufficient to last 18 months. These are basically chips we obtained before the restrictions were imposed. We are certainly working hard on how to use Chinese chips. But it's not easy, and it's definitely not fun; programming them is unfamiliar territory for us. Still, if we have to do it, we won't back down.

    NVIDIA makes outstanding chips, but some might argue that simpler chips could do the job at lower cost. However, a major factor behind NVIDIA's strength is the entire ecosystem built around its CUDA software libraries, which makes programming much easier. If engineers were forced to use non-NVIDIA chips, they might resist, because those chips are far less efficient to work with. The dilemma we face won't fully materialize for another 18 months, but we must act sooner. If we can't obtain NVIDIA's chips, we'll have to look for simpler alternatives that are more narrowly focused on transformers, even though programming them will be a painful process. Yet if we have no other choice, that is what we'll do.

    But as everyone knows, Chinese engineers have both the capability and the willingness to take on this kind of arduous engineering work, and they excel at it. This is similar to what I said earlier about the infrastructure team's work: learning to program new, non-standard GPUs with very few libraries is also a labor-intensive task.

    Chinese entrepreneurs are tenacious. Chinese engineers are hardworking and aren't afraid of heavy workloads. This is precisely why Meituan delivers excellent service and why WeChat has become an outstanding product. Indeed, there are many difficult challenges ahead, and one might argue they waste the time and energy of many people. But these are the cards we've been dealt, so we'll do our best to play them well.
