4 Years to Reach a Valuation of 10 Billion: The Top Domestic Large Model Startup, Born at Tsinghua

    Large model company Zhipu AI announced its annual financing amount: 2.5 billion yuan.

    What does this figure signify? It sets a new record for cumulative financing among domestic large model startups and puts the company's valuation above 10 billion yuan.

    This four-year-old company has become the most financially attractive and highly valued domestic large model firm.

    As the 'hundred-model battle' enters the second phase of generative AI, does Zhipu's financing progress also serve as evidence of a Matthew Effect taking hold in the industry?

    Companies with strong prospects receive more resources, while those failing to prove their worth face reshuffling and exclusion from the next phase of competition.

    From the perspective of startups, this marks the end of the first half of large model entrepreneurship, with a clear divide in the landscape and a defined ecosystem hierarchy.

    Before understanding why Zhipu stands out in domestic large model financing, let's trace its origins.

    Previously, it was well-known within the industry but relatively unknown outside.

    Zhipu AI, founded in June 2019, emerged from the technological achievements of Tsinghua University's Knowledge Engineering Group (KEG) lab.

    The core team members are almost all Tsinghua alumni, including CEO Zhang Peng, a graduate of Tsinghua's Computer Science Department and a 2018 Tsinghua Innovation Leader Engineering Ph.D.

    At the KEG lab, the team focused on applying machine learning, data mining, and knowledge graphs to engineering practices, and began training AI models as early as 2017.

    On Zhipu's first anniversary, OpenAI released GPT-3.

    From then on, Zhipu dedicated itself to the research and development of pre-trained large language models. On the architectural path, OpenAI chose GPT, Google chose BERT, and Zhipu chose its own GLM (General Language Model).
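
    To make the architectural distinction concrete: GPT trains a left-to-right autoregressive decoder, BERT trains a bidirectional masked encoder, and GLM unifies the two with autoregressive blank infilling, in which masked spans of the input are regenerated token by token. The sketch below is a minimal, illustrative mock-up of that data-construction step based on the published GLM paper; the function name and special tokens are invented for illustration and are not Zhipu's code.

    ```python
    import random

    # Illustrative sketch of GLM-style autoregressive blank infilling
    # (based on the published GLM paper; names are invented for illustration).
    def make_blank_infilling_pair(tokens, span_len=2, rng=random):
        """Mask one contiguous span and return (corrupted context, span target).

        Part A (the corrupted context) is attended to bidirectionally, as in BERT;
        Part B (the masked span) is generated left to right, as in GPT.
        """
        start = rng.randrange(0, len(tokens) - span_len + 1)
        span = tokens[start:start + span_len]
        corrupted = tokens[:start] + ["[MASK]"] + tokens[start + span_len:]
        target = ["[sop]"] + span + ["[eop]"]  # start/end-of-span markers
        return corrupted, target

    context, target = make_blank_infilling_pair(
        ["large", "models", "are", "trained", "on", "web", "text"])
    print(context)  # e.g. ['large', 'models', '[MASK]', 'on', 'web', 'text']
    print(target)   # e.g. ['[sop]', 'are', 'trained', '[eop]']
    ```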

    As the saying goes, 'you reap what you sow.' Almost all subsequent developments revolved around Zhipu's unique GLM pre-training architecture:

    In 2022, Zhipu collaborated with Tsinghua to develop GLM-130B, a bilingual large model with 130 billion parameters, and used it as the foundation for its large model platform and product matrix.

    In 2023, Zhipu was highly active, launching the conversational model ChatGLM and open-sourcing the single-GPU-deployable ChatGLM-6B, followed by the visual model VisualGLM-6B, the code model CodeGeeX2, the mathematical model MathGLM, the multimodal model CogVLM-17B, and the AgentLM series of agent models, all open-sourced.
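
    Since ChatGLM-6B was released as open weights, trying it locally takes only a few lines of Hugging Face code. The snippet below follows the usage pattern documented in the THUDM/chatglm-6b repository at the time of release; the chat() helper comes from the model's own remote code (hence trust_remote_code=True), and hardware requirements or model IDs may have changed since, so treat this as a sketch rather than an official quick-start.

    ```python
    # Minimal sketch of running the open-sourced ChatGLM-6B with Hugging Face
    # transformers, following the pattern documented in the THUDM/chatglm-6b
    # repository at release time. FP16 inference needs roughly 13 GB of GPU memory.
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
    model = model.eval()

    # chat() is a convenience method defined in the model's remote code,
    # not a generic transformers API.
    response, history = model.chat(tokenizer, "What is the GLM pre-training objective?", history=[])
    print(response)
    ```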

    On August 31 this year, Zhipu's generative AI assistant 'Zhipu Qingyan,' based on the bilingual conversational model ChatGLM2, became one of the first 11 large model products approved for public use.

    In recent years, Zhipu's focus in the large model field has been clear: solidifying the foundation (base models) and then constructing various modal and functional 'buildings' on top.

    Notably, Zhipu has been able to sustain itself with B2B services since its early days, giving the company the confidence to frequently develop and release new models and products despite the widely acknowledged high costs of large models.

    Of course, this isn't the only reason.

    Beyond service capabilities and revenue, Zhipu boasts a strong talent pool and technical prowess.

    As mentioned earlier, Zhipu originated from Tsinghua, and the 'Tsinghua affiliation' has become a prestigious label in this field.

    The reason lies in Tsinghua's early start in large model research; years of work there have nurtured a deep pool of talent. Today's notable players with Tsinghua roots, represented by Zhipu, include MoonShot AI, DeepLang, OneFlow, Baichuan AI, FaceWall AI, XianYuan Tech, and ShengShu Tech.

    Their paper citations and model capabilities serve as strong evidence of this 'recognized label.'

    Moreover, according to public records, Zhipu is the only fully domestic, self-developed large model enterprise.

    This background gives Zhipu its own preparations and strategies amid the ongoing debates and controversies over 'model security, data security, and content security.'

    It is reported that to align with domestic GPU development, Zhipu is implementing the GLM General Language Model Domestic Chip Adaptation Plan.

    Specifically, this involves collaborating with domestic computing chip manufacturers to adapt model algorithms for domestic chips, with nearly 10 types of domestic chips currently supported.

    These achievements and unique traits may explain why Zhipu has been consistently favored and stands out.

    While enjoying high regard and amassing substantial capital, Zhipu has also demonstrated its resolve to build long-term competitiveness.

    After securing 2.5 billion yuan in financing within 10 months, Zhipu AI stated:

    The financing will be used for further R&D of base large models, better supporting the industry ecosystem, and fostering rapid growth with partners.

    In essence, the focus is on two major aspects:

    1. Strengthening and solidifying the base large model.
    2. Expanding the ecosystem and partnerships.

    Depth and breadth are both essential.

    First, strengthening the base large model through 'further R&D.'

    Currently, Zhipu's foundational model is GLM-130B, a bilingual bidirectional dense model with 130 billion parameters, released in 2022.

    At the time, given the constraints on technology, data, and computing power, training a model of that scale was daunting, but the results were remarkable: GLM-130B outperformed GPT-3 and PaLM in some respects.

    However, against today's growing demands for data and modalities, the 130-billion-parameter behemoth is starting to look inadequate.

    According to the latest information obtained by QuantumBit, this Friday (October 27) Zhipu will announce its next move: the launch of a next-generation base large model.

    Second, expanding the ecosystem and partnerships.

    In practice, this aligns with Zhipu's consistent principle: continuous open-sourcing.

    The company has long been one of the most open players in the large model field. Even in the pre-ChatGPT era, it took a transparent, open approach alongside Baidu (ERNIE 2.0), Alibaba (AliceMind), Zhiyuan (Qingyuan CPM), and Langboat (the Mencius large model).

    Reviewing Zhipu's early GLM reports reveals statements like 'We invite everyone to join its open community to advance large-scale pre-training models.' Today, the company continues to engage developers and industry users through open-source initiatives.

    This practice persists.

    Current data highlights the interim results of Zhipu's open-source commitment:

    • Developer community: ChatGLM-6B topped Hugging Face's trending chart within four weeks of release, with over 10 million downloads and 50,000+ GitHub stars.
    • Ecosystem partners: Zhipu's website lists collaborations with '1,000+ research institutions across 69 countries.' According to QuantumBit's verification, its client base exceeds 1,000, of which 200-300 actively contribute to the open-source ecosystem.


    Once a large model ecosystem is established, it can better integrate resources across the foundation, middle, and application layers, allocating them more efficiently so that the layers interact healthily and evolve together.

    Within this, base large models hold a core position due to their foundational and general-purpose nature. Recognizing this makes Zhipu's efforts to expand its ecosystem and partnerships both advantageous and necessary.

    In late November last year, OpenAI introduced ChatGPT to the world. Soon after, large model technology trends surged at an unprecedented pace.

    The numbers are staggering, and the speed at which they materialized is just as astonishing.

    Hundreds of millions of active users, billions in revenue, and valuations in the tens of billions... Large models have swept the world with unrelenting force. Everyone is watching, exploring, and pondering how far this AI technology can go and how its products can harness its power.

    Thus, pioneers like OpenAI and Anthropic emerged abroad, while domestically, unicorns valued in the tens of billions of yuan, such as Zhipu AI and MiniMax, have taken shape.

    Even with these precedents, certain technical and engineering challenges cannot be fast-forwarded or skipped. No matter how star-studded the team or how astronomical the funding, anyone venturing into large models must go through the process firsthand.

    The challenges are formidable, but challengers press forward, finding joy in the struggle.

    Today, nearly a year later, we have witnessed the advancement of large model technology and how innovation and competition are shaping this field.

    What has become even clearer is that the giants have completed their initial positioning, startups are entering a reshuffle, and the first-phase landscape is taking shape.

    Indeed, no single company can accomplish everything within the scope of large model capabilities. But the tickets for general-purpose large models are limited. Players unable to secure a spot are branching out: either specializing in industry-specific models or abandoning model-layer entrepreneurship to build on others' models, moving toward the middle or application layers...

    Large model entrepreneurship is entering a watershed moment.

    From now on, funding progress in large model startups will likely become increasingly concentrated. Billions in funding will continue to flow to companies that are already well-funded.

    The Matthew effect in the industry is intensifying. With finite capital, the most valuable companies will attract even more attention, and the best and most abundant resources will be handed to the most promising contenders.

    In the capital market, the only downside of expensive companies is their price, and the only upside of cheap companies is their affordability.

    The first half of large model entrepreneurship is about to end.
