Three Major Challenges for AI Large Models: AI Capabilities, Commercialization, and Price Wars

baoshi.rao

Image source: Tianyancha

At the China Computer Conference forum, AI unicorn company Zhipu AI officially released its self-developed third-generation foundational large model series, ChatGLM3.

Zhipu AI stated that ChatGLM3 has improved in areas such as multimodal understanding, code modules, and web search capabilities, with inference speeds 2-3 times faster than the best open-source models. Additionally, thanks to its self-developed Agent Tuning technology, it achieves a 10-fold improvement in intelligent planning and execution compared to ChatGLM2. By leveraging Huawei's Ascend ecosystem, its inference speed has increased by over 3 times.

Public information shows that Zhipu AI, founded in 2019, is a spin-off from the Knowledge Engineering Laboratory team of Tsinghua University's Computer Science Department, representing the commercialization of Tsinghua's research achievements. CEO Zhang Peng graduated from Tsinghua University's Computer Science Department, while President Wang Shaolan holds a Ph.D. from Tsinghua's Innovation Leadership Program.

With the prestige of a top university and the booming AI industry, Zhipu AI has become a darling in the eyes of investors. From July to September this year, Zhipu AI secured five rounds of funding, achieving a valuation of 10 billion yuan, making it a unicorn in China's AI sector. Its investors include Meituan, Alibaba, Ant Group, Hillhouse Capital, and other major institutions.

Image Source: Tianyancha

Zhipu AI CEO Zhang Peng pointed out that the Zhipu AI GLM large model has been applied in various fields such as government affairs, finance, and energy. Partners include dozens of companies like Alibaba, Tencent Cloud, Volcano Engine, Huawei, Meituan, Microsoft, OPPO, and Haitian Ruisheng.

Huaxin Yongdao recently announced that it has signed a "Strategic Cooperation Agreement on AI Large Model Co-construction" with Zhipu AI. Subsequent strategic cooperation scenarios include policy knowledge organization and construction, customer service consultation, and risk identification.

Image Source: Huaxin Yongdao

Apart from Zhipu AI, several major Chinese tech companies including Tencent, SenseTime, and Huawei have publicly stated they've released dozens, even hundreds, of industry-specific large language model solutions. However, to date, the market hasn't seen these models significantly drive the financial performance of their parent companies.

SenseTime's interim report this year shows its generative AI business grew 670.4% year-over-year, increasing its contribution to group revenue from 10.4% in 2022 to 20.3%. Yet, SenseTime's overall revenue only grew 1.3% year-over-year to 1.433 billion yuan in the first half of this year.

iFlytek reported 7.84 billion yuan in revenue and 73.572 million yuan in net profit for the first half, year-over-year declines of 2.26% and 73.54% respectively: a slight dip in revenue and a sharp drop in profit. 360 Company claimed 20 million yuan in AI service revenue from small and medium-sized clients, but this actually came from software membership fees and SaaS services for its enterprise security cloud.

In contrast to the domestic market, large language models are already driving revenue growth for overseas companies. According to The Information, OpenAI, the developer of ChatGPT, is projected to generate over $1 billion in revenue within the next 12 months through AI software sales and computing power. Nvidia reported a 171% year-over-year increase in data center revenue to $10.3 billion in Q2, with total net profit surging 843% to $6.188 billion.

Zhang Peng noted that benchmarking against OpenAI has been Zhipu AI's goal since its establishment. But can Zhipu AI achieve ChatGPT-level revenue and truly become China's version of OpenAI?

Although Zhang mentioned at the China Computer Conference that ChatGLM3 ranked first among domestic models of similar size across 44 Chinese and English public datasets, our tests on Zhipu AI's official website suggest its large language model still requires ongoing improvement.

To test an intelligent customer service scenario, we repeatedly asked Zhipu AI the question "Why does your beef sauce contain no beef?" We received three different answers, all of which were problematic. In the first response, Zhipu AI attributed the issue to misleading product names, unclear ingredient lists, vague promotional materials, and product classification issues.

Image source: China Computer Conference

In the second response, Zhipu AI stated that this was a seasoning sauce primarily made from soybeans, corn, and other ingredients. In the third response, it claimed the beef sauce mainly consisted of bean paste, chili peppers, peanuts, vegetable oil, and so on. In other words, the second and third responses contradicted each other.

Image source: Zhipu AI official website

Image source: Zhipu AI official website

Regarding these responses, Hu Qiang (pseudonym), an e-commerce manager at a domestic beef sauce producer, told DoNews that under relevant regulations, when a food product is named "beef sauce," the label must clearly indicate that the ingredients include beef. He added that telling customers the company's food labeling is unclear and that the product is a blend of various other ingredients would not only fail to reassure them but could also expose the company to financial losses over false advertising.

Image source: JD.com

When customers question whether the beef sauce contains beef, customer service can provide the ingredient list and explain that the beef particles in their product might be smaller, making them less noticeable. In actual customer service work, representatives should learn to provide reasonable and evidence-based explanations rather than taking all blame onto the company or simply agreeing with everything the customer says.

In testing logical reasoning, we presented a math problem: Our company had 315 employees last year, with post-90s generation accounting for 1/5 of the total. This year, a batch of post-90s employees was hired, increasing their proportion to 30% of the total. How many post-90s employees were hired this year?

In its first response, Zhipu AI stated that the specific number depends on the company's total headcount X this year, which was essentially a non-answer. When we then asked what X was, Zhipu AI answered 210 people, far from the correct answer of 45.
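The correct answer of 45 can be checked directly. Last year there were 315 × 1/5 = 63 post-90s employees, so the number of new hires x must satisfy (63 + x) / (315 + x) = 30%. The sketch below solves that equation with exact fractions. The function name is illustrative, and it assumes, as the problem states, that every new hire belongs to the post-90s group and that nobody left.

```python
from fractions import Fraction

def new_hires_needed(total, initial_share, target_share):
    """Hires of one group needed to lift its share of headcount to target_share.

    Assumes all new hires belong to that group and nobody leaves.
    """
    group = total * initial_share  # current members of the group
    # Solve (group + x) / (total + x) = target_share for x.
    return (target_share * total - group) / (1 - target_share)

x = new_hires_needed(315, Fraction(1, 5), Fraction(3, 10))
print(x)  # 45
```

With 45 new hires the company has 360 employees, 108 of them post-90s, which is exactly 30%.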

Image source: Zhipu AI official website

In the second attempt, Zhipu AI initially gave a wrong answer of 11,300 people, far more than the company's entire headcount of 315 stated in the question. After recognizing the mistake, it revised its answer to 314 people, which was still incorrect.

Image source: Zhipu AI official website

We increased the difficulty level by selecting a common high school math function question from Zuoyebang and asked Zhipu AI to answer only the first two parts. However, Zhipu AI provided incorrect answers for both parts. The consecutive errors in these two mathematical logic questions clearly contradict Zhipu AI's claims that its ChatGLM series models can solve complex reasoning problems.

Image source: Zhipu AI official website

Image source: Zuoyebang

Image source: Zhipu AI official website

At present, the main monetization methods for large models include: large models alone, large models + computing power, and large models + applications. Among these, large models alone and large models + computing power are the primary monetization methods.

Image Source: iAnalysis

Zhipu AI's profit model is fundamentally consistent with industry standards. Firstly, it provides customized large model development services based on client requirements. The highest prices for cloud-based and local private deployments are 1.2 million yuan/year and 36.9 million yuan/year, respectively.

Image Source: Zhipu AI Official Website

Secondly, for standard version large models, it offers API access with token-based usage fees. The pricing for ChatGLM-Turbo, CharacterGLM, and Text-Embedding is 0.005 yuan/thousand tokens, 0.015 yuan/thousand tokens, and 0.005 yuan/thousand tokens, respectively.

Tokens can be simply understood as "characters" or "words," but currently, there is no complete standard for tokens in the market. For models like Tongyi Qianwen, ChatGPT, and Wenxin Yiyan, 1 token is equivalent to 1 Chinese character. For Spark Large Model and Baichuan53B, it's 1.5 Chinese characters, while Hunyuan Large Model uses 1.8 Chinese characters per token. The definitions vary even more significantly for English among different large model companies.
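To see how these differing token definitions affect what a customer actually pays, here is a minimal sketch of the billing arithmetic. The function name is hypothetical. The price of 0.005 yuan per thousand tokens is the ChatGLM-Turbo figure quoted above, and the characters-per-token ratios are the ones cited for other vendors' models. Pairing them is purely for illustration, since Zhipu AI's own ratio is not given here.

```python
def estimate_cost_yuan(num_chars, chars_per_token, price_per_1k_tokens):
    """Estimate the API cost in yuan of processing num_chars Chinese characters."""
    tokens = num_chars / chars_per_token       # characters -> billed tokens
    return tokens / 1000 * price_per_1k_tokens  # tokens are priced per thousand

# One million Chinese characters at 0.005 yuan per thousand tokens,
# under the three characters-per-token ratios mentioned in the text.
for ratio in (1.0, 1.5, 1.8):
    cost = estimate_cost_yuan(1_000_000, ratio, 0.005)
    print(f"{ratio} chars/token -> {cost:.2f} yuan")
```

At one character per token, a million characters costs about 5 yuan. The same text is billed for fewer tokens, and therefore costs less, when a token covers more characters, which is why headline per-token prices are not directly comparable across vendors.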

Image Source: Public Information

In terms of pricing, except for ChatGPT, which charges approximately ¥1 per 1k tokens, other large model companies offer relatively cheap rates. While this may increase the penetration of large models in the consumer (ToC) segment, it also means that large model providers need to accumulate a massive user base to generate substantial revenue.

Image Source: Public Information

According to Qimai data, the average daily downloads for Wenxin Yiyan and iFlytek Spark on iOS have been below 20,000 over the past month. Additionally, considering the significant drop-off in user retention from download to the next day and seven days later, it is evident that the actual number of active users for these apps is notably insufficient.

Source: Qimai Data

Although Wenxin Yiyan has introduced a monthly subscription service following Microsoft's Copilot model, consumer users in China have been accustomed to free services in the mobile internet era, resulting in weak subscription awareness. For example, Tencent Music reported a subscription rate of only 16.7% in Q2 this year, significantly lower than Spotify's over 40% subscription rate.

Source: Wenxin Yiyan

On the consumer (ToC) side, users show strong initial interest in large models, but their current capabilities offer limited user retention. If companies start charging ToC users, they risk entering a vicious cycle: user churn leads to increased advertising costs to acquire new users, which in turn leads to further churn.

In the business (ToB) sector, China's corporate net profit margins lag behind those of European and American companies, and domestic firms generally lack strong willingness to pay for enterprise software. This is evident from the proportion of software revenue in GDP between China and the U.S. Moreover, in the current environment where small and medium-sized enterprises (SMEs) and private companies are focused on cost reduction and efficiency to survive, they naturally prioritize return on investment.

Image source: Wind

However, customizing large models for the ToB market is extremely costly. Beyond paying millions to large model providers for customization, companies must also bear expenses for data preparation and preprocessing, model training and optimization, deployment and maintenance, model updates and iterations, regulatory compliance, and internal personnel allocation.

For example, the iFlytek T20 learning tablet, which incorporates the Spark large model, is priced 2,000 yuan higher than the T10 model. Insiders reveal that this price increase still doesn't cover the cost of the large model. Meanwhile, standard learning tablets on the market have total hardware and licensing costs of only about 1,000-1,500 yuan, further highlighting the exorbitant costs of large models in commercial applications.

With such heavy upfront investment, the timing of profitability remains uncertain. Take physical products equipped with large models as an example: consumers have been conditioned to expect the "lowest price online," so excessively high prices are likely to deter purchases. For instance, the iFlytek T20 learning tablet, priced at over 8,000 yuan, has relatively few reviews on JD.com, while its storefront on Douyin shows 4,000+ units sold. Considering the high refund rates on interest-based e-commerce platforms, however, actual sales are likely much lower. Lowering the price might boost sales but would not cover the cost of the large model. Faced with this dilemma, how will companies choose?

Image source: JD.com, Douyin

Virtual products powered by large models also face similar issues. Although text generation is a common capability of current large models, Liu Wei, a manager at a domestic self-media company, revealed that platforms like Toutiao, Baidu, and Douyin largely restrict the reach of AI-generated videos and graphics, resulting in dismal engagement metrics such as likes and comments.

Image source: Douyin

At the going traffic revenue rate of about 10 yuan per 10,000 reads on platforms like Toutiao and Baijiahao, recouping a million-yuan investment would take on the order of a billion reads. In today's era of information overload, where users seek differentiated content, achieving this is highly challenging. While large models are said to reduce labor costs, a million-yuan investment far exceeds traditional labor expenses.
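The break-even arithmetic behind this comparison can be sketched in a few lines. The function name is illustrative. The rate of 10 yuan per 10,000 reads and the one-million-yuan outlay are the figures cited above, and the calculation ignores every other cost and revenue source.

```python
def reads_to_break_even(investment_yuan, yuan_per_block, reads_per_block):
    """Reads needed for traffic revenue to match the investment.

    Revenue is earned at yuan_per_block for every reads_per_block reads.
    Integer ceiling division avoids floating-point rounding.
    """
    return -((-investment_yuan * reads_per_block) // yuan_per_block)

needed = reads_to_break_even(1_000_000, 10, 10_000)
print(f"{needed:,} reads")
```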

Beyond the return on investment, Zhang Dong (pseudonym), a long-time ToB sales professional, shared that many business owners he communicated with have little understanding of large models or how they can benefit daily operations. In contrast, they are very familiar with ERP software from vendors like Kingdee and Yonyou, with many having used such systems for years.

Even among those aware of large models, issues like mismatched use cases, data security risks, and skepticism about model capabilities persist. Particularly concerning is how some AI customer service models simply echo customer statements, making many business owners hesitant to adopt them. Overall, only large enterprises with substantial financial resources may be willing to invest in large models.

Reports from iResearch also indicate that energy and finance are currently the industries seeing faster commercialization of large models, primarily due to the concentration of state-owned enterprises in these sectors. These enterprises have robust data infrastructure, high computing power investments, and numerous well-established AI application scenarios, facilitating quicker integration with large models.

Image Source: iResearch

Regarding the newly released ChatGLM3 model, Zhang Peng mentioned that its pricing has reached the lowest level in China and is among the most competitive globally for large model APIs. However, with the reduction in computing costs during training and inference phases, prices for large models are expected to continue declining.

Take NVIDIA's H100 and A100 GPUs as an example. Public data shows that while the H100's computing power has increased by about 6 times compared to the A100, its price has only risen by approximately 3 times, which roughly halves the cost per unit of computing power. In other words, ChatGLM3's current pricing is unlikely to remain the lowest for long.

Moreover, the supply of large models keeps growing and more open-source players are entering the market, yet China's domestic market is projected to be worth only 12 billion yuan in 2024. In the short term, buyers will primarily be state-owned enterprises and central enterprises with strong financial capabilities and clear demand scenarios. With "many monks and little gruel," that is, too many suppliers chasing too little demand, the large model market is likely to become mired in price wars, much like the cloud and SaaS industries.

Image source: iResearch

This means that in addition to competing on technical capabilities, large model companies will need to strengthen their overall sales capabilities to win more orders in buyer tenders. However, sales may be a significant weakness for technology-focused companies like Zhipu AI, especially compared with vendors like Huawei and Alibaba, which have stronger customer bases.

Take Huawei as an example. The company started out in ToB services and has accumulated a large number of clients, including state-owned enterprises and central enterprises, with dedicated teams following up on their needs. When these clients require large models, Huawei can step in quickly. Moreover, in ToB sales, Huawei can spread costs through cross-selling. In other words, even if Huawei offers its Pangu model to clients at a low price, it can still profit through other means later.

Therefore, Zhipu AI may need to gradually shift from a technical mindset to a sales-oriented one. But will its internal technical staff be willing to accept this change?

A more pressing issue is that, although multiple rounds of capital investment have provided sufficient funding for Zhipu AI's R&D, this has also led to highly dispersed equity ownership. In the future, this could result in conflicts among shareholders with differing interests, including long-term strategies, short-term goals, industrial demands, and capital expectations. The complexity of these conflicts between shareholders and the company may lead to frequent disputes, slower market responsiveness, and prolonged decision-making processes. Balancing the interests of major shareholders will be a significant test of Zhang Peng's leadership skills.

Image Source: Tianyancha

Amid the price war in large models, Zhipu AI's revenue and profits are bound to be affected. As capital enthusiasm for large models cools, will there be a scenario where investors cash out and exit?

The road to commercializing large models on the ToB side is bound to be bumpy; after all, the SaaS and cloud industries have already served as cautionary tales. How to get through the cold winter before dawn, before large model commercialization truly takes off, is therefore a question that every large model company, Zhipu AI included, must think through.