Who Has Taken the Lead in Reaping the Fruits of Large Models?
At the end of 2022, OpenAI reignited the fervor for artificial intelligence with ChatGPT, sparking a global competition centered around large models.
Today, however, the most discussed topics are no longer scale or computing power. Even as AGI has become an industry consensus, with leaders confidently stating that it could become reality within five years, the battle over large models has entered its second half: both investors and major corporations now care most about how to commercialize large models first.
On March 26, SenseTime Group (hereinafter "SenseTime") released its financial report for the fiscal year ending December 31, 2023. One notable highlight: in 2023, SenseTime's generative AI business generated revenue of 1.2 billion yuan, making it the fastest new business in the company's ten-year history to reach billion-yuan revenue scale.
As one of the most sought-after AI companies in China, SenseTime's progress in the generative AI business not only signifies its entry into a new era but may also provide valuable insights for peers seeking commercialization:
How did SenseTime achieve this?

A Commercialization Closed Loop
Compared with 2022, SenseTime's generative AI business revenue surged by 200% in 2023, increasing its overall proportion from 10% to 35%.
This growth is attributed to SenseTime's strategic focus on generative AI.
In 2023, SenseTime clearly divided its business into generative AI, traditional AI, and smart vehicles. Non-generative AI businesses previously categorized under smart cities, smart commerce, and smart living were merged into the traditional AI segment. SenseTime's vision and strategic goals have also shifted to "making AGI the core strategic objective, aiming for major breakthroughs in AGI technology in the coming years."
The reason for this shift is that generative AI demands highly concentrated investment. Under the guidance of the Scaling Law, substantial spending is the price of entry: media reports have estimated that ChatGPT alone consumes up to 500,000 kilowatt-hours of electricity daily. After OpenAI released ChatGPT, SenseTime emerged as one of the most responsive and persistent players in the field. On April 10 last year, it officially launched the "SenseNova" (Rixin) large model system. By February this year, SenseNova had undergone four iterations, reaching version 4.0, which reports indicate has achieved capabilities comparable to GPT-4 in multiple scenarios, including code writing, data analysis, and medical Q&A.
While continuous investment in large models forms the foundation, SenseTime's accurate judgment of real-world demands has been the golden touch driving the rapid development of its generative AI business.
Currently, AI has become a crucial competitive dimension across various industries, including smartphones, computers, social media, healthcare, and finance. For instance, in the smartphone industry, intelligent terminal models capable of understanding user commands and coordinating various applications to complete complex tasks have become key selling points.
However, due to the high costs associated with training general-purpose large models, most manufacturers prefer to access generative AI capabilities through direct API calls.
SenseTime's new 'Model as a Service' (MaaS) business model aligns squarely with this surging demand. By fine-tuning and deploying various generative AI capabilities on its large-scale infrastructure, SenseTime lets clients bypass building their own foundational systems, significantly reducing costs. Service generally falls into three scenarios: first, standard public-cloud access via API calls; second, private-cloud deployment, providing dedicated models and model licensing for clients with security requirements; and third, customized model services.
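To make the public-cloud scenario concrete, below is a minimal sketch of the API-call pattern such services follow. The payload fields and model identifier are illustrative assumptions, not SenseTime's actual interface.

```python
import json

# Hypothetical payload builder for a hosted large-model chat endpoint.
# Field names and the model identifier are assumptions for illustration;
# they do not reflect SenseTime's real API.

def build_chat_request(prompt: str, model: str = "example-chat-model") -> dict:
    """Assemble a JSON-serializable request body for a chat-completion call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_new_tokens": 256,
    }

body = build_chat_request("Summarize this customer's recent account activity.")
print(json.dumps(body, ensure_ascii=False))
```

In the private-cloud and customized scenarios, the same request shape would simply be pointed at a dedicated or fine-tuned deployment rather than the shared public endpoint.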
For example, several well-known banks, including China Merchants Bank and Bank of China, have adopted the SenseNova large model to build digital customer-service systems. Hospitals such as the First Affiliated Hospital of Zhengzhou University, Xinhua Hospital affiliated with Shanghai Jiao Tong University School of Medicine, and Ruijin Hospital affiliated with Shanghai Jiao Tong University School of Medicine have chosen the "Dayi" model to support real-world hospital scenarios, including medication consultation, patient follow-up, and clinical decision support.
The financial report notes that in the enterprise sector, over 70% of clients in the generative AI business are new customers acquired by SenseTime in the past year, while the remaining 30% of existing customers have increased their average spending by about 50%. On the consumer side, the SenseNova large model has driven a nearly 120-fold increase in API calls within half a year.
The development of generative AI has also spurred transformations in traditional AI businesses and the smart automotive sector. Taking the smart automotive business as an example, as the largest application scenario combining AI technology with traditional manufacturing, the influence of large models continues to grow. With Tesla rolling out the FSD v12 Beta version of its autonomous driving software in the U.S., the end-to-end technical solution based on large model architecture has become the optimal approach for next-generation autonomous driving. Thanks to the foundational capabilities of its in-house large model, SenseTime's 'Jueying' smart car business has seen rapid growth: mass production and delivery surged by 163% year-on-year, while revenue increased by 31%.
However, whether it's training large model capabilities or deploying them on the edge, both are long-term and challenging endeavors. So, where does SenseTime's confidence to make such significant investments come from?
The Turning Point of AI 2.0
In the context of the tech industry, AI is not a new term.
With the birth of the Transformer architecture in 2017 as the dividing line, AI can be split into two eras: the earlier one favored smaller models with fewer parameters, tailored to specific scenarios for particular capabilities, while the latter is more general and foundational. This does not mean that the experience and knowledge enterprises accumulated in the AI 1.0 era cannot play a role in the AI 2.0 era. On the contrary, SenseTime's past achievements in perceptual intelligence and decision intelligence have become key drivers of the rapid development of its generative AI business.
During the AI 1.0 era, SenseTime not only built a vast repository of algorithm models in computer vision—covering everything from visual signal analysis to digital content generation—but also independently developed technologies such as speech recognition (ASR), semantic understanding (NLP/knowledge graphs), speech synthesis (TTS), and speech-driven animation (STA). These capabilities significantly strengthen its foundational models' understanding of the physical world and multimodal processing.
For example, in the smart-device sector, SenseTime's optimizations in small models give it an edge over similarly sized 7B competitors such as Meta's Llama 2 and Google's Gemma. In 2023, Qualcomm and MediaTek showcased applications of SenseTime subsidiary HuiLing's generative AI edge-side models on their flagship chips, where SenseTime's 7B small model achieved an industry-leading inference speed of 16 tokens per second on Qualcomm's latest chipset.
Before diving into generative AI, SenseTime had already empowered multiple vertical industries spanning smart cities, smart business, smart automotive, and smart living, with more than 20 real-world application areas including smartphones, finance, and healthcare. This deep industry penetration allows SenseTime to better identify where generative AI is needed across sectors and how to tailor solutions accordingly. More importantly, SenseTime's forward-looking layout in infrastructure is now playing a tremendous role.
If the infrastructure of the industrial revolution era was electricity, railways, canals, and ports, then the infrastructure of the large model era is computing power represented by GPUs. OpenAI CEO Sam Altman once stated, "Computing power is the most important currency of this era." This concerns both cost and efficiency.
As early as 2018, SenseTime began building its own computing power centers and, on this basis, developed the SenseCore AI infrastructure. In 2022, SenseTime's AI Data Center (AIDC) in Shanghai's Lingang officially began operations, becoming one of the largest AI computing centers in Asia. By 2023, it had expanded to include new computing nodes in Shanghai, Shenzhen, Guangzhou, Fuzhou, Jinan, and Chongqing.
Performance reports show that SenseTime's total computing power has reached 12,000 petaFLOPS, doubling since the beginning of 2023, with its GPU fleet reaching 45,000 cards and supporting large-model training runs at the scale of 10,000 GPUs.
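As a rough sanity check on those headline figures (the numbers are from the text; only the division is added here), the totals imply the following average per-card throughput:

```python
# Back-of-the-envelope check: 12,000 petaFLOPS spread over 45,000 GPUs.
# Both headline numbers come from the report; only the arithmetic is ours.

total_petaflops = 12_000
gpu_count = 45_000

# 1 petaFLOPS = 1,000 teraFLOPS
per_gpu_teraflops = total_petaflops * 1_000 / gpu_count
print(f"~{per_gpu_teraflops:.0f} TFLOPS per GPU on average")
```

Roughly 267 TFLOPS per card is in the range of a modern training accelerator's mixed-precision throughput, so the two headline figures are at least mutually consistent.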
Computing power is only the computational layer of SenseTime's large-scale infrastructure. Above it sit two further architectural layers, the model layer and the deep-learning layer, corresponding to algorithm-model generation and algorithm-model training, respectively. To improve the efficiency of computing-power supply, SenseTime and its partners developed the DeepLink open computing system, on which various domestic chips can easily adapt to mainstream large-model training frameworks and algorithm libraries. This year, compatibility has expanded to mainstream domestic chips such as Huawei Ascend and Cambricon, supporting training, fine-tuning, and inference services for large models. The result is large-scale, high-efficiency, intensive computing infrastructure services with significantly improved computing-power utilization.
In simple terms, it reduces costs and increases efficiency. According to reports, SenseTime's large-scale infrastructure can maintain a 90% acceleration efficiency in large model training services, offering uninterrupted stable training for 30 days. The diagnostic recovery time in case of training interruptions has also been optimized to just half an hour.
Additionally, SenseTime's large-scale infrastructure supports parallel training of ultra-large models with 2 billion parameters across thousands of cards, and has added support for multimodal models and mixture-of-experts models.
This is why SenseTime can withstand the pressure and remain one of the few companies capable of rapidly iterating large models. Since its launch in 2023, the SenseNova large model has improved significantly every three months. According to Frost & Sullivan's "AI Large Model Market Research Report (2023)," SenseTime's AI large model ranked first domestically in 2023 in comprehensive competitiveness, covering product technology, strategic vision, and open-ecosystem construction.

A Future of Co-creation
IDC's latest 2024 V1 edition of the Worldwide Artificial Intelligence and Generative AI Spending Guide reveals that the AI industry is experiencing rapid growth in both investment scale and market size.
In 2022, global IT investment in artificial intelligence (AI) reached $132.49 billion; it is projected to grow to $512.42 billion by 2027, a compound annual growth rate (CAGR) of 31.1%. Notably, generative AI technology is expected to account for 33.0% of China's AI market investment by 2027.
IDC also highlights that the generative AI market could achieve a CAGR of 85.7%, with the global generative AI market size approaching $150 billion by 2027.
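The quoted growth rate can be checked against the standard CAGR formula; the figures below are IDC's from the text, and only the arithmetic is added here.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, end value, and horizon."""
    return (end / start) ** (1 / years) - 1

# IDC figures quoted above: $132.49B (2022) growing to $512.42B (2027).
rate = cagr(132.49, 512.42, 5)
print(f"{rate:.1%}")  # consistent with the quoted 31.1% CAGR
```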
However, as an article published by Sequoia China last year predicted, the focus of the current AI wave is on leveraging new technologies to solve real-world problems end-to-end: model capabilities and commercialization paths are two sides of the same coin. This requires both ecosystem co-creation and individual effort, and that is precisely where SenseTime's potential lies.
According to financial reports, SenseTime's large-scale infrastructure has demonstrated cost-efficiency capabilities while empowering leading enterprises across multiple sectors—including industry benchmarks like Xiaomi and China Literature, as well as top-tier institutions such as Shanghai Jiao Tong University.
The deep synergy between "large-scale infrastructure + large models" allows SenseTime to maintain technological leadership while transferring these capabilities to other industries. SenseTime has announced that it will unveil version 5.0 of the SenseNova large model at its April technology exchange event, with multimodal capabilities expected to rival GPT-4V.
In simple terms, SenseTime currently serves as both the "electricity provider" and the "railway builder."
As Xu Li, Chairman and CEO of SenseTime, stated: "Generative AI is no longer just a transformative innovation in technology for SenseTime—it has become the company's core business. The growth of SenseTime's generative AI business stems from widespread demand across industries for large model training and inference, marking the official launch of a new cycle in China's hard-tech investments. By deeply integrating generative AI capabilities across all business layers, SenseTime is acquiring new clients while driving comprehensive improvements in efficiency and productivity."
The only thing SenseTime needs to do is stay the course.