Towards 2024: How Should We Think About AI Startup Investments

baoshi.rao wrote:

    In December 2023, Nature magazine released its annual '10 People Who Mattered in Science' list, which for the first time in history included a 'non-human'—ChatGPT. Nature pointed out: 'Although ChatGPT is not an individual and does not fully meet the selection criteria, we decided to make an exception to acknowledge that generative artificial intelligence is fundamentally changing the trajectory of scientific development.'

    ▲ Image source: Nature

    On the technological landscape of 2023, generative AI undoubtedly marks a crucial turning point. Its development has not only garnered widespread attention across industries but has also profoundly influenced the global economy, societal structures, and even our expectations for the future.

    This is an AI revolution that everyone can participate in. From the continuous advancement of large language models to the broad application of AI technologies across various sectors, and the ongoing competition between open-source and closed-source strategies, every step in AI's progress is sketching the contours of future trends.

    Facing the surging wave of technological advancement, the Chinese government has introduced a series of supportive policy measures, including the "14th Five-Year Plan for National Informatization" and the "Guidance on Accelerating Scenario Innovation to Promote High-Quality Economic Development through High-Level AI Applications". China's AI industry has expanded rapidly, giving rise to a group of internationally competitive AI enterprises.

    As the year draws to a close, we review the development of generative AI in 2023, discussing its impact on humanity, industry landscape and future trends, as well as entrepreneurial and investment opportunities. This is not only a retrospective on the past year's developments in the AI field but also a reflection on the direction of AI's future progress.

    / 01 /

    What New Changes Did the AI Field See in 2023?

    Viewed from an industry perspective, AI's development to date can be divided into two stages: **Stage 1.0 focused mainly on analysis and judgment, while Stage 2.0 emphasizes generation.** The representative models of Stage 2.0 are large language models and image generation models, with two algorithmic models, the Transformer and the Diffusion Model, driving the development of generative AI.

    For most of 2023, OpenAI's products consistently topped the performance charts for large language models, particularly after the release of GPT-4 in March, which left competitors far behind. However, Google successfully launched its latest large language model, Gemini, in December, creating a competitive duopoly with GPT-4.

    In the field of AI, the open-source model community has remained active. Building on Meta's (formerly Facebook) open-source large language models LLaMA and Llama 2, the community has carried out intensive research and engineering iteration: for example, trying to match the capabilities of larger models with smaller ones, supporting longer contexts, and adopting more efficient algorithms and frameworks for model training.

    Multimodality (images, videos, and other media forms) has become a research hotspot in the AI field. It involves both input and output: input means enabling language models to understand the information contained in images and videos, while output means generating media beyond text, such as text-to-image. Considering that humans' capacity to generate and collect data is limited and may not sustain AI training over the long term, future training of language models may need to rely on data synthesized by AI itself.

    In the field of AI infrastructure, Nvidia has become an industry leader and joined the $1 trillion market capitalization club due to the enormous demand for its GPUs. However, it also faces fierce competition from traditional rivals like AMD and Intel, as well as major players like Google, Microsoft, OpenAI, and emerging language model innovators.

    In addition to large models, there is a strong demand for various types of AI applications in the industry. Generative AI has made significant progress in multiple fields such as images, videos, programming, voice, and intelligent collaboration applications.

    Global users have shown great enthusiasm for generative AI. ChatGPT reached 100 million monthly active users in just 2 months. By comparison, the super apps of the smartphone era, despite significant promotional budgets, took far longer to hit the same milestone: TikTok took 9 months, Instagram 2.5 years, WhatsApp 3.5 years, and YouTube and Facebook 4 years each.

    ▲ Time taken for different types of tech applications to reach 100 million monthly active users.

    Image source: 7 Global Capital

    Venture capital firms are also investing heavily to support advancements in the AI field. According to statistics from the US investment firm COATUE, as of November 2023, venture capital investors have poured nearly $30 billion into the AI sector. Approximately 60% of this investment has gone to emerging large language model companies like OpenAI, about 20% to the infrastructure supporting and delivering these models (AI cloud services, semiconductors, model operation tools, etc.), and around 17% to AI application companies.

    ▲ Image source: COATUE

    Before the ecosystem of truly valuable AI applications flourishes, this investment logic of betting on core technology sources and 'shovel-selling' companies holds some merit. However, the currently thriving AI applications are equally a source of value creation and the vast frontier we aspire to explore.

    ▎Multiple technological breakthroughs emerge in the field of multimodal generation

    In 2022, following the open-sourcing of Stable Diffusion, we witnessed a surge in 'text-to-image' products; that year can be considered the year the problem of image generation was essentially solved.

    Then in 2023, significant advancements were made in AI technologies for recognizing and producing audio. Today, AI's speech recognition and synthesis technologies have become highly sophisticated, making synthetic voices nearly indistinguishable from human ones.

    With the continuous development of technology, video generation and processing will be the next major focus of AI advancement. There have already been several breakthroughs in 'text-to-video' generation, showcasing AI's potential in video content creation. With emerging AI video models and applications such as Runway Gen-2, Pika, and Stanford's W.A.L.T, users can now generate video clips simply by describing the desired scene.

    NVIDIA's renowned engineer Jim Fan believes that in 2024, AI is highly likely to make significant progress in the video domain.

    ▲ Image source: X.com

    If we consider different forms of media formats from another dimension, a two-dimensional image becomes a video when a time dimension is added. If a spatial dimension is added, it becomes 3D. By rendering 3D models, we can obtain more precisely controllable videos. In the future, AI may gradually conquer 3D models as well, but that will require more time.

    "Compression as Intelligence"

    In 2023, OpenAI's Chief Scientist Ilya Sutskever publicly shared a perspective that can be summarized as "compression as intelligence": the idea that the higher a language model's compression ratio over text, the greater its level of intelligence.

    While "compression as intelligence" may not be entirely rigorous, it offers an explanation that aligns with human intuition: the most extreme compression algorithms, in order to compress data to the utmost, must necessarily achieve a full understanding and abstract higher-level meaning.

    Taking Llama2-70B, a language model developed by Meta, as an example, it is the 70-billion-parameter version of the Llama2 model and one of the largest open-source language models currently available.

    Llama2-70B uses approximately 10T (10 trillion) bytes of text as training data. The trained model is a 140GB file, with a compression ratio of about 70 times (10T/140G).

    In daily work, we usually compress large text files into Zip files, with a compression ratio of about 2 times. In comparison, the compression strength of Llama2 is remarkable. Of course, Zip files are lossless compression, while language models are lossy compression, so they are not directly comparable.
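
    A quick back-of-the-envelope check of these figures (a sketch using the article's own rough numbers, not official Meta statistics):

    ```python
    # Back-of-the-envelope check of the compression ratios discussed above.
    # The byte counts are the article's rough figures, not official numbers.

    training_text_bytes = 10e12   # ~10 TB of training text (article's estimate)
    model_file_bytes = 140e9      # ~140 GB Llama2-70B weight file

    llm_ratio = training_text_bytes / model_file_bytes   # lossy "compression"
    zip_ratio = 2                                        # typical lossless ratio for plain text

    print(f"Llama2-70B: ~{llm_ratio:.0f}x (lossy)")
    print(f"Zip:        ~{zip_ratio}x (lossless)")
    ```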

    ▲ Screenshot shared by OpenAI's Andrej Karpathy.

    Image source: Web3 Sky City

    The remarkable part is that a 140GB file can hold this much human knowledge and intelligence. Most laptops can store a 140GB file, and when a laptop has sufficient computing power and VRAM, a C program of roughly 500 lines is enough to run the large language model locally.

    / 02 /

    Open Source Ecosystem and the Traffic Tax of Large Language Models

    ▎Open research and the open-source ecosystem are important forces driving the development of AI

    ▲ The open-source ecosystem promotes AI technology innovation. Image source: Coatue.com

    Open research is the foundation of AI technology development. The world's top scientists and engineers publish numerous papers on sites like arXiv, sharing their technical practices. Whether it's the early AlexNet convolutional neural network, Google's Transformer that laid the algorithmic foundation, or the model practice papers published by companies like OpenAI and Meta, these represent significant research and technology breakthroughs that lead the development of AI.

    The development and iteration of open-source communities are especially worth paying attention to. With the support of open-source large language models, researchers and engineers can freely explore various new algorithms and training methods. Even closed-source large language models can learn from and draw inspiration from the open-source community.

    It can be said that open-source communities have achieved a certain degree of technological democratization, allowing people worldwide to share the latest advancements in the field of AI.

    The 'Traffic Tax' of Large Language Models

    Returning to the essence of business, the training costs of large language models are extremely high. Taking GPT as an example, according to statistics from Yuanchuan Research Institute, training GPT-3 cost over $10 million, while training GPT-4 exceeded $100 million. The cost for the next-generation model could potentially reach $1 billion. Additionally, the computational and energy consumption required to run these models and provide services to the public is also very expensive.

    The business model for large language models is MaaS (Model as a Service), where intelligence output is billed by the volume of input and output text, measured in tokens. Given the high training and operating costs of large language models, token fees are likely to rise accordingly.

    Image source: openai.com

    Taking OpenAI as an example, the image above shows part of the official pricing scheme for model API usage. Rough estimates suggest that for median-level GPT-3.5 Turbo API usage, an app company would need to pay OpenAI approximately ¥0.20 RMB per daily active user (DAU). Extrapolating this, if an app with tens of millions of DAUs integrates GPT APIs, the daily traffic cost could reach ¥2 million RMB.
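
    As a rough sketch of how such a per-DAU estimate can be made, the snippet below works backward from assumed token prices and per-user usage; every number is an illustrative placeholder chosen to land near the article's ¥0.2 figure, not official pricing.

    ```python
    # Illustrative per-DAU "traffic tax" estimate for an app that calls an LLM API.
    # Every number below is an assumption chosen for the sketch, not official pricing.

    price_per_1k_input_usd = 0.001    # assumed input-token price (USD per 1K tokens)
    price_per_1k_output_usd = 0.002   # assumed output-token price (USD per 1K tokens)
    usd_to_rmb = 7.2                  # assumed exchange rate

    daily_input_tokens = 14_000       # assumed tokens sent per active user per day
    daily_output_tokens = 7_000       # assumed tokens generated per active user per day

    cost_usd = (daily_input_tokens / 1000) * price_per_1k_input_usd \
             + (daily_output_tokens / 1000) * price_per_1k_output_usd
    cost_rmb = cost_usd * usd_to_rmb

    print(f"Per-DAU daily API cost: ~¥{cost_rmb:.2f} RMB")               # ~¥0.20
    print(f"At 10,000,000 DAU: ~¥{cost_rmb * 10_000_000:,.0f} RMB/day")  # ~¥2,000,000
    ```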

    Image source: WeChat Official Account @AI Empowerment Lab

    The traffic pricing for domestic large models is as shown in the figure above, which is roughly equivalent to OpenAI's pricing. Some small and medium-sized models are cheaper, but there is a performance gap.

    Traffic costs will influence how AI applications design their business models. To reduce the burden of traffic costs, some startups consider leveraging the capabilities of the open-source ecosystem to build their own small or medium-sized models to handle most user demands. If user requests exceed the capabilities of these smaller models, they then call upon large language models.
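
    The cascading setup described above might look roughly like the sketch below; the two model functions are placeholders standing in for a self-hosted small model and a paid large-model API, not real SDK calls.

    ```python
    # Sketch of the cascade described above: try a cheap self-hosted small model
    # first and fall back to a paid large-model API only when confidence is low.
    # Both model functions are placeholders, not real SDK calls.

    def small_model_answer(query: str) -> tuple[str, float]:
        """Locally deployed small/medium model; returns (answer, confidence)."""
        return f"draft answer to: {query}", 0.62

    def large_model_answer(query: str) -> str:
        """Paid large-language-model API call."""
        return f"high-quality answer to: {query}"

    def route(query: str, confidence_threshold: float = 0.8) -> str:
        answer, confidence = small_model_answer(query)
        if confidence >= confidence_threshold:
            return answer                    # cheap path, no per-token "traffic tax"
        return large_model_answer(query)     # expensive path, only when needed

    print(route("Summarize today's sales report."))
    ```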

    These small and medium-sized models may be deployed directly on end-user devices as 'edge models'. Edge models place heavy demands on hardware integration. In the future, our computers and smartphones may more widely incorporate chips such as GPUs, making it possible to run small models locally on the device. Google and Microsoft have already released small models that can run on edge devices; Nano, the smallest version of Google's Gemini model, is designed specifically to run locally and offline on mobile devices.

    / 03 /

    How is AI Impacting Human Society?

    ▎Every technological revolution brings new efficiency tools

    There have been several major technological revolutions in human history. The First Industrial Revolution, which emerged around 1760, introduced mechanical equipment; the Second Industrial Revolution after 1860 brought electrical equipment; and after 1970, we experienced three more waves of innovation—computer software, the PC internet, and smartphones—collectively referred to by some as the Third Industrial Revolution, or the Information Revolution.

    The generative AI revolution that began in 2023 may be called the Fourth Industrial Revolution, as we have created new intelligence. Generative AI is a new tool for human cognition and transformation of the world, and it has become a new abstract tool layer.

    Historical experience shows that every technological revolution significantly enhances human productivity. After the First and Second Industrial Revolutions, two abstract tool layers were added on top of the natural world: mechanical and electrical equipment. In the 1970s, the information technology revolution, represented by computers, introduced a new abstract layer—software. Through software, people began to understand, transform, and interact with the world in a more efficient way. Later, the rise of the PC internet and smartphones further advanced software technology.

    How does AI affect people's work?

    While focusing on the efficiency improvements brought by AI, we must also pay attention to how machines are replacing human jobs. According to statistics, before the first Industrial Revolution in Britain, the agricultural population accounted for about 75%, which dropped to 16% after the revolution. After the Information Revolution in the U.S., the industrial population decreased from 38% to 8.5%, with most of those workers transitioning to white-collar jobs. This time, the AI-driven intelligent revolution is primarily impacting white-collar workers.

    With advancements in AI technology, the organizational forms and collaboration methods in the business world may undergo a series of changes.

    First, companies may trend toward smaller sizes. Business outsourcing could become very common. For example, companies might outsource R&D, marketing, and other functions.

    Next comes workflow reconstruction, meaning that Standard Operating Procedures (SOPs) may change. Because individuals differ in capability and energy, optimized workflows let people work more efficiently and focus on their own responsibilities. Researchers are exploring how human workflows should be adjusted when AI takes over certain functions. Today's language models also have room to improve in efficiency and capability, and they may likewise require workflow orchestration to realize their collaborative potential.

    Beyond technical skills, developing complementary abilities has become crucial. For instance, refining aesthetic judgment and taste allows AI to better assist you in generating superior solutions or creative works. Similarly, strengthening critical thinking skills helps you better evaluate and discern AI-generated content.

    We should more actively utilize AI, treating it as an assistant in work and life, or as a co-pilot, to fully leverage its potential and advantages.

    AI has its boundaries

    Amid AI's current rapid development, many have raised concerns about the threats it poses and worry about its negative impacts on humanity. Indeed, humans have invented tools that appear to be smarter than themselves, and keeping such 'silicon-based life forms' under control is undoubtedly a significant challenge. Scientists are working to address this issue, and OpenAI has also published papers discussing similar problems.

    However, we shouldn't be too pessimistic. At least for now, the degree of digitalization in human society can limit the boundaries of AI's capabilities.

    Today's large language models are primarily trained on vast amounts of text data. Text has a high degree of digitalization and, after human abstraction, carries high information density, which makes AI training very effective.

    But once outside the text domain, AI's intelligence faces many limitations because it hasn't been trained on corresponding data. So for now, we don't need to worry too much—AI isn't as powerful or comprehensive as some fear. We have ample time to familiarize ourselves with and adapt to it, finding ways to coexist harmoniously with this silicon-based intelligence.

    / 04 /

    Looking Ahead to 2024: How Will Large Language Models and AI Applications Evolve?

    Leading Large Language Model Camp

    Globally, large language models exhibit distinct regional development characteristics. For instance, the U.S. and China each follow unique development paths. In the U.S., the leading large language model camp has largely been established, primarily concentrated within a few major tech companies or their collaborations with top model startups. It can be said that the U.S. AI field has entered a high-cost arms race phase, making it difficult for new entrants to break into the market.

    China's large language models are showing a flourishing landscape, with over a hundred projects claiming to be in development. China may rely more on the open-source ecosystem to develop new language models through secondary development.

    Currently, no country outside the United States has developed a large language model equivalent to GPT-4. In the field of large model technology, there is still a gap between China and the United States.

    But the global competition in AI is far from over. For China, the most important thing is to vigorously develop the AI application ecosystem. In the internet and digital economy era, China excelled in applications and exported those application practices overseas. Keeping pace with the latest large-model technologies and then pushing for technological breakthroughs once the application ecosystem flourishes may be a viable path.

    How will large language models develop?

    Although numerous technological breakthroughs have been achieved in the field of large language models, there are still many areas that can be iterated and improved, such as reducing "hallucinations," increasing context length, achieving multimodality, embodied intelligence, performing complex reasoning, and enabling self-iteration.

    First, let's discuss the phenomenon of "hallucinations." Hallucinations can be understood as incorrect outputs, which Meta defines as "confident falsehoods." The most common cause of hallucinations is insufficient density in the knowledge or data collected by the language model. However, hallucinations can also be seen as a manifestation of creativity, much like how poets can write beautiful verses after drinking—AI hallucinations might also bring us fascinating content.

    There are many ways to reduce hallucinations, such as training with higher-quality corpora, improving model accuracy and adaptability through fine-tuning and reinforcement learning, and incorporating more contextual information into the model's prompts to help it understand and respond to questions more accurately.
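
    The last approach, putting more reference context into the prompt, can be sketched roughly as follows; the retrieval step and the prompt wording are placeholders, not a specific vendor's API.

    ```python
    # Rough sketch of grounding an answer with retrieved reference material to
    # reduce hallucinations. The retrieval function is a placeholder for a search
    # over a trusted document store.

    def retrieve_passages(question: str, k: int = 3) -> list[str]:
        """Placeholder: return the k most relevant passages for the question."""
        return ["passage 1 ...", "passage 2 ...", "passage 3 ..."][:k]

    def build_grounded_prompt(question: str) -> str:
        context = "\n\n".join(retrieve_passages(question))
        return (
            "Answer the question using ONLY the reference material below. "
            "If the material does not contain the answer, say you don't know.\n\n"
            f"Reference material:\n{context}\n\n"
            f"Question: {question}\nAnswer:"
        )

    print(build_grounded_prompt("What was the company's 2023 revenue?"))
    ```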

    Second, increase context length. Context length is akin to the brain capacity of a language model, currently typically 32K, with a maximum of 128K—equivalent to less than 100,000 Chinese characters or English words. If we want language models to understand complex linguistic texts and handle intricate tasks, this length is still far from sufficient. The next generation of models will likely focus on expanding context length to enhance their ability to tackle complex tasks.

    Third is multimodality. Humans primarily rely on vision to acquire information, whereas current language models mainly depend on textual data for training. Visual data can help language models better comprehend the physical world. In 2023, visual data was incorporated into model training at scale. For example, GPT-4 introduced multimodal data, and Google's Gemini model is said to have utilized vast amounts of image and video data. From the performance shown in Gemini's demo videos, its multimodal interaction appears to have improved significantly, though enhancements in complex reasoning and other intellectual capabilities are not yet evident.

    The fourth is embodied intelligence, which refers to an intelligent system based on a physical body for perception and action, capable of acquiring information from the environment, understanding problems, making decisions, and taking action. This concept is not that complicated; all living organisms on Earth can be considered embodied intelligence. For example, humanoid robots are also regarded as a form of embodied intelligence. Embodied intelligence essentially extends AI with movable "limbs."

    The fifth is complex reasoning. Typically, GPT provides answers in one go, without obvious multi-step reasoning or iterative backtracking. However, when humans think about complex problems, they often list steps on paper and repeatedly deduce and calculate. Researchers have explored methods, such as leveraging thinking models like the Tree of Thoughts, to teach GPT how to perform complex multi-step reasoning.

    Finally, there's self-iteration. Currently, language models primarily rely on humans to design algorithms, provide computing power, and feed them data. Looking ahead, can language models achieve self-iteration? This may depend on new model training and fine-tuning methods, such as reinforcement learning. It is said that OpenAI is experimenting with a training method codenamed "Q*" to explore how AI can self-iterate, but the specific progress remains unknown.

    Large models are still in a period of rapid development with significant room for improvement. Beyond the points mentioned above, there are many areas that need to be addressed and enhanced, such as interpretability, improving safety, and ensuring that outputs align more closely with human values.

    Future Application Software—AI Agent

    In September 2023, Sequoia Capital published an article titled Generative AI’s Act Two on its official website, stating that generative AI has entered its second phase. The first phase primarily focused on the development of language models and their surrounding simple applications, while the second phase shifts the focus to researching and developing intelligent new applications that truly address customer needs.

    Future application software may gradually transition to AI Agents—intelligent software capable of autonomously executing tasks, making independent decisions, actively exploring, self-iterating, and collaborating with others. Existing traditional software may require corresponding adjustments and improvements. Compared to traditional 1.0 version software, AI Agents can provide a more realistic, high-quality one-on-one service experience.

    However, the difficulty in developing AI Agents lies in the fact that language models are currently too immature and unstable. To deliver a good application experience, it's necessary to supplement language models with smaller models, rule-based algorithms, and even human services in certain critical steps, thereby providing stable performance in vertical scenarios or specific industries.

    Multi-agent collaboration has become a hot research direction. Based on standard operating procedures, multiple collaborating AI Agents can achieve better results than individually calling language models. There's an intuitive explanation for this: each Agent may have its own strengths, weaknesses, and specialized focus—similar to human division of labor. When combined, they perform their respective roles through new standard operating procedures (SOPs), inspiring and supervising each other in collaboration.
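
    As a toy illustration of the SOP idea, the sketch below wires three narrow roles into a fixed pipeline; the plain functions stand in for LLM-backed agents, and the roles are invented for the example.

    ```python
    # Toy sketch of SOP-style multi-agent collaboration: each "agent" has a narrow
    # role, and a fixed pipeline passes work from one to the next. The functions
    # stand in for LLM-backed agents.

    def planner(task: str) -> list[str]:
        """Break the task into ordered steps."""
        return [f"research: {task}", f"draft: {task}", f"polish: {task}"]

    def worker(step: str) -> str:
        """Execute one step and return its output."""
        return f"output of [{step}]"

    def reviewer(outputs: list[str]) -> str:
        """Check and merge the workers' outputs."""
        return "APPROVED\n" + "\n".join(outputs)

    def run_sop(task: str) -> str:
        steps = planner(task)                  # agent 1: plan
        results = [worker(s) for s in steps]   # agent 2: execute each step
        return reviewer(results)               # agent 3: review and assemble

    print(run_sop("write a product launch announcement"))
    ```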

    / 05 /

    Entrepreneurship and Investment Opportunities

    ▎In non-consensus areas, do what's right rather than easy

    In a new era, as a startup, it is essential to seriously contemplate what native new business model opportunities arise from this technological revolution. At the same time, one must consider which opportunities belong to newcomers and which belong to existing industry leaders.

    We can look back at the two technological transformations of PC internet and smartphones to see how they created new opportunities.

    During the PC internet era, the primary capability provided was connectivity—PCs, servers, and other devices around the world became interconnected. The native new models born in the PC era included search engines, e-commerce, and social communication, giving rise to leading companies like BAT (Baidu, Alibaba, and Tencent) across various industries.

    In the smartphone era, the primary capability provided is that most people own a mobile device equipped with features like mobile internet, GPS, and cameras. This foundational condition has enabled new models such as the sharing economy, instant messaging, short video sharing, and mobile financial payments. Industry leaders from the previous era had strong first-mover advantages, capturing many of these new opportunities—for example, Tencent and Alibaba created WeChat and Alipay, respectively. However, we've also witnessed the remarkable success of newcomers like Meituan, Douyin, and Didi. How did they achieve this?

    I believe the key to their success lies in focusing on non-consensus areas and doing what's right rather than what's easy.

    Take Meituan and Douyin as examples. Meituan's chosen native new model is called "Food Delivery," which belongs to the "O2O (Online to Offline)" part of the "Sharing Economy." On the left are numerous restaurants, and on the right are various consumers, with thousands of delivery riders in between. This is a "heavy model," but early internet giants preferred and excelled at "light models." Entering the food industry was a "non-consensus" move. The fulfillment service chain for food delivery is too long and difficult to digitize, making it hard to operate with precision. However, Meituan ultimately succeeded, and these challenges became its greatest core advantages and competitive barriers.

    Now look at Douyin. Its chosen native new model is called "Short Video Sharing," which was part of the then-popular "Creator Economy." Douyin's biggest "non-consensus" move was bridging the gap between the video creator economy and trillion-scale e-commerce GMV, achieving large-scale and efficient conversion.

    Before the rise of e-commerce livestreaming, there were two types of livestreaming: one was gaming livestreaming, and the other was influencer livestreaming, with monetization primarily relying on viewer donations. This monetization model had a very small economic scale and couldn't accommodate so many talented creators. However, Douyin, through recommendation algorithms, developing creator and merchant ecosystems, establishing the Douyin Shop closed loop, and optimizing content-to-e-commerce conversion, successfully built this massive commercial loop of converting content into e-commerce. Once this was achieved, Douyin could invite the largest number of top creators nationwide to produce content on its platform, rewarding them with significant e-commerce sales revenue.

    As a result, when Douyin's international version, TikTok, expanded overseas, many local short-video and livestreaming platforms couldn't compete. This is because TikTok isn't just a video content platform connecting creators on one side and consumers on the other—it's a new hybrid of creator economy and massive e-commerce GMV conversion, a novel entity with composite competitive advantages.

    In summary, startups should dare to choose and enter non-consensus fields, striving to succeed in challenging environments.

    Entrepreneurial Directions and Key Points

    From the perspective of entrepreneurial directions, the large model domain is dominated by tech giants, making it unlikely to be the first choice for entrepreneurs. Between large models and applications lies a 'middle layer', primarily consisting of infrastructure, application frameworks, model services, etc. This layer is susceptible to pressure from both models and applications, with many areas already crowded by major players, leaving limited space for startups.

    In summary, we tend to believe that, considering the current technological and commercial environment, we should vigorously develop the AI application ecosystem.

    The above shows the generative AI-related startups we have invested in, including: a new DevOps platform designed for language models, a social gaming platform, intelligent companionship services, AI-assisted RNA drug development, automated store marketing, a global intelligent commercial video SaaS, a new online psychological counseling platform, and a remote hiring platform for engineers between China and the US, among others.

    We have summarized several key points for entrepreneurship in the AI application field:

    First, create high-quality native applications. Leverage the new capabilities offered by the AI era—intelligent and artistic creation—to deliver unique and superior native application experiences. This is no easy task. As mentioned earlier, the intelligence of language models is not yet mature or stable, with clear limitations. Startups may need to focus on relatively niche scenarios, employing various technical and operational methods to achieve a great user experience.

    Second, embrace non-consensus, forward-thinking, and disruptive approaches. Non-consensus means not following the crowd in choosing your path but daring to enter challenging areas—doing what is right rather than what is easy. Forward-thinking involves selecting ambitious business and technical routes.

    For example, adopt more advanced, still-maturing technical architectures: entrepreneurs should prioritize building Agents over Copilots, since Copilots are better suited to industry leaders (think Microsoft and GitHub). Additionally, startup teams could consider designing applications around the capabilities of next-generation language models like GPT-5.

    Disruptive innovation refers to creating a significant impact on the targeted industry, such as revolutionary product experiences or transforming existing business models. The advantage of such disruption is the potential to outpace industry leaders. For instance, Fengrui Capital's investment in Babel (Babel Technology) leverages emerging trends like 'Serverless' and large language models to redefine software development tools and production factors, enabling AI to handle programming, debugging, deployment, and operations.

    Third, focus on user growth and commercialization potential. The importance of user growth potential is easy to understand—even if you start with a niche market, you can eventually scale it into something much larger.

    Why should we focus on commercialization in the early stages?

    This brings us back to the 'traffic tax' of large models we mentioned earlier. If you choose to integrate with a large model, from the very first day of your startup, you will have to pay this traffic tax to the large model.

    For consumer-facing applications, there are typically three main paths to commercialization at scale: direct user payments (such as in games and premium services), advertising, and e-commerce. Only a very few applications can successfully build an e-commerce business (like Taobao or Douyin). It's challenging for new apps to charge users directly, as most entrepreneurs are hesitant and prefer more indirect approaches, aiming to monetize through advertising after achieving significant user growth.

    From the smartphone era's perspective, apart from e-commerce apps, China's top general information apps likely earn between 0.1 to 0.3 yuan per daily active user from advertising—this already represents the peak of advertising monetization. For average-sized apps, the revenue might not even reach 0.1 yuan.

    As previously discussed regarding the "traffic tax" of language models, the daily cost per user is about 0.2 yuan, which advertising revenue typically struggles to cover. The larger the user base, the more severe the losses become—unless measures like edge-side models are implemented to reduce this "traffic tax."
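
    Restating the article's own figures side by side makes the mismatch concrete (a sketch using only the numbers cited above):

    ```python
    # The article's figures, side by side: per-DAU ad revenue vs. the per-DAU
    # "traffic tax" of calling a large-model API.

    ad_revenue_per_dau = (0.1, 0.3)   # RMB/day, range cited for top information apps
    traffic_tax_per_dau = 0.2         # RMB/day, rough per-DAU API cost cited earlier
    dau = 10_000_000

    best = (ad_revenue_per_dau[1] - traffic_tax_per_dau) * dau
    worst = (ad_revenue_per_dau[0] - traffic_tax_per_dau) * dau

    print(f"Best case margin:  ¥{best:,.0f} RMB/day")    # about  ¥1,000,000
    print(f"Worst case margin: ¥{worst:,.0f} RMB/day")   # about -¥1,000,000
    ```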

    Therefore, AI applications may need to prioritize charging users directly in their business model design. Of course, in this new era of AI intelligence, perhaps entrepreneurs can find other commercialization avenues beyond the three models mentioned. Let's wait and see.

    Fourth, seize the dividends of macro trends. It's essential to anticipate and capitalize on China's macro trends, such as cross-border e-commerce, video commerce, and the engineer dividend. We must strive to capture the β of our era.

    Fengrui Capital's portfolio company, Tekan Technology, is also seizing opportunities in new trends like China's cross-border e-commerce and innovative video commerce. It aims to build a world-class commercial video SaaS platform through product innovation, empowering overseas video entrepreneurs and merchants.

    Fifth, maintain a safe distance from large models and have your own business depth.

    The concept of a safe distance should be familiar to many, with several well-known negative examples overseas. For instance, some commercial application companies that generate copywriting have achieved rapid growth that was "short-lived," ultimately unable to escape the dual impact of large models and other startups. Additionally, the business depth of a startup project is crucial. This business depth refers to areas that large models cannot reach, particularly scenarios that are difficult to digitalize or not fully digitalized.

    Of course, the most important factor is the team—strong technical skills and team members who understand the industry and scenarios, embodying the principle of "technology first, scenarios foremost."
