With the Advent of Sora, Will Short Videos Really Improve?
-
At the beginning of last year, ChatGPT burst onto the scene, instantly igniting the global tech community. This year, the tech world has once again been rocked by major news—OpenAI has released its text-to-video model, Sora. Sora can generate videos up to 60 seconds long based on text prompts, and judging from the publicly shared samples, the results are nothing short of stunning.
From text generation to text-to-video generation, it has only been a year. No one could have anticipated how rapidly generative AI would develop and evolve in such a short time.
Unlike ChatGPT, the impact of Sora is not confined to the tech sector. The entire content industry, including film production and short videos, is paying unprecedented attention to Sora. As a text-to-video model, Sora's emergence seems tailor-made for the film and video industry. However, compared to long-form videos, short videos are currently feeling the pressure more acutely. Sora can already generate videos up to 60 seconds long, which can be directly uploaded to short video platforms.
With the arrival of Sora, will short video content and platforms undergo their biggest shake-up yet?
Will It Eliminate Content Trash or Trigger an Explosion of Low-Quality Content?
Sora isn't the first text-to-video model. AI video generation pioneers such as Runway and Pika had previously released models, but these were limited to about 10 seconds per generation. Sora's debut immediately pushed this limit to 60 seconds.
This breakthrough means Sora-generated videos can carry more information and richer content, fully meeting short-form video platform requirements. More impressively, sample videos demonstrate professional-level cinematography - whether in camera movement, composition, scene setting, or transitions between wide shots, medium shots, close-ups and details - surprising even industry experts with visual quality that may surpass content created by average human creators.
But is this development ultimately beneficial or detrimental to regular content creators? The answer depends on content type.
Currently, Sora appears most disruptive for content categories like viral challenges, landscapes, narratives, and trend-chasing pieces. These formats typically lack distinctive personal styles or tones, with weak interactivity and low audience loyalty. Using Sora for such content not only significantly reduces production costs but also outperforms human creators in speed and efficiency - making AI-generated content likely to dominate these categories once Sora becomes widely adopted as a tool. In contrast, content on short video platforms that exhibits strong personal style, originality, or emotional value cannot be easily replicated by competitors—nor can Sora readily generate such material.
For instance, emotional dramas, relationship mediation content, or similar genres—though increasingly exaggerated or scripted—still resonate because audiences project genuine feelings onto them, believing what they watch is authentic and moving. If they knew the content was AI-generated, it would naturally lose its appeal.
The advent of large text-to-video models disadvantages low-barrier, replicable, trend-chasing homogeneous content while favoring high-quality, distinctive, and irreplicable works that stand out. Yet on short video platforms, the former dominates, especially content catering to primal desires through endless replication, flooding platforms with garbage.
Models like Sora may give ordinary creators the tools to break free from homogeneity, letting those who have ideas but lack professional skills focus on innovation. Conversely, low-quality, vulgar content could also proliferate via AI. Such content inherently attracts massive traffic, and under profit-driven incentives it would spread faster and wider. When image-and-text self-media first emerged, spam accounts quickly rose to prominence, mass-producing cheap, low-quality content to cash in on the boom. Because short videos cost more to produce, it was difficult to capture traffic by running vast numbers of spam accounts, and this gray industry declined. With the emergence of Sora, however, spam accounts may get a chance to make a comeback.
An era of account farming seems to be stirring once again.
Are MCNs Being 'Killed,' and Is the Content Industry Chain Being Reshaped?
In recent years, as the content ecosystem of short video platforms has matured, video production has become more sophisticated and industrialized. The PGC (Professionally Generated Content) model has complemented the UGC (User-Generated Content) model that initially defined short videos, with the fusion of these two models driving content prosperity. The growth of the PGC model relies heavily on the rise of MCN (Multi-Channel Network) agencies, which platforms have vigorously supported. These agencies deeply involve themselves in content production, maintaining a steady supply of content through industrialized methods.
Industry insiders reveal that video production on Douyin (the Chinese counterpart of TikTok) has long been industrialized. The investment cost for an MCN to operate a commercially viable Douyin account is approximately 200,000 yuan, covering expenses such as filming, production, personnel costs, and traffic purchases, with the bulk of the expenditure going toward buying Dou+ exposure. MCN agencies have played a pivotal role in the production and monetization of short video content. However, in recent years, conflicts between MCNs and top influencers have become increasingly apparent, posing significant challenges to the MCN-dominated content industry chain. Now, with text-to-video AI models like Sora entering short video platforms, the first major change is a lower threshold for content shooting and production, which seems to be undermining the value proposition of MCN agencies.
In other words, if content creators can achieve their desired results without relying on MCN agencies, do they still need to be constrained by these organizations?
Take the popular micro-dramas from last year as an example. Currently, only a few micro-dramas on platforms manage to balance production quality with overall viewing experience. These are typically led by Douyin, Kuaishou, top MCNs, or small-to-medium film companies, often adapted from IPs of platforms like Midu or Tomato Novels, or created by established screenwriters.
Among these, MCN agencies stand out the most. The trending micro-drama charts on Kuaishou and Douyin show that leading regional MCNs still dominate the top 10 spots. With the advent of Sora, discussions have surged about the possibility of transforming novel scenes into videos, or even converting entire novels into video format. While this remains speculative, Sora can already "generate complex scenes with multiple characters, specific types of motion, and precise subject and background details," allowing individuals without any experience in film production or artistic design to create videos that match their descriptions. This capability clearly aids the production of micro-dramas, reducing the need to rely solely on the expertise of MCN agencies or film companies.
Small and medium-sized MCN agencies are more likely to be abandoned by short video creators. On short video platforms, many of these agencies lack the ability to incubate top influencers or accounts. Instead, they cast a wide net, relying on tactics like chasing trends or even imitation and plagiarism to mass-produce accounts and content.
Now, with the efficiency of text-to-video models rivaling that of small and medium-sized MCN agencies, why would creators still need to sign contracts with these agencies?
The rise of MCN agencies has elevated the professionalism of content on short video platforms. However, if ordinary creators can produce large amounts of professional content using text-to-video models, the survival of MCN agencies will undoubtedly be challenged. This could herald a new wave of content iteration.
Is Fake Content About to Flood the Internet?
When a viral video emerges, users rush to post similar content to compete for traffic, leading to homogenization on platforms. This saturation of repetitive content can diminish users' interest in short videos over time. This is one of the biggest issues facing short video platforms. Another major problem is the prevalence of fake content, where performances, exaggerations, or outright fabrications are increasingly common on platforms like Douyin and Kuaishou.
For example, emotional mediation live streams often feature scripted, sensational stories. Behind these broadcasts lies a well-established industry chain, complete with scriptwriters, actors, and even actor training services.
Health and medical science short videos are particularly problematic. A 2021 study by the Chinese Academy of Sciences found that over half of short videos in this category lack authoritative sources, with only 42% of creators ensuring that "all or most of their videos are backed by credible sources."
Text-to-video foundation models will undoubtedly revolutionize content production in terms of efficiency, quality, and cost. However, they also enable the rapid and inexpensive creation of online misinformation, making it increasingly difficult for users to discern authenticity. As models like Sora continue to evolve rapidly, AI-generated content will become more realistic, so convincing that even tech-savvy young adults might struggle to distinguish it from reality, let alone elderly users with limited digital literacy.
These concerns have already materialized in real-world incidents. In April last year, fake news about "a train collision killing 9 maintenance workers in Gansu this morning" surfaced. May saw fabricated short videos claiming "a loan shark poisoned 4 victims in Anqiu, Weifang, Shandong." Then in July, another AI-generated short video falsely reported "a major industrial fire in Shangyu, Zhejiang."
With AI demonstrating increasingly sophisticated capabilities, fabricating convincing misinformation has become disturbingly effortless.
This shifts the burden onto platforms like Douyin and Kuaishou. Beyond capitalizing on the content creation opportunities that text-to-video models present in the wake of Sora, these platforms must now implement stronger safeguards against AI misuse. The challenge demands more advanced technical solutions. Ironically, the most effective approach for detecting deepfakes may still rely on AI itself, essentially using artificial intelligence to identify artificial intelligence. OpenAI, for its part, has stated that it is conducting relevant research, including developing text classifiers, image classifiers, and other tools to detect misleading content.
However, domestic internet giants like ByteDance have already fallen behind. Zhang Nan, the former CEO of Douyin who moved over to Jianying, had originally planned to launch an AI-generated image and video product. But before the product could be released, the emergence of Sora and its stunning performance instantly multiplied the pressure on her. Of course, the question of when a Chinese version of Sora can be built weighs on all of the domestic internet giants.
Under the sudden impact of Sora, ByteDance and Kuaishou are running short of time to incubate the next AI video generation unicorn.