What More Can AI's 'Magic Box' Unlock in 2024?

baoshi.rao

In 2023, GPT models opened the 'magic box' that let ordinary people step into the world of AI, and ignited an 'arms race' in large models among tech companies worldwide.

With the emergence of large models for text, images, and video, application developers have been innovating fiercely, contributing to a shortage of AI chips. Debates over 'AI replacement' and over the technology's risks continue to unfold. On one side, AI represents humanity's fervent pursuit of advanced productivity; on the other, it reflects concerns about new technologies.

Looking back from 2024, AI is still far from replacing humans, and technological bottlenecks have become apparent: chip technology limits the speed at which AI can progress toward artificial general intelligence (AGI); the data used to train large models is treated as a "rare resource" controlled by internet companies; and commercial AI applications have yet to benefit everyone. Domestic (Chinese) products are not user-friendly, while overseas options are expensive.

Over a longer time frame, today's large AI models are at most equivalent to humans just discovering fire. How to use AI, and where to apply it, will be the direction of progress in 2024. AI companies such as OpenAI and Google have also begun developing their own chips: the "fire-stealers" are intensifying their efforts.

Image source: AI-generated image, licensed through Midjourney.

Internet 'New Continent'

In 2023, the AI field saw many historic firsts, a good number of them brought about by ChatGPT.

ChatGPT gave ordinary people their first glimpse of a computer understanding natural language, so that artificial intelligence was no longer just a magical "secret weapon" in movies. ChatGPT also became the first non-human entry on the "Top 10 Scientific Figures of 2023" list published by Nature.

From its release in November 2022 to now, the attention and impact generated by OpenAI's ChatGPT have surpassed almost every earlier phenomenon in the history of information technology: it reached 1 million users in just 5 days and 100 million users in 2 months, breaking TikTok's previous record. Half a year after launch, the ChatGPT iOS app topped the overall rankings of the Apple App Store soon after its release.

What is even more epoch-making is that ChatGPT broke the tech giants' monopoly on AI technology, bringing a product that can understand human language to every ordinary person.

ChatGPT is the first highly intelligent conversational system many people have ever encountered. It can write copy, solve math problems, understand both astronomy and geography, comprehend stories, and even grasp internet memes. Although it initially tended to "speak nonsense with a straight face" and generate hallucinations, it corrects itself based on human prompts.

The general-purpose large language model behind ChatGPT has evolved at an astonishing pace: in just one year, it was upgraded from GPT-3.5 to GPT-4 Turbo. In early November 2023, OpenAI officially announced GPTs at its Developer Day. ChatGPT Plus users can now build their own customized ChatGPT chatbot on top of GPT-4, using data they possess or find.

OpenAI has amazed the world once again: a trademark application for GPT-5 is already on record. According to the filing shown by the U.S. Patent and Trademark Office, GPT-5 is described as covering functions including natural language processing, text generation, comprehension, speech transcription, translation, prediction, and analysis.

OpenAI submitted the GPT-5 trademark application

OpenAI's official roadmap includes a GPT Store, to launch soon, where users can list their custom AI assistants for others to use for a fee, establishing a new business model. In less than two months, users have created hundreds of thousands of domain-specific ChatGPT assistants, a sign of their widespread appeal.

In 2024, if OpenAI opens the "GPT Store," another wave of application frenzy will sweep across the internet.

Application Boom

OpenAI's success was like Columbus discovering the New World: it proved to everyone that this path was viable. Within a single year, ChatGPT has directly set off an AI arms race among tech companies around the globe.

Data shows that as of October 2023, less than a year after ChatGPT's release, more than 250 Chinese enterprises and academic institutions had developed large AI models with over 1 billion parameters, a figure that excludes foreign-developed models. The AI application market has also exploded: Sensor Tower reports that in the first half of 2023 alone, AI app downloads surged 114% year-over-year to exceed 300 million, surpassing the total for all of 2022. Meanwhile, in-app purchase revenue from AI applications rose 175% year-over-year, approaching $400 million.

In this fierce competition, thousands of large models have evolved powerful multimodal capabilities including text-to-image, image-to-image, text-to-video, and image-to-video generation. Just as people were marveling at ChatGPT's eloquence, models like Bard and Claude that understand internet culture emerged in the blink of an eye.

Additionally, several vertical sectors have produced their own 'unicorns.'

In image generation, Midjourney has taken the lead as the strongest text-to-image tool. In just half a year, Midjourney has progressed to V6, expanding from its initial text-to-image capability to include image-to-image generation and AI-powered image expansion.

Even more astonishing, Midjourney was built by a team of just 11 people, established only two years ago. With its rapid rise in popularity, the team has grown to 40 members and brought in $200 million in revenue in 2023 alone, reaching financial independence early on.

Unlike most startups, Midjourney hasn't taken a penny from venture capitalists. "To put it politely, he doesn't need VCs in his life," Michael Stewart, a partner at Microsoft's venture fund M12, said of Midjourney's founder.

The development speed of AI applications can be described as "daily updates."

Shortly after Midjourney gained popularity, Runway's Gen-2 took over in the video domain, with only a one-month gap from its predecessor Gen-1. The latest version of Gen-2 can generate 18-second videos based solely on a single prompt and skillfully employs cinematic language. Recently, Gen-2 added an image-to-video feature, allowing users to animate specific areas of an image simply by "painting" over them.

AI video is advancing rapidly, and newcomers such as Pika 1.0 and Stable Video Diffusion are quickly catching up with Gen-2. The recently launched Pika 1.0 already rivals Gen-2 in text-to-video capabilities and has introduced AI outpainting for video for the first time.

In 2024, the multimodal potential of large models will continue to be explored by these unicorns, potentially even giving rise to new ones. Text, images, audio, and video—representations of human natural language—will still be meticulously refined by AI tools, which are likely to become more user-friendly and affordable as they scale.

Chip Bottleneck

Tech giants are competing in large AI models, while small companies focus on applications. Amid this competition, Nvidia, the 'water seller,' has reaped enormous profits. Data shows that Nvidia's latest fiscal third-quarter revenue reached $18.1 billion, a 206% year-over-year increase, with net profit soaring to $9.2 billion, up 1,259%. Meanwhile, OpenAI, the 'pacesetter' of this race, brought in only $1.3 billion in revenue in 2023; in 2022, its revenue was just $28 million.

In 2023, Nvidia undoubtedly secured its position at the forefront of the AI boom. Chip prices have surged, yet demand still far outstrips supply. Reports indicate that the delivery cycle for Nvidia's H100 chips ranges from 36 to 52 weeks—such a long wait time clearly cannot meet the rapid development needs of AI products. This may also explain why GPT-5 has been delayed for so long.

Computing power is one of the three key drivers of AI development, alongside data and algorithms, and it is directly affected by chip shortages, which slow the evolution of large models. Companies like OpenAI and Google have started developing their own chips to fill the gaps in their large-model training needs.

Nvidia's competitors are also rushing to claim their share of the market. Intel and AMD have introduced high-performance AI chips, Gaudi 3 and Instinct MI300X, respectively. Microsoft has unveiled its AI acceleration chip, Azure Maia 100, while Amazon has released an upgraded accelerator chip for AI systems, Trainium2.

Meanwhile, U.S. chip sanctions have further exacerbated China's chip shortage. To maintain its market presence there, Nvidia has been forced to release versions of its chips with reduced performance.

As 2023 draws to a close, no chip has emerged, in China or abroad, that can rival Nvidia's H100. Despite chip manufacturers operating at full capacity, the shortage of chip resources is expected to persist in the short term. Perhaps only when the shortage of AI chips is resolved can products like GPT-5, along with more advanced and diverse offerings, arrive more swiftly.

From a safety perspective, this isn't entirely negative. During this brief respite, humanity has an opportunity to make better choices regarding the direction of artificial intelligence development.

It's worth recalling that when GPT-4 was first released, an open letter signed by thousands of tech elites brought AI safety issues into the spotlight. They collectively called for "pausing the training of AI systems more powerful than GPT-4." As AI expert Gary Marcus put it: "Is it worth risking even a 1% chance of human extinction for the pleasure of conversing with machines?"

Indeed, when ChatGPT first launched, its disruptive creativity was truly astonishing. Microsoft co-founder Bill Gates has also emphasized the potential for AI to spiral out of control.

Earlier in 2023, the internet was filled with concerns about 'AI stealing jobs' and 'AI replacing humans,' even giving rise to the meme of 'carbon traitors.' A year on, public attitudes toward AI have returned to a more rational state.

We have not actually seen large-scale unemployment caused by AI, at least not in China, where widespread application of AI in the workplace remains a future prospect. Meanwhile, leading companies such as OpenAI have had to make their AI safety measures keep pace, and AI safety regulatory bodies have been established in various countries.

The advancement of AI is inevitable. How to harness this 'fire' ultimately depends on human choices. Some are building furnaces to control the flames, while others are exploring how these flames can light up uncharted territories.

In 2024, safety will remain one of the key themes in AI development, and chips will be crucial for further enhancing productivity. One thing is certain: AI, like the internet, will become an indispensable tool for humanity in the future.