OpenAI's New Model Development Hits a Snag: Is Sparsity the Key to Reducing Costs for Large Models?
The training and operational costs of large models are extremely high. OpenAI attempted to bring those costs down, but the effort reportedly failed.
At the end of last year, when ChatGPT caused a global sensation, OpenAI engineers began developing a new AI model codenamed Arrakis. Arrakis was designed to enable OpenAI to run chatbots at a lower cost.
However, according to insiders, by mid-2023, OpenAI had canceled the release of Arrakis because the model's operational efficiency did not meet the company's expectations.
This failure means OpenAI has lost valuable time and needs to shift resources to developing different models.
The Arrakis project also carried weight in negotiations between OpenAI and Microsoft over a roughly $10 billion investment and product deal. According to a Microsoft employee familiar with the matter, the failure of Arrakis disappointed some Microsoft executives.
More importantly, the failure of Arrakis suggests that the future development of AI may be fraught with unpredictable pitfalls.
What kind of model was Arrakis?
Insiders revealed that OpenAI intended Arrakis to deliver performance comparable to GPT-4 while running more efficiently. The key technique behind Arrakis was sparsity.
Sparsity, in which only a fraction of a model's parameters do useful work on any given input, is a machine learning concept that other AI developers such as Google have publicly discussed and used. Google executive Jeff Dean once stated: "Sparse computation will be an important trend in the future."
OpenAI began researching sparsity early on, releasing sparse computation kernels as far back as 2017. Arrakis could have allowed OpenAI to deploy its technology more widely, since the company could have run its software on a smaller number of specialized server chips.
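To make the idea concrete, here is a minimal sketch of what sparse computation buys, assuming a toy weight matrix in which roughly 90% of the entries are zero; the matrix sizes and sparsity level are illustrative assumptions, not details of OpenAI's kernels. A sparse representation stores and multiplies only the non-zero weights.

    import numpy as np
    from scipy import sparse

    rng = np.random.default_rng(0)

    # Toy weight matrix with ~90% of entries zeroed out (illustrative assumption).
    dense_weights = rng.normal(size=(1024, 1024))
    dense_weights[rng.random((1024, 1024)) < 0.9] = 0.0

    # Compressed sparse row (CSR) format stores only the non-zero entries.
    sparse_weights = sparse.csr_matrix(dense_weights)
    x = rng.normal(size=(1024, 64))

    # Both products give the same result, but the sparse one only touches
    # the ~10% of weights that are non-zero.
    y_dense = dense_weights @ x
    y_sparse = sparse_weights @ x
    print(np.allclose(y_dense, y_sparse))                        # True
    print(f"stored non-zeros: {sparse_weights.nnz} of {dense_weights.size}")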
Currently, a common method to increase sparsity is through "Mixture of Experts" (MoE) technology. However, UC Berkeley computer science professor Ion Stoica noted: "Generally, the more expert models there are, the sparser and more efficient the model becomes, but this may lead to less accurate results."
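To give a rough sense of how MoE makes a model sparse, below is a hypothetical sketch of top-k expert routing in plain NumPy: a small router scores all experts for a token, but only the top-scoring few actually run. The number of experts, the gating scheme, and the layer sizes here are assumptions for illustration, not details of Arrakis or GPT-4.

    import numpy as np

    rng = np.random.default_rng(0)

    D, H = 64, 128               # hidden size and expert width (illustrative)
    NUM_EXPERTS, TOP_K = 8, 2

    # Each "expert" is a small feed-forward block; only TOP_K of them run per token.
    experts = [(rng.normal(size=(D, H)) * 0.02, rng.normal(size=(H, D)) * 0.02)
               for _ in range(NUM_EXPERTS)]
    router = rng.normal(size=(D, NUM_EXPERTS)) * 0.02   # a learned gate in a real model

    def moe_layer(token):
        logits = token @ router
        top = np.argsort(logits)[-TOP_K:]                # keep only the TOP_K best experts
        gates = np.exp(logits[top]) / np.exp(logits[top]).sum()
        out = np.zeros_like(token)
        for gate, idx in zip(gates, top):
            w_in, w_out = experts[idx]
            out += gate * (np.maximum(token @ w_in, 0.0) @ w_out)   # ReLU expert block
        return out

    token = rng.normal(size=D)
    print(moe_layer(token).shape)    # (64,) -- only 2 of the 8 experts did any work

With more experts and a fixed TOP_K, a smaller fraction of the total parameters is active per token, which is the efficiency Stoica describes, along with the accuracy risk he notes.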
Around spring of this year, OpenAI researchers began training the Arrakis model, which involved using advanced computing hardware to process large amounts of data. Insiders said the company expected training Arrakis to be significantly cheaper than training GPT-4. However, the team soon realized the model's performance wasn't meeting expectations. After about a month of troubleshooting, OpenAI's leadership decided to halt the model's training.
On a positive note, OpenAI can integrate its work on Arrakis into other models, such as the upcoming multimodal large model Gobi.
Two sources indicated that Arrakis underperformed because OpenAI attempted to increase the model's sparsity, meaning only part of the model would be used to generate any given response, thereby reducing operational costs. Why the approach worked in early tests but later performed poorly remains unclear.
It's worth mentioning that, according to insiders, OpenAI once considered releasing Arrakis publicly under the name GPT-4 Turbo.
How Important Is Cost Reduction?
For OpenAI, making its models cheaper and more efficient is a top priority amid rising concerns about technology costs and the proliferation of open-source alternatives.
According to insiders, Microsoft uses OpenAI's GPT models to power AI features in Office 365 applications and other services, and Microsoft had originally expected Arrakis to improve the performance of these features while reducing costs.
At the same time, Microsoft has begun developing its own LLM, which may operate at a lower cost than OpenAI's models.
Although this setback has not slowed OpenAI's business growth this year, competition in the LLM field is intensifying, especially as tech giants like Google and Microsoft accelerate their R&D, and OpenAI could still lose ground in this race.