What Do AI Companies Like OpenAI Want from Media Conglomerates?

baoshi.rao

Last week, Axel Springer (the German media conglomerate that owns Politico and Business Insider) signed a multi-year licensing agreement with OpenAI worth tens of millions of euros.


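According to the company, the agreement will "enrich users' experience with ChatGPT by adding summaries of recent and authoritative content." Axel Springer's reporting will also be used to train OpenAI's models.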

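On the surface, this arrangement looks familiar. Over the past decade or more, social media companies and more traditional media companies have tried dozens of models for partnerships between tech and media, with mixed results. Nor is this the first agreement between OpenAI and a media organization: earlier this year, the Associated Press (AP) partnered with the company. However, the Axel Springer agreement is the most comprehensive to date and may serve as a template for more deals of this kind.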

Past experience may lead people to view such agreements as part of a broader story of AI dominance, but these partnerships should prompt a more nuanced view of the narrative of inevitability surrounding companies like OpenAI. In fact, these collaborations may help explain what large language models (LLMs) are actually useful for, and they lend some credibility to the many lawsuits filed against AI companies over the past year.

In other words, Axel Springer's agreement, like OpenAI's deal with AP or its canceled agreement with Twitter, lets OpenAI train its models to produce content in the modern styles found across Axel Springer's portfolio: straight news in German and English; magazine features; breaking news, investigative reports, and blog posts on various topics; and news on U.S. politics, along with a wealth of internal news. The next GPT model will mimic such content better. If OpenAI or its commercial clients have ambitions to automate news (for example, by feeding AI tools the latest reports for contextualization or supplying stories for summarization), this agreement could make such prospects more feasible. If executives at companies like Axel Springer envision using OpenAI's software to reduce labor costs in the name of boosting productivity, this could be a step in that direction.

For OpenAI, the true value of the deal may lie in ensuring that its future products do not exist solely within a vacuum of their own making.

Some of OpenAI's main competitors, in addition to having vast amounts of training data generated by their users, also have access to real-time, or at least recently updated, information and media about the world and their customers: Google, through its search engine, has indexed most of the web's content, not to mention Gmail, Docs, and YouTube; Meta owns Instagram and Facebook; and Elon Musk's xAI has access to X.

OpenAI's products, like those of its competitors, can access the internet, which means they can retrieve relevant and up-to-date content, including news. However, as AI companies grow more powerful, website owners, including news publishers, are becoming more strategic about how they make their content available. In October, the BBC took measures to block OpenAI from scraping its content, joining the ranks of The New York Times, CNN, and Reuters.

OpenAI's deals with publishers are a hedge against a scenario where web scraping becomes more difficult and legally risky, training materials become more expensive, and real-time data becomes scarcer—a situation where paying ChatGPT users might ask for news, and ChatGPT might not have access to credible, up-to-date sources for linking, summarizing, or otherwise conveying information.

In other words, the Axel Springer deal is essentially a concrete prediction of the challenges OpenAI anticipates facing in the coming years, as well as the opportunities it sees in the news industry.

But this deal also raises another question: If the internet is to be harvested by companies that only reward spam, and companies like Axel Springer are destined to become news agencies for automated news aggregators—if OpenAI aims to capture and automate the profitable aspects of news distribution like previous social platform 'partners,' while leaving the costly, difficult, and risky parts of media production to its partners—shouldn't major media companies demand more?