This is an archive article published on December 29, 2023

Why The New York Times is suing OpenAI and Microsoft, what it could mean for AI and copyright

The New York Times (NYT) has become the first major news publisher to sue OpenAI and Microsoft, the creators of ChatGPT and other popular artificial intelligence (AI) platforms, citing “unlawful” use of copyrighted content.



The lawsuit says the defendants largely scrape the NYT’s original content to build their models and manufacture responses. “Defendants seek to free-ride on The Times’s massive investment in its journalism,” the complaint says, accusing OpenAI and Microsoft of using content “without payment to create products that substitute for The Times and steal audiences away from it”. Microsoft has a sizable investment in OpenAI.

This is a battle that could frame the legal contours around intellectual property (IP) rights in the age of generative AI platforms. It is also symbolic of the larger debate on how generative AI platforms could affect people in the creative industries, given that such systems are built on the back of work done by creators of original content, which is then synthesised through an algorithm and presented as fresh information by the AI systems.


Earlier this year, two US authors also sued OpenAI, claiming in a proposed class action that the company misused their works to “train” ChatGPT.

What is NYT’s main contention against OpenAI and Microsoft?

The lawsuit, filed in the Federal District Court in Manhattan, contends that millions of articles published by the newspaper were used to train automated chatbots that now compete with it as a source of reliable information.

NYT has reported that it approached Microsoft and OpenAI in April to raise concerns about the use of its intellectual property and explore “an amicable resolution,” possibly involving a commercial agreement and “technological guardrails” around generative AI products. But the talks did not produce a resolution, the publication said.

The publication also alleges that OpenAI and Microsoft’s large language models, which power ChatGPT and Copilot, “can generate output that recites Times content verbatim, closely summarises it, and mimics its expressive style.” This “undermine[s] and damage[s]” The Times’s relationship with readers, while also depriving it of “subscription, licensing, advertising, and affiliate revenue.”


The “unlawful use” of the paper’s “copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides, and more” to create artificial intelligence products “threatens The Times’s ability to provide that service”, the lawsuit says.

The lawsuit highlights the potential damage to The Times’s brand through AI “hallucination”, a phenomenon in which chatbots respond with false information that is then wrongly attributed to a source.

In August, NYT blocked OpenAI’s web crawler, preventing the company from using content from the publication to train its AI models.

AI and IP rights

Generative AI platforms such as ChatGPT and Google’s Bard have ignited a debate on IP rights over original content on the internet.


The responses that AI platforms such as ChatGPT and Bard generate rest on the bedrock of millions of pieces of textual content that creators, including news publishers, have uploaded online.

The music business, too, is pushing back on the use of AI in the industry. Universal Music Group, for instance, has asked streaming services such as Spotify to stop developers from scraping its material to train AI bots in making new songs.

The debate is gaining traction at a time when countries around the world, including India, have archaic copyright laws that need to be reimagined with the AI wave in mind. For instance, in India, creative works are regulated under the Copyright Act of 1957.

Under the Act, the definition of an “author” includes, in relation to any literary, dramatic, musical or artistic work which is computer-generated, the person who causes the work to be created. But that definition does not take into account that AI systems do not generate information on their own. They are, simply, only as good as the base dataset on which they are trained. And the base dataset is made up of copyrighted work produced by other authors.

Soumyarendra Barik is Special Correspondent with The Indian Express and reports on the intersection of technology, policy and society. With over five years of newsroom experience, he has reported on issues of gig workers’ rights, privacy, India’s prevalent digital divide and a range of other policy interventions that impact big tech companies. He once also tailed a food delivery worker for over 12 hours to quantify the amount of money they make, and the pain they go through while doing so. In his free time, he likes to nerd about watches, Formula 1 and football.
