Authors sue Claude AI chatbot creator Anthropic for copyright infringement

A group of authors is suing artificial intelligence startup Anthropic, alleging that the company committed “large-scale theft” by training its popular chatbot Claude on pirated copies of copyrighted books.

While similar lawsuits have piled up against competitor OpenAI, the maker of ChatGPT, for more than a year, this is the first time that writers have taken aim at Anthropic and its Claude chatbot.

The smaller San Francisco-based company, founded by former OpenAI leaders, bills itself as the more responsible, safety-focused developer of generative AI models that can compose emails, summarize documents and communicate naturally with people.

But the lawsuit filed Monday in a federal court in San Francisco alleges that Anthropic’s actions “made a mockery of its lofty goals” by using repositories of pirated text to build its AI product.

“It is no exaggeration to say that Anthropic’s model seeks to profit by dissecting the human expression and ingenuity behind each of those works,” the lawsuit says.

Anthropic did not immediately respond to a request for comment on Monday.

The lawsuit was filed by three writers – Andrea Bartz, Charles Graeber and Kirk Wallace Johnson – who seek to represent a group of fiction and nonfiction writers in a similar situation.

While this is the first case brought against Anthropic by authors, the company is also fending off a lawsuit from major music publishers, who allege that Claude regurgitates lyrics from copyrighted songs.

The authors’ case follows a growing number of lawsuits filed in San Francisco and New York against developers of large AI language models.

OpenAI and its business partner Microsoft are already fighting a series of copyright infringement lawsuits brought by notable names including John Grisham, Jodi Picoult and Game of Thrones novelist George R.R. Martin, as well as another series of lawsuits from media outlets such as The New York Times, Chicago Tribune and Mother Jones.

What ties the cases together is the claim that tech companies ingested vast amounts of human writing to train AI chatbots to produce human-like text passages, without obtaining permission from or compensating the people who wrote the original works. The legal challenges are coming not just from writers but also from visual artists, music labels and other creators who allege that generative AI profits have been built on misappropriation.

Anthropic and other technology companies have argued that training AI models is protected by the “fair use” doctrine of U.S. copyright law, which allows limited uses of copyrighted materials for purposes such as teaching, research or transforming the copyrighted work into something different.

But the lawsuit against Anthropic accuses the company of using a dataset called The Pile that included a trove of pirated books. It also disputes the idea that AI systems learn the way humans do.

“People who learn from books either purchase legal copies of them, or borrow them from libraries that buy them, thereby providing at least some measure of compensation to authors and creators,” the lawsuit says.

———

The Associated Press and OpenAI have a license and technology agreement which gives OpenAI access to part of AP’s text archives.