Research shows companies are training AI models with YouTube content without permission

Artificial intelligence models need as much useful data as possible to function, but some of the biggest AI developers are relying in part on transcribed YouTube videos without the creators’ permission, in violation of YouTube’s own rules, according to an investigation by Proof News and Wired.

The two outlets reported that Apple, Nvidia, Anthropic and other major AI companies trained their models on a dataset called YouTube Subtitles, which includes transcripts of nearly 175,000 videos from 48,000 channels, all without the knowledge of the video creators.