OpenAI suspects that DeepSeek, a Chinese AI model provider, used OpenAI's data to build its far cheaper models. The accusation follows the launch of DeepSeek's R1 model, which triggered sharp market volatility: Nvidia's stock fell 16.86%, and other AI-related companies also posted substantial losses.
DeepSeek's R1, built on the open-source DeepSeek-V3, reportedly cost far less to train (an estimated $6 million) than its Western counterparts. Although that figure is contested, it fueled investor concerns about the massive AI investments of American tech giants. DeepSeek's app also quickly topped U.S. download charts, drawing further attention to the company.
OpenAI and Microsoft are investigating whether DeepSeek violated OpenAI's terms of service by using OpenAI's API for model distillation, a technique in which a smaller model is trained on the outputs of a larger one. OpenAI said it is aware of attempts by Chinese and other companies to replicate leading U.S. AI technologies in this way and emphasized its commitment to protecting its intellectual property. David Sacks, President Trump's AI advisor, echoed the suspicion that DeepSeek had distilled knowledge from OpenAI's models.
The situation carries an obvious irony: OpenAI, itself accused of using copyrighted internet content to train ChatGPT, is now accusing DeepSeek of a similar practice. The apparent hypocrisy has been widely noted, particularly in light of OpenAI's earlier statement to the UK's House of Lords that training leading AI models without copyrighted material is impossible. OpenAI also faces ongoing lawsuits, including one from the New York Times, alleging unlawful use of copyrighted content. These legal battles highlight the complex and evolving landscape of copyright in the age of generative AI, which also includes a 2018 U.S. Copyright Office ruling that AI-generated art is not eligible for copyright protection.