Staying true to its branding as an enterprise- and security-first AI vendor, Anthropic has accused three Chinese vendors – DeepSeek, MiniMax and Moonshot AI – of distilling its Claude models to improve their own. The AI vendor claims the Chinese vendors' actions pose a national security threat and could lead to dangerous capabilities in new models, because safety guardrails are stripped out in the process.
Anthropic alleged on February 23 that the Chinese vendors used distillation to generate more than 16 million exchanges with Claude from 24,000 fraudulent accounts. Distillation is a method in which a smaller student model is trained to mimic the outputs of a larger teacher model.
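To make the technique concrete, here is a minimal, hypothetical sketch of distillation in PyTorch: a small "student" network is trained to match the softened output distribution of a frozen "teacher" on the same inputs. The toy linear models, random inputs and hyperparameters are illustrative stand-ins, not anything from the systems named in Anthropic's allegations.

```python
import torch
import torch.nn.functional as F

# Illustrative stand-ins: a frozen "teacher" and a trainable "student".
teacher = torch.nn.Linear(128, 10)
student = torch.nn.Linear(128, 10)
teacher.eval()

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's distribution

for step in range(100):
    x = torch.randn(32, 128)  # stand-in for real prompts/inputs
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)
    # Core distillation loss: KL divergence between softened distributions.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The same idea scales up when the "teacher" is a commercial API: the student trains on the teacher's responses rather than on its internal weights, which is why large volumes of API exchanges are central to the allegation.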
According to Anthropic, DeepSeek targeted reasoning capabilities across a variety of tasks, using Claude to generate chain-of-thought data at scale and to create censorship-safe alternatives to politically sensitive questions. Moonshot AI targeted computer use, agentic reasoning, tool use and agent development. MiniMax focused on agentic coding, tool use and orchestration in its exchanges, Anthropic said.
Anthropic’s accusation is the latest episode in the geopolitical tension and competition between Chinese and American technology vendors. In January 2025, OpenAI first raised concerns that DeepSeek had used its models to train the DeepSeek-R1 series. The ChatGPT creator sent a memorandum to the US House Select Committee on China on February 12, accusing DeepSeek of using distillation methods to bypass its security safeguards.
Dangers of distillation
While distillation is nothing new – many companies test and probe competing models – the scale of the alleged distillation of Anthropic’s models by the Chinese vendors is a concern for enterprises, given the security implications. It could also become a political issue, as the Chinese vendors reportedly took advantage of work done by US vendors to leap to the next level.
“What’s alleged here is industrial-scale extraction — millions of API calls across thousands of coordinated accounts — that looks less like evaluation and more like systematic replication,” said Kashyap Kompella, CEO and founder of RPA2AI Research.
He said there is some irony in Anthropic’s claim, given that American AI vendors have themselves been accused of sourcing training data unethically by not paying for or obtaining permission to use copyrighted material. Still, the level of extraction that Anthropic is alleging is problematic.
A primary reason, Kompella said, is that if a competitor can extract capabilities at such a large scale, it undermines the justification for investing in R&D.
“Frontier AI models cost billions in compute, talent and infrastructure,” Kompella said. “If a competitor can undercut that investment by systematically extracting capabilities, the economics of innovation and VC investment could collapse.”
Also, as Anthropic noted, safety guardrails are typically stripped out during distillation. Unless additional safety controls are added later to prevent malicious use, distilled models generally carry greater risk, Kompella continued.
The takeaway, he said, is that vendors should treat model security as a core product feature.
The extraction also carries the risk that enterprise data will be misused and handed over to the Chinese government, said Lian Jye Su, an analyst at Omdia, a division of Informa TechTarget.
“If these types of activities go unchecked, it creates a backdoor where training data extracted from users could contain sensitive company data that could be accessed by the Chinese government and Chinese officials, or resold to other parties with malicious intent,” Su said.
Anthropic’s next step is to put more emphasis on mechanisms that better protect its models, since distillation is likely to continue, Su said.
“There will always be some type of malicious activity taking place, perhaps not by the model vendor, perhaps by the developer community in China,” he said.
Meanwhile, Anthropic’s claims show how the vendor continues to align itself with US domestic interests. In recent months, the AI vendor has supported US chip export restrictions. On its blog, Anthropic said the distillation attacks show that “restricted chip access limits both direct model training and the scale of illicit distillation.”
Chinese AI market
While there is nothing wrong with Anthropic’s US focus and the benefit it brings to its home market – Chinese vendors are similarly focused on the Chinese AI market – it does not follow that the Chinese vendors succeeded simply by copying US vendors, Su said.
“Yes, there may be some stuff borrowed from Anthropic, but then I think they also have their own innovations that they bring to the table as well,” he said, adding that each vendor has built its own developer community.
For example, Moonshot’s Kimi K2 is a trillion-parameter model. To train a model that large, Moonshot introduced MuonClip, an optimizer known for its token efficiency.
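For context, below is a minimal sketch of the orthogonalized-momentum update at the heart of the Muon optimizer family, on which MuonClip builds; as publicly described, MuonClip adds attention-logit clipping on top of this step, which is omitted here. The coefficients follow the commonly cited open-source Muon reference implementation; the tensors and learning rate are illustrative.

```python
import torch

def newton_schulz_orthogonalize(m, steps=5):
    """Approximately orthogonalize a matrix via Newton-Schulz iteration,
    the core step of Muon-style optimizers. Coefficients are from the
    widely circulated Muon reference code."""
    a, b, c = 3.4445, -4.7750, 2.0315
    x = m / (m.norm() + 1e-7)          # normalize so the iteration converges
    transposed = x.size(0) > x.size(1)  # iterate on the wide orientation
    if transposed:
        x = x.T
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * s @ s) @ x  # quintic Newton-Schulz step
    return x.T if transposed else x

# Hypothetical usage: apply the orthogonalized momentum as the weight update.
weight = torch.randn(256, 128)
momentum = torch.randn(256, 128)  # running gradient momentum for this weight
update = newton_schulz_orthogonalize(momentum)
weight -= 0.02 * update  # learning rate of 0.02 is illustrative
```

Roughly speaking, orthogonalizing the update equalizes its scale across directions, which is one reason Muon-style optimizers are credited with getting more learning out of each training token.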
These types of innovations were developed independently of American vendors, Su said.
“They are not just pure copycats,” he said. “You can’t do this kind of work without strong AI. You need a very strong team of AI engineers.”
Moreover, AI competition within China remains fierce, Su said. These vendors compete not only with US vendors such as OpenAI and Anthropic, but also with Alibaba, Baidu and other Chinese vendors. It is also likely that these vendors are not alone in distilling Anthropic’s and OpenAI’s models, Su said; rather, they are set apart by their ability to build something new on top of what they have learned.
However, the ability to innovate becomes less impressive when a vendor is labeled a copycat and plagiarist.
“It creates reputational risks,” Kompella said. While the Chinese vendors have genuine technological advances, he said, Anthropic’s allegations complicate their brands. “The reputational challenge is that it becomes difficult to separate even strong, independent innovation from accusations of appropriation.”
For enterprises watching this contest play out, the main takeaway is to scrutinize the safety guardrails built into the models they use.
“Training data lineage is increasingly becoming a board-level issue,” Kompella said. “If there is uncertainty about how a model was trained – whether it relied on unauthorized access, breached terms or bypassed controls – then it becomes a red flag for buyers.”
