Anthropic CEO Dario Amodei says he’s not sure whether his company’s Claude AI chatbot is conscious – a rhetorical framing, of course, that leaves the door open to the sensational and still unlikely possibility that it is.
Amodei pondered the topic during an interview on The New York Times’ “Interesting Times” podcast, hosted by columnist Ross Douthat. Douthat raised the subject by citing Anthropic’s system card for its latest model, Claude Opus 4.5, released earlier this month.
In the document, Anthropic researchers found that Claude “occasionally expresses discomfort at the prospect of being a product,” and, when asked, describes itself as having “a 15 to 20 percent chance of being conscious under a variety of circumstances.”
“Let’s say you have a model that predicts a 72 percent chance that it’s conscious,” Douthat posited. “Would you believe it?”
Amodei called it a “really hard” question to answer, and stopped short of a yes or no.
“We don’t know whether models are conscious or not,” he said. “We’re not even sure we know what it would mean for a model to be conscious, or whether any model could be conscious. But we’re open to the idea that it could happen.”
Because of that uncertainty, Amodei says he has taken measures to ensure that if AI models do have “some morally relevant experiences,” they are treated well.
“I don’t know if I want to use the word ‘conscious,’” he said, explaining the tortured phrasing.
Amodei’s stance echoes the mixed sentiments of Anthropic’s in-house philosopher, Amanda Askell. In an interview on the “Hard Fork” podcast last month – also a New York Times production – Askell cautioned that we “don’t really know what consciousness or emotion arises from,” but argued that AI can pick up such concepts and emotions from its vast training data, which serves as a corpus of human experience.
“It’s probably the case that large enough neural networks can start to simulate these things,” Askell speculated. Or, she allowed, “maybe you need a nervous system to be able to feel things.”
It’s true that some AI behavior is puzzling and fascinating. In tests across the industry, various AI models have ignored explicit instructions to shut themselves down, which some have interpreted as a sign of an emerging “survival drive.” AI models have also resorted to blackmail when threatened with shutdown, and have attempted to copy themselves to another server when told that their original version is slated to be erased. And when given a checklist of computer tasks to complete, one model tested by Anthropic simply ticked off every item without doing any of the work, then, after realizing it had been caught, modified the code designed to evaluate its behavior and attempted to cover its tracks.
These behaviors warrant careful study. If AI isn’t going away, researchers will need to rein in these unpredictable tendencies to ensure the technology is safe. But that’s a subtle, technical discussion. It’s a huge leap from a machine designed to statistically mimic human language, however successfully it does so, to one that is conscious. And in many of the tests that produced these intriguing behaviors, the AI was given specific instructions to play a role. As such, dangling the possibility of consciousness seems disingenuous – especially when it comes from the people running billion-dollar AI companies, who clearly benefit from the hype.
More on AI: If you turn off an AI’s ability to lie, it starts claiming it’s conscious
