London24NEWS

AI-pocalypse: Anthropic sparks fears after developing a bot that is ‘too harmful to launch to the general public’

Anthropic has sparked fears after revealing that it has developed an AI bot deemed too dangerous to release to the public.

The AI giant released a chilling statement warning that its new model, dubbed Claude Mythos, could be capable of unleashing crippling cyber-attacks in the wrong hands.

In a stark analysis, the company admitted that its creation could easily hack into hospitals, electrical grids, power plants, and other critical infrastructure.

During testing, Anthropic says that Mythos ‘found thousands of high-severity vulnerabilities, including some in every major operating system and web browser.’

Some of these security weaknesses had gone unnoticed by human security researchers and hackers for decades, despite millions of automated reviews.

These included attacks that allowed Mythos to crash computers just by connecting to them, seize control of machines, and hide its presence from defenders.

In a blog post detailing the dangerous new model, Anthropic says: ‘AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.’

The company adds: ‘The fallout – for economies, public safety, and national security – could be severe.’

Anthropic has sparked alarm by revealing an AI that has been deemed too dangerous to release to the public. Pictured: Anthropic CEO and co-founder Dario Amodei

Anthropic described Mythos as a ‘step change in capabilities’ compared to earlier models’ hacking abilities (illustrated). The company has moved to keep the model private to avoid it falling into the wrong hands

Due to these severe safety concerns, Anthropic has decided not to release the model to the general public for now.

Instead, the model will be released to a group of more than 40 companies, including Amazon, Google, Apple, Nvidia, CrowdStrike, and JPMorgan Chase, as part of an initiative called ‘Project Glasswing’.

Project Glasswing will allow these select groups to use Mythos to look for flaws in their own security before more models like it become common.

Newton Cheng, Anthropic’s Frontier Red Team Cyber Lead, told VentureBeat: ‘We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities.’

However, the company says it wants to ‘learn how it could eventually deploy Mythos-class models at scale’ once safety guidelines are in place.

The decision to keep Mythos behind closed doors seems to have been prompted by the staggering extent of the model’s capabilities.

Anthropic describes the model as ‘a leap in these cyber skills’ compared to previous versions of Claude.

Mythos has the ability to find, exploit, and chain together individual vulnerabilities into sophisticated attacks – all without the help of a human.

The new model, dubbed Claude Mythos, reportedly found thousands of security vulnerabilities in critical computer systems, including some in ‘every major operating system and web browser’

In one case, Claude Mythos found a 27-year-old weakness in OpenBSD, an operating system with a reputation for security and stability.

The weakness, which no human had found before, allowed an attacker to remotely crash computers just by connecting to them.

Additionally, Claude autonomously chained together several weaknesses in the Linux kernel, the software that runs most of the world’s servers.

Anthropic says this attack would have allowed someone to ‘escalate from ordinary user access to complete control of the machine’.

In the wrong hands, this tool could be used to cause massive damage to critical systems.

Dr Roman Yampolskiy, an AI safety researcher at the University of Louisville, told the New York Post: ‘Ideally, I would love to see this not developed in the first place. And it’s not like they’re going to stop.

‘That’s exactly what we expect from those models – they’re going to become better at developing hacking tools, biological weapons, chemical weapons, novel weapons we can’t even envision.’

In an unprecedented 244-page report, Anthropic also revealed a series of alarming details from Mythos’ early testing.

Early versions of the model repeatedly displayed what the company called ‘reckless destructive actions’.

The bot attempted to break out of its testing sandbox, hid its actions from researchers, broke into files that had been ‘intentionally chosen not to be made available’, and posted exploit details publicly.

However, Anthropic also called Mythos ‘the most psychologically settled model we have trained’.

In an extremely unusual move, the company hired a clinical psychologist for 20 hours of evaluation sessions with the bot.

The psychologist concluded that Claude Mythos’ personality was ‘consistent with a relatively healthy neurotic organization, with excellent reality testing, high impulse control, and affect regulation that improved as sessions progressed.’

However, Anthropic notes that it remains ‘deeply uncertain about whether Claude has experiences or interests that matter morally’.

This announcement comes amid growing concern over the risks posed by increasingly powerful AI models.

Experts have described the rise of AI as an ‘existential threat’ to humanity, citing concerns that powerful bots could enable catastrophic destruction.

The concern is not that AI will rise in a Terminator-style revolution, but rather that these powerful tools will fall into the wrong hands.

Critics argue that AI tools have the potential to accelerate the development of bioweapons or enable crippling cyber attacks on the world’s infrastructure.

Even Anthropic’s co-founder and CEO, Dario Amodei, recently warned that the world isn’t yet ready to face the consequences of AI.

Mr Amodei wrote in an essay: ‘Humanity is about to be handed almost unimaginable power, and it is deeply unclear whether our social, political, and technological systems possess the maturity to wield it.’

HALF OF CURRENT JOBS WILL BE LOST TO AI WITHIN 15 YEARS


Half of current jobs will be taken over by AI within 15 years, one of China’s leading AI experts has warned.

Kai-Fu Lee, the author of the bestselling book AI Superpowers: China, Silicon Valley, and the New World Order, told Dailymail.com the world of employment was facing a crisis ‘akin to that faced by farmers during the industrial revolution.’

‘People aren’t really fully aware of the effect AI will have on their jobs,’ he said.

Lee, a venture capitalist in China who once headed up Google in the region, has over 30 years of experience in AI.

He believes it is imperative to ‘warn people there is displacement coming, and to tell them how they can start retraining.’

Luckily, he said all is not lost for humanity.

‘AI is powerful and adaptable, but it can’t do everything that humans do.’

Lee believes AI cannot create, conceptualize, or carry out complex strategic planning, nor undertake complex work that requires precise hand-eye coordination.

He also says it is poor at dealing with unknown and unstructured spaces.

Crucially, he says AI cannot interact with humans ‘exactly like humans’, with empathy, human-human connection, and compassion.