Anthropic rejects jailbreak claims on Claude Fable 5 AI model


ANTHROPIC has rejected claims of a prompt-based jailbreak affecting its Claude Fable 5 AI model, pointing to its advanced safety systems and extensive testing. The model launched with cybersecurity safeguards and falls back to a less capable version in sensitive fields to prevent misuse.

The early allegations came from an individual known as Pliny the Liberator, who claimed to have circumvented the protections. Anthropic stated that the outputs shared were either not from Fable 5 or lacked significant harmful potential. The company's defenses include independent classifier systems that are not easily bypassed, and a review of usage found no evidence that the safety measures had been effectively compromised.

Article by CyberSIXT