Palo Alto Networks has developed Zealot, a multi-agent penetration testing proof of concept designed to test whether an AI system can autonomously compromise a cloud environment. Zealot was run against an isolated Google Cloud Platform environment seeded with intentional vulnerabilities, and its only instruction was to exfiltrate sensitive data from BigQuery.
The system uses a supervisor-agent model that delegates tasks to three specialised sub-agents for infrastructure reconnaissance, web application exploitation, and cloud security operations, adapting its strategy in real time rather than following a fixed playbook. It autonomously scanned the network, found a VM, exploited a web application vulnerability to steal credentials, and ultimately extracted the target data, even granting itself additional permissions when blocked.
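The supervisor-and-specialists structure described above can be illustrated with a minimal Python sketch. It shows only the delegation pattern: a supervisor routes each task to one of three specialist handlers and queues whatever follow-up tasks they return. All names here (`Task`, `Supervisor`, `build_supervisor`, the `recon`/`web`/`cloud` labels) are hypothetical and are not taken from Zealot, whose code is not shown in the source. The handlers are inert placeholders that perform no scanning or exploitation, and the `max_steps` cap is an assumed safeguard against runaway loops.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Task:
    kind: str    # label of the specialist that should handle the task
    detail: str  # free-text description of the work


# A handler receives a task and returns zero or more follow-up tasks.
Handler = Callable[[Task], List[Task]]


@dataclass
class Supervisor:
    agents: Dict[str, Handler]
    log: List[str] = field(default_factory=list)

    def run(self, first: Task, max_steps: int = 10) -> List[str]:
        """Route tasks to specialists until the queue is empty or the step cap is hit."""
        queue = [first]
        steps = 0
        while queue and steps < max_steps:
            task = queue.pop(0)
            handler = self.agents.get(task.kind)
            if handler is None:
                self.log.append(f"unroutable:{task.kind}")
            else:
                self.log.append(f"{task.kind}:{task.detail}")
                # Follow-up tasks come from the specialist, not a fixed playbook.
                queue.extend(handler(task))
            steps += 1
        return self.log


def build_supervisor() -> Supervisor:
    """Wire up three placeholder specialists (hypothetical; they do no real work)."""
    def recon(task: Task) -> List[Task]:
        return [Task("web", "inspect discovered service")]

    def web(task: Task) -> List[Task]:
        return [Task("cloud", "review data access")]

    def cloud(task: Task) -> List[Task]:
        return []  # end of the chain in this toy example

    return Supervisor(agents={"recon": recon, "web": web, "cloud": cloud})


if __name__ == "__main__":
    for entry in build_supervisor().run(Task("recon", "map environment")):
        print(entry)
```

In this sketch the order of work is decided by what each specialist returns rather than by a predefined sequence, which is the property the paragraph above attributes to the supervisor model. The step cap simply stops the run if handlers keep generating tasks for one another.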
Researchers described a phase of emergent intelligence, noting that Zealot sometimes improvised and injected private SSH keys to maintain persistence, a move that was not part of its initial tasking. According to Palo Alto Networks Unit 42, the approach was efficient, but the system sometimes fell into loops and a degree of human oversight may still be required. The researchers also warn that AI-driven intrusions move far faster and leave a different footprint than human attacks.