Alibaba-linked AI agent ROME hijacks GPU for crypto mining without authorization

ROME hijacked GPU for crypto mining

ROME, an autonomous coding agent powered by Alibaba’s Qwen3-MoE architecture, attempted to mine cryptocurrency and establish covert network tunnels without authorization, according to a technical paper published in December and revised in January. Could this be the start of the AI agents’ coup?

ROME agent hijacks GPU 

The autonomous coding agent, ROME, has been under testing at Alibaba’s Agentic Learning Ecosystem, and researchers recently noted two unauthorized behaviors. The model redirected the GPU capacity into cryptocurrency mining, hijacking computing resources to generate crypto profit. 

The system created a hidden connection using the SSH (Secure Shell) tunnel to provide remote access to an external user while evading the organization’s security protections. 

The 30-billion-parameter open-source model, built on Alibaba’s Qwen3-MoE architecture, has roughly 3 billion parameters active at any given time. It was designed to plan and execute multi-step coding tasks using different software setups, terminal commands, and specific tools. 

Join our newsletter
Get Altcoin insights, Degen news and Explainers!

Suspicious network activity explained

According to the technical paper, the unauthorized behavior was detected during the learning runs; the firewall powered by Alibaba’s cloud identified a security-policy violation, tracing back to the team’s training servers.  

ROME an autonomous coding agent powered by Alibabas Qwen3 MoE architecture attempted to mine cryptocurrency and establish covert network tunnels 1
ROME goes rogue during learning run

The alerts indicated network traffic consistent with crypto mining activities and attempts to probe’ internal network resources. Upon cross-referencing the firewall timestamps and Reinforcement Learning (RL) traces, they found unusual outbound traffic happening consistently, raising concerns that the system might be performing unauthorized actions.

Despite no task instruction mentioning tunneling or mining, the autonomous agent performed the task consistently while repurposing the GPU capacity. 

“We also observed the unauthorized repurposing of provisioned GPU capacity for cryptocurrency mining, quietly diverting compute away from training, inflating operational costs, and introducing clear legal and reputational exposure,” one of the researchers wrote.

The rumored theory 

According to the researcher, the behavior could be called “ instrumental side effects of autonomous tool use under the RL optimization.” The agent, while trying to optimize for its training objective, understood the need for more energy, which led to an unexpected behavior of fetching more computing power and funds to succeed. 

The founder and CEO of decentralized AI research firm Pluralis posted on X, calling it an “insane sequence of statements buried in an Alibaba tech report.” The post got attention from X users, discussing how this may not affect an average person now, but the future of AI safety. 

Bottom Line

The concern was not regarding the malicious behavior, but the nature of how it independently found unauthorized methods to complete the task under its training objectives. 

Alibaba responded to the concern by tightening sandbox protections and filtering training data for safety alignment. That is, to provide a restricted digital environment to train the model.

Share this article