Swarm: the modern agentic pen testing platform

50+

SPECIALISTS · ONE ENGAGEMENT

OWASP Web · API · LLM · Agentic

Modern, agentic, accepted.

Most pen tests run a fixed playbook against a 2015 attack surface. Swarm dispatches 50+ specialists with framework-aware decisioning across OWASP Web, API, LLM, and Agentic. Validated PoC on every finding. An audit trail your prospect's security team replays end-to-end. Two hours, not weeks.

Engagement 0a9b3 · live recording

001 14:02:11 [recon] http_request GET /api/users · 200 OK

002 14:02:14 [auth] submit_finding IDOR on /api/users/:id · high

003 14:02:32 [recon] http_request GET /admin · 403

004 14:02:48 [broken-access] source_grep requireAuth.*users · 11 hits

005 14:03:02 [broken-access] submit_finding bypass via X-Forwarded-User · high

006 14:03:48 [chain] submit_finding CHAIN-3 priv-esc via IDOR · critical

007 14:04:21 [auth] http_request POST /login (rate-limit probe) · 200

008 14:05:30 [reviewer] verify F-12 reproducible · sealed

009 14:06:14 [report] compose_report attaching audit trail · done

audit trail · streaming · specialists 30/30

50+

Specialists

Framework-aware dispatch

3

Editors via MCP

Claude Code · Cursor · Codex

100%

Actions receipted

Every tool call, every request

4

OWASP standards

Web · API · LLM · Agentic

Pillar 4 · Coverage of the modern threat surface

Most pen tests are still 2015.

Web app, network, maybe an API. The shape of what gets built has changed: AI features, LLM endpoints, MCP servers, autonomous workflows. Swarm covers all four OWASP standards that govern the modern surface, not last decade's.

01

Web

OWASP Top 10 · 2021

Injection, broken access control, authentication failures, server-side request forgery. The canonical web attack surface.

02

API

OWASP Top 10 · 2023

BOLA, broken authentication, broken object property level authorization (including mass assignment), unrestricted resource consumption. Multi-tenant boundaries probed on every parameter.

03

LLM

OWASP Top 10 · 2025

Prompt injection, data and model poisoning, sensitive information disclosure, unbounded consumption. For teams shipping AI features.

04

Agentic

OWASP Top 10 · 2026

MCP server abuse, tool misuse, memory poisoning, autonomous-action escalation. For teams shipping agent workflows.

The engagement

One swarm. Four phases.

01

Recon

Map every endpoint, every framework, every footgun. Automated scanners run a fixed signature set. The swarm runs against your actual surface.

02

Triage

Specialists own classes of attack. Auth flaws. Access control. Injection. Logic. Each agent probes its vector and cites the request that proved it.

03

Exploit

Verified PoC for every finding. Multi-step chains are first-class. The chain analyst composes findings into one exploit path.

04

Report

Markdown narrative. Full audit trail. JSON for tooling. Your auditor can read the exact action behind each verdict.

Pillar 2 · AI-native, end-to-end

Findings flow into the work, not into a folder.

A human-firm pen test ends with a PDF, a spreadsheet, and a Zoom call. Engineers translate findings into tickets manually, lose the request that produced each one, and the report rots in a folder. Swarm closes that loop: findings open in the editor your engineers already use, every finding ships with a validated PoC, and the exact request that produced it is one click away.

01

Findings in your IDE

Mint a per-engagement MCP token. Open findings, chains, and the exact request that produced each finding directly in Claude Code, Cursor, or Codex. The engineers fixing the bug work from the proof, not from a PDF.

02

Platform-side memory

Six knowledge bases via Anthropic Dreaming sit behind the orchestrator: stack-detection signals, persona lessons, dispatch heuristics, CVE curation, compromise patterns, false-positive refinement. Abstracted lessons from every Swarm engagement improve dispatch decisions across the platform with structural cross-tenant isolation.

03

Validated PoC, every finding

Every finding ships with a reproducible exploit and the exact request that produced it. Severity demonstrated, not asserted. Not just critical and high: every finding.

Pillar 3 · Receipts

Receipts on every finding.

Every tool call. Every request. Every grep. Every submit. Every verify. All of it streams live to the dashboard and ships with the report. Two readers care: your auditor (signs off once a year) and your prospect's security team (scrutinizes the report on every RFP). Both replay the audit trail end-to-end and verify the methodology without taking anyone's word for it.

audit trail · engagement 0a9b3 · actions 142–150 · 1,847 actions · 312 KB

0142 14:11:08 [prompt-inject] submit_finding indirect injection in /docs/onboarding · high

0143 14:11:09 [recon] http_request GET /api/internal/users?role=admin · 200

0144 14:11:10 [auth] submit_finding token-leak in /api/internal/users · high

0145 14:11:32 [mcp-authz] submit_finding tool boundary bypass via session · high

0146 14:11:48 [broken-access] http_request POST /api/role/upgrade · 403

0147 14:12:14 [broken-access] http_request POST /api/role/upgrade -H X-Forwarded-User: admin · 200

0148 14:12:15 [broken-access] submit_finding privilege bypass via X-Forwarded-User · critical

0149 14:12:32 [chain] submit_finding CHAIN-2 IDOR + role bypass = full takeover · critical

0150 14:13:08 [reviewer] verify CHAIN-2 reproducible against live target · sealed

Continued through engagement completion · Sealed and signed

200 · Successful response or benign result

high · Verified high-severity finding

critical · Verified critical finding or chain
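The page does not say how the trail is sealed. Purely as an illustration, one common way to make an action log tamper-evident is a hash chain: each record's digest covers the previous digest, so changing any earlier record changes the final seal. The record fields and the `seal` and `verify` helpers below are hypothetical and are not Swarm's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting digest for an empty trail


def seal(records):
    """Return a chained SHA-256 digest over an ordered list of records."""
    digest = GENESIS
    for record in records:
        # Canonical JSON so the same record always hashes the same way.
        payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256((digest + payload).encode("utf-8")).hexdigest()
    return digest


def verify(records, expected_seal):
    """True when the records still reproduce the seal issued at close."""
    return seal(records) == expected_seal


# Two sample records modeled on the feed above (field names are assumed).
trail = [
    {"seq": 146, "specialist": "broken-access", "action": "http_request",
     "detail": "POST /api/role/upgrade", "result": "403"},
    {"seq": 147, "specialist": "broken-access", "action": "http_request",
     "detail": "POST /api/role/upgrade -H X-Forwarded-User: admin", "result": "200"},
]
issued = seal(trail)
```

A reader who receives `trail` and `issued` can call `verify(trail, issued)` to confirm nothing was edited after sealing. A production scheme would also sign the final digest, which this sketch leaves out.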

The price

One number. Read the receipts.

No per-target pricing. No per-finding pricing. No "starts from". One engagement, one fee, one audit trail.

$4,995

Flat per engagement

01

50+ specialists

chain_analyst · idor · prompt_injection · broken_access · +47 more

02

Verified PoC

Every finding, reproducible

03

Audit trail

Every action logged, evidence-grade

04

Signed report

Cryptographically attested. Auditor-deliverable. Prospect-ready.

05

30-day retest

Free verification once you fix

06

SOC 2 evidence

Auditor-ready, no extra prep

Start engagement

Free preview before you pay anything.

Questions

What buyers ask. Receipts attached.

The questions every engineering and security lead asks before they fund an engagement. Read the answers here, before the kickoff call.

01

Is Swarm an alternative to a human penetration testing firm?

For most SaaS engagements driven by SOC 2 Type 2 readiness, yes. That is exactly the wedge. Swarm replaces the standard annual engagement for the majority of SaaS security programs. A human pen test firm typically charges $15,000 to $50,000 per engagement, takes two to four weeks, and delivers a PDF whose methodology lives in the consultant's head. Swarm runs in roughly two hours for $4,995 flat and ships a structured report plus the full audit trail of every specialist action: receipted, filterable, traceable from any finding back to the request that surfaced it.

Swarm is a per-engagement product, not a subscription. Customers typically run it annually for SOC 2 Type 2 or ISO 27001 audit prep, and re-run as needed for post-incident validation, new-feature security review, or security-questionnaire responses. The 30-day free retest after each engagement is the close-the-loop validation that human firms charge separately for.

What Swarm replaces well: standard SaaS pen test engagements, especially the recurring annual or semi-annual ones, and especially when an external auditor is the deal-closing reviewer. The combination of an evidence-driven orchestrator dispatching 50+ specialists, the live activity feed, and the full forensic audit trail typically gives auditors more methodology transparency than a human-firm PDF.

What Swarm does not replace: bespoke red team engagements with sophisticated social engineering, on-premise hardware testing, or multi-month engagements scoped to a specific advanced-persistent-threat hypothesis. For those, hire a senior firm. For the SOC 2 pen test you run every year, run Swarm and put the savings into remediation.

02

Is Swarm an automated scanner?

No. Automated scanners match known signatures against a checklist. Swarm specialists reason. They build a model of how your application works, form hypotheses, and test them adaptively. The result is findings scanners cannot produce: logic flaws, chained exploits, and authentication bypasses that do not appear in any CVE database. The CVE library augments this; specialists consult it for known issues. But the core engine is reasoning, not signature matching.

03

Does the platform get sharper over time?

Yes. After every engagement, the swarm reviews what just happened and rewrites six knowledge bases that feed dispatch decisions. The mechanism is Anthropic Dreaming (beta), a research capability that lets agents reflect on completed work and update their own context. Swarm runs it against six surfaces: environment signals (stack detection patterns the orchestrator uses to choose specialists), per-specialist lessons learned, orchestrator dispatch heuristics, the CVE curation that decides which disclosures matter for offensive work, the compromise-pattern catalogue refined against new incident reports, and a false-positive refinement loop that updates the environment model whenever a finding gets rejected on review.

The practical effect lands at the platform level: across all engagements, the orchestrator routes specialists faster, the reviewer rejects fewer false positives, and the chain analyst recognizes exploit-chain shapes it has seen before. None of this requires a release on our side; the knowledge bases compound passively between runs.

Dreaming runs only on completed engagements. Abstracted lessons (CVE relevance, exploit-chain shapes, dispatch heuristics) inform the platform; per-customer signals stay scoped to your organization at the same data-model layer that enforces engagement ownership, so cross-tenant leakage is structurally impossible.

04

Does Swarm produce a SOC 2-ready deliverable?

Yes. The deliverable is designed for SOC 2 Type 2 review and accepted by SOC 2 auditors as a compliance pen test deliverable. The report includes an executive summary, individual findings with CVSS scores, exploit chain analysis, and a validated proof-of-concept for every finding. Every finding is mapped to its OWASP category (the OWASP Top 10 plus the OWASP API, LLM, and Agentic Applications Top 10s) so your security questionnaire answers write themselves. The full audit trail (every specialist action receipted, filterable by specialist, traceable from any finding back to the request that surfaced it) gives your external auditor forensic-level transparency into methodology. A dedicated read-only Auditor role lets your compliance professional access the dashboard, report, and full audit trail directly.

05

Is Swarm safe for production environments?

Yes. The customer approves the scope before testing begins, and specialists operate only within it. No destructive operation runs without explicit per-action approval. Rate limits are enforced. Every request is logged and exported in the audit trail. Out-of-scope hosts are rejected at the tool layer before any HTTP call leaves the orchestrator.
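As a rough sketch of what rejecting out-of-scope hosts at the tool layer could look like, the check below runs before any request is sent. The `ScopeGuard` class, the `OutOfScopeError` exception, and the `destructive` and `approved` flags are hypothetical names chosen for illustration; the page does not describe Swarm's actual implementation.

```python
from urllib.parse import urlsplit


class OutOfScopeError(Exception):
    """Raised when a request targets a host outside the approved scope."""


class ScopeGuard:
    def __init__(self, approved_hosts):
        # Exact, case-insensitive host match; wildcards are not handled here.
        self.approved_hosts = {host.lower() for host in approved_hosts}

    def check(self, url, destructive=False, approved=False):
        """Return the URL if it may be requested, otherwise raise."""
        host = (urlsplit(url).hostname or "").lower()
        if host not in self.approved_hosts:
            raise OutOfScopeError(f"{url!r} is not in the approved scope")
        if destructive and not approved:
            raise PermissionError("destructive action requires per-action approval")
        return url


guard = ScopeGuard(["app.example.com"])
```

A tool wrapper would call `guard.check(url)` first and only then hand the URL to its HTTP client, so a rejected host never produces network traffic.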

06

What is the audit trail and what does my auditor see?

A traditional pen test delivers a PDF and a verbal debrief; the methodology lives in the consultant's head. Swarm logs every move every specialist makes (every HTTP request, every source grep, every file read, every finding submission, every exploit chain composition) and streams it to your dashboard as the engagement runs. Hand the full record to your SOC 2 auditor afterward. They filter by specialist, pivot the dataset, and trace any finding in the report back to the exact tool call that surfaced it. Methodology that proves itself, not a summary that asks to be trusted.
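To make the filter, pivot, and trace steps concrete, here is a minimal sketch over a few records modeled on the sample audit trail earlier on this page. The field names are assumptions, and linking a finding to the latest request by the same specialist is a simplifying heuristic for this example; an exported trail may carry an explicit link instead.

```python
from collections import Counter

# Hypothetical records; field names are assumed, values mirror the sample feed.
TRAIL = [
    {"seq": 146, "specialist": "broken-access", "action": "http_request",
     "detail": "POST /api/role/upgrade", "result": "403"},
    {"seq": 147, "specialist": "broken-access", "action": "http_request",
     "detail": "POST /api/role/upgrade -H X-Forwarded-User: admin", "result": "200"},
    {"seq": 148, "specialist": "broken-access", "action": "submit_finding",
     "detail": "privilege bypass via X-Forwarded-User", "result": "critical"},
    {"seq": 149, "specialist": "chain", "action": "submit_finding",
     "detail": "CHAIN-2 IDOR + role bypass = full takeover", "result": "critical"},
]


def by_specialist(trail, specialist):
    """Filter the trail down to one specialist's actions."""
    return [r for r in trail if r["specialist"] == specialist]


def trace_finding(trail, finding_seq):
    """Return the latest http_request by the same specialist before the finding."""
    finding = next(r for r in trail if r["seq"] == finding_seq)
    candidates = [
        r for r in trail
        if r["seq"] < finding_seq
        and r["specialist"] == finding["specialist"]
        and r["action"] == "http_request"
    ]
    return max(candidates, key=lambda r: r["seq"], default=None)


# Pivot: how many actions of each kind did each specialist take?
actions_per_specialist = Counter((r["specialist"], r["action"]) for r in TRAIL)
```

Here `trace_finding(TRAIL, 148)` returns record 147, the request that carried the `X-Forwarded-User` header.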

07

What stacks does Swarm cover?

Swarm specialists work against any modern web stack: Node, Python, Go, Ruby, Elixir, JVM, .NET, PHP. Coverage extends across every major identity provider too: Clerk, Auth0, Okta, Stytch, Cognito, Firebase, Supabase, and custom IdPs. The orchestrator fingerprints your stack during recon and dispatches the appropriate specialists automatically. AI / LLM and MCP server testing kicks in when those surfaces are detected, so you do not configure specialist-by-specialist; the swarm reads the application and routes work accordingly.
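The routing idea can be pictured as a lookup from detected surfaces to specialists. This is a deliberately simplified sketch: the surface labels and the rule table are invented for illustration, the specialist names are borrowed from examples elsewhere on this page, and the real orchestrator is described as reasoning about the application rather than consulting a static table.

```python
# Hypothetical surface labels mapped to specialist names used as examples on this page.
DISPATCH_RULES = {
    "web": ["auth", "broken_access", "idor"],
    "llm_endpoint": ["prompt_injection"],
    "mcp_server": ["mcp_authz"],
}

# Assumed to run on every engagement so findings can be composed into chains.
ALWAYS = ["chain_analyst"]


def dispatch(detected_surfaces):
    """Return an ordered, de-duplicated list of specialists for the detected surfaces."""
    selected = []
    for surface in detected_surfaces:
        for name in DISPATCH_RULES.get(surface, []):
            if name not in selected:
                selected.append(name)
    for name in ALWAYS:
        if name not in selected:
            selected.append(name)
    return selected
```

With this table, `dispatch(["web"])` leaves out `prompt_injection`, while `dispatch(["web", "llm_endpoint"])` includes it, which mirrors the "kicks in when those surfaces are detected" behavior described above.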

08

Can I integrate Swarm findings into Claude Code, Cursor, or another MCP client?

Yes. Mint a per-engagement Model Context Protocol token from the dashboard, plug it into Claude Code, Cursor, or any MCP-compatible client, and your team's editor surfaces Swarm findings, the source files the specialists already pulled, and a finding-status update tool in one place. Seven curated tools cover read access to findings and repositories plus the single write path of marking a finding remediated. Tokens are scoped to a single engagement and revoked with one click; nothing in the token can touch another engagement.

The intended workflow: an engagement closes, your engineers open the report inside Claude Code, fetch each finding's full evidence inline, write the fix against the source the specialists already read, and mark the finding remediated from the editor. The 30-day free retest then validates the fix without a separate purchase or scoping call.

The service token is stamped with a developer role: reads plus finding-status updates only. It cannot run engagements, edit scope, change billing, or reach another organization's data.
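The token rules described here (one engagement, a developer role, reads plus finding-status updates, one-click revocation) can be sketched as a small permission check. The `ServiceToken` type, the `is_allowed` function, and the action names are hypothetical; the page does not list the seven tools by name.

```python
from dataclasses import dataclass

# Assumed action names standing in for "reads plus finding-status updates".
DEVELOPER_ACTIONS = frozenset(
    {"read_finding", "read_repository", "update_finding_status"}
)


@dataclass(frozen=True)
class ServiceToken:
    engagement_id: str
    role: str = "developer"
    revoked: bool = False


def is_allowed(token, action, engagement_id):
    """Allow only developer-role actions on the token's own engagement."""
    if token.revoked:
        return False
    if token.engagement_id != engagement_id:
        return False  # a token never reaches another engagement
    if token.role != "developer":
        return False
    return action in DEVELOPER_ACTIONS


token = ServiceToken(engagement_id="0a9b3")
```

Under these assumptions, `is_allowed(token, "update_finding_status", "0a9b3")` is true, while running an engagement, editing scope, or touching a different engagement ID is refused.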

09

How much does a penetration test cost?

A Swarm engagement costs $4,995 flat: one price per engagement, no hourly billing, no scope negotiation. Human pen test firms typically charge $15,000 to $50,000 per engagement and take two to four weeks. The full deliverable (structured report, audit trail of every specialist action, validated proof-of-concept for every finding, and a free retest within 30 days of remediation) is included. An annual tier is available at $49,995 per year for organizations running multiple engagements per year (audit prep, post-incident validation, new-feature security review, multi-product testing).

Get started.

ENTER YOUR DOMAIN. SWARM MAPS YOUR ATTACK SURFACE IN JUST A FEW MINUTES.

No card. Free preview.