Question 1

What is AI penetration testing?

Accepted Answer

AI penetration testing covers attack surfaces specific to AI, LLM, and agentic applications: prompt injection (direct, indirect, tool-mediated, browser-mediated), MCP servers, managed agent platforms, vector and embedding stores, and vibe-coded apps shipped on Lovable, v0, Bolt.new, Replit Agent, Cursor, or Claude Artifacts. Swarm covers the full OWASP LLM Top 10 (2025) and OWASP Top 10 for Agentic Applications (2026) with dedicated specialists per category. The deliverable is identical to the SaaS engagement: structured report, audit-trail CSV, validated proof-of-concept for every Critical and High, and a free retest within 30 days.

Question 2

Does Swarm test MCP servers?

Accepted Answer

Yes. mcp_specialist baselines tool descriptions on first contact, diffs them across the engagement, and tests for tool-description rug-pull (the Invariant Labs Supabase Cursor disclosure pattern). The CVE-2025-6514 mcp-remote OAuth RCE class is in the daily-updated CVE library and consulted at runtime. Tool and resource authorization, schema injection, and cross-tool prompt-injection chains all sit under the same specialist.

Question 3

Does Swarm cover the OWASP LLM Top 10?

Accepted Answer

Full coverage on 9 of 10 categories. Prompt Injection (LLM01), Sensitive Information Disclosure (LLM02), Supply Chain (LLM03), Data and Model Poisoning (LLM04), Improper Output Handling (LLM05, includes EchoLeak / CVE-2025-32711 class), Excessive Agency (LLM06), System Prompt Leakage (LLM07), Vector and Embedding Weakness (LLM08), Unbounded Consumption (LLM10). Misinformation (LLM09) is explicitly out of scope: that is content quality, not a security category.

Question 4

What is a vibe-coded app pentest?

Accepted Answer

A focused audit for apps shipped fast on Lovable, v0, Bolt.new, Replit Agent, Cursor, or Claude Artifacts. The patterns are repeatable: OpenAI / Anthropic / Stripe / Supabase keys in client bundles, Supabase tables with permissive RLS, admin pages gated by client-side guards only, server-side auth missing on key endpoints, and SQLi in routes the model wrote without parameterizing. Swarm dispatches a vibe-coded-app specialist on top of the standard recon and access-control specialists, and the report ships with the same receipts and full audit trail.

Question 5

Does Swarm cover the OWASP Top 10 for Agentic Applications?

Accepted Answer

Full coverage on 9 of 10 categories: Agent Goal Hijack (ASI01), Tool Misuse and Exploitation (ASI02), Identity and Privilege Abuse (ASI03), Agentic Supply Chain (ASI04, includes CVE-2025-6514), Unexpected Code Execution (ASI05), Memory and Context Poisoning (ASI06), Inter-Agent Communication (ASI07), Cascading Failures (ASI08), Human-Agent Trust Exploitation (ASI09). Rogue Agents (ASI10) is partial: detecting unsanctioned agent fleets is in scope where they are reachable from the customer-authorised target, not where they live entirely outside it.

Question 6

What is prompt injection and how is it tested?

Accepted Answer

Prompt injection redirects an LLM's behavior through input it ingests. Swarm tests four classes with dedicated specialists: direct (user message), indirect (retrieved content like a document or webpage), tool-mediated (tool output that the model treats as instruction), and browser-mediated (page content the agent navigates to). Every successful injection lands in the audit trail with the exact request that triggered it; chains where one injection enables a tool call live as their own finding under the chain analyst.

Question 7

Does Swarm test vector stores and RAG pipelines?

Accepted Answer

Yes. Vector-DB authorisation, embedding-collision attacks, RAG-ingest paths, persistent-memory injection. OWASP LLM08 (Vector and Embedding Weakness) and ASI06 (Memory and Context Poisoning) are both fully covered. The poisoning vectors that activate weeks later (an attacker plants a payload in a document the swarm later retrieves) are covered by the indirect-injection specialist.

Question 8

What is the EchoLeak (CVE-2025-32711) class of vulnerability?

Accepted Answer

EchoLeak is an Improper Output Handling vulnerability: LLM output is rendered without sanitisation, leading to XSS or SQL injection downstream of the model. It maps to OWASP LLM05. Swarm dispatches a dedicated output-handling specialist that probes every LLM-rendered surface for execution sinks, attaches the evidence row that proved it, and references the CVE class in the finding.

Question 9

How does Swarm differ from a manual AI red team?

Accepted Answer

A manual AI red team is a small group of humans, two-to-four-week timeline, $30,000 to $80,000 typical, deliverable is a PDF whose methodology lives in the consultants' heads. Swarm runs the same OWASP LLM and Agentic coverage in roughly two hours for $4,995 flat, with every specialist action receipted in the audit trail your auditor reads alongside the report. Bespoke red team work (sophisticated social engineering, multi-month adversarial scenarios) still belongs with a senior firm; the standard OWASP-aligned annual AI security audit belongs with Swarm.

Prompt injection.Tool rug-pull.Memory poisoning.

One swarm. Four phases.

Recon

Triage

Exploit

Report

Receipts on every finding.

One number. Read the receipts.

What buyers ask. Receipts attached.

AI Penetration Testing

Prompt injection.Tool rug-pull.Memory poisoning.

Recon

Triage

Exploit

Report