Inside p1bot: A Vishing Platform Weaponizing ElevenLabs

An analysis of p1bot.io, a subscription vishing-as-a-service platform with ElevenLabs text-to-speech hardcoded directly into its attack workflow.

March 11, 2026

Ross Lazerowitz

Co-Founder and CEO

Inside p1bot: A Vishing Platform Weaponizing ElevenLabs

Commercial AI Has a Vishing Problem

The threat intelligence community has been sounding the alarm on AI-powered social engineering for over a year. OpenAI's quarterly disruption reports have documented threat actors using LLMs to craft phishing lures, generate fake resumes, and scale influence operations. Google's Mandiant team published research in 2024 showing how AI-powered voice spoofing has been incorporated into red team operations, demonstrating just how convincing synthetic voices have become. Academic researchers have even built proof-of-concept vishing bots using off-the-shelf APIs (OpenAI's GPT for conversation, ElevenLabs for voice synthesis, Twilio for telephony) and demonstrated them against human subjects.

But there's a gap between research papers and the real criminal ecosystem. What does it look like when these capabilities aren't in a lab, but packaged into a product that anyone can subscribe to for $399 a month?

We found out. While analyzing the client-side code and network traffic of p1bot.io, a vishing-as-a-service platform, we discovered a fully productized attack toolkit with ElevenLabs' text-to-speech API hardcoded directly into the platform. Twenty-three ElevenLabs voice IDs ship with every account. Operators generate realistic IVR prompts in English, French, or Spanish, play them to victims mid-call, and capture whatever digits the victim presses (PINs, OTPs, account numbers) all in real time.

To our knowledge, this is one of the first documented instances of a commercial vishing platform embedding ElevenLabs as a core feature, moving beyond one-off misuse or research contexts into a subscription criminal service.

We reported our findings to the ElevenLabs security team, who responded promptly and collaborated with us to take action against the abusive accounts. Cyber defense is a team sport, and we thank them for their efforts.

The "Press 1" Scam: Why This Matters

Your phone rings. The caller ID shows your bank's number. An automated voice tells you there's been suspicious activity on your account and asks you to "press 1 to speak with a representative." You comply, and the moment you press that digit, you've handed control to a criminal sitting behind a dashboard, ready to extract your PIN, OTP, or account number one keystroke at a time.

These are "press 1" scams (sometimes called OTP bots or IVR fraud), and they've become one of the highest-volume voice phishing techniques in operation. The formula is effective because it's familiar: spoof a trusted caller ID, play a convincing automated prompt, and capture whatever digits the victim enters. The victim never speaks to a human. They think they're navigating a legitimate phone tree.

What makes this threat category especially dangerous now is the convergence of three trends: caller ID spoofing is trivial, AI-generated voices are indistinguishable from real ones, and platforms like p1bot have industrialized the entire workflow into a point-and-click SaaS product. No technical expertise or native voice speakers required. Just a crypto wallet and a Telegram account.

What p1bot Does

At its core, p1bot is a browser-based softphone designed for social engineering. It lets operators:

Spoof any caller ID when placing outbound calls
Generate realistic AI voice prompts using ElevenLabs TTS (23 voice options across English, French, and Spanish)
Place calls directly from the browser via WebRTC
Capture DTMF tones (the digits a victim presses) in real-time
Record calls for later review
Play pre-generated audio clips mid-call to simulate automated IVR systems

The platform targets victims in the US, Canada, and UK, and charges $399/month payable exclusively in Bitcoin, Litecoin, or Monero through OxaPay.

Inside the Dashboard

The p1bot interface is a dark-themed web application that puts the entire attack workflow within a few clicks.

The Softphone

p1bot Voice Calling interface showing the softphone with caller ID spoofing field, destination number field, and call controls

The Voice Calling page exposes the core of the operation: a "From (Caller ID)" field where operators enter any spoofed number, a "To" field for the victim's number, and a single "Call" button. The Call History tab logs every outbound call with status and duration. Note the "Events: Connected" indicator, which is the live Socket.io connection feeding captured DTMF digits back to the dashboard in real time.

AI Voice Generation

p1bot Generate TTS page showing ElevenLabs integration with voice selection and script text

The Generate TTS page is where operators craft their IVR lures using ElevenLabs' API. The default placeholder text, "Hello, thank you for calling. Press 1 to speak with a representative." - reveals the intended use case. Operators select from 23 voices (here, "Sarah - Soft, young American female"), generate a preview, and save clips to their library for use during live calls. This is commercial AI voice synthesis being directly weaponized for social engineering.

The Audio Library

p1bot Audio Library page showing file management for IVR playback

The Audio Library stores generated and uploaded audio clips for on-demand playback during live calls. During a call, operators trigger clips from this library (a fake verification prompt, a "please hold" message, a custom IVR menu) while monitoring the victim's DTMF responses in real time.

How Caller ID Spoofing Works

When an operator enters a spoofed number, the client injects four SIP headers simultaneously into the outbound INVITE:

X-Caller-ID: {spoofed_number}
P-Preferred-Identity: "{spoofed_number}" <sip:{spoofed_number}@{realm}>
Remote-Party-ID: "{spoofed_number}" <sip:{spoofed_number}@{realm}>;party=calling;privacy=off;screen=no
P-Asserted-Identity: "{spoofed_number}" <sip:{spoofed_number}@{realm}>

This shotgun approach ensures the spoofed identity propagates regardless of which header downstream carriers honor.

ElevenLabs Integration: AI Voices as Attack Infrastructure

The TTS feature is where p1bot crosses from a traditional vishing tool into something more concerning. The platform doesn't just support generic text-to-speech. It ships with a hardcoded catalog of ElevenLabs voice IDs: 15 English, 4 French, and 4 Spanish voices, each mapped to a real ElevenLabs voice profile. Users can also paste custom voice IDs, meaning anyone with an ElevenLabs account can extend the platform's voice capabilities.

The default placeholder text reveals the intended use case:

"Hello, thank you for calling. Press 1 to speak with a representative."

This is the classic IVR social engineering playbook, now upgraded with AI-generated voices that sound natural, warm, and trustworthy. The old robotic TTS that might tip off a savvy victim has been replaced with voices like "Sarah - Soft, young American female" that are designed to put people at ease.

The TTS workflow is straightforward:

Generate a clip with chosen voice and text
Save it to the Audio Library
During a live call, play clips on demand via the dashboard

We've seen open-source OTP bot kits on GitHub that wire together ElevenLabs and Twilio, and academic researchers demonstrated a fully automated AI vishing system (ViKing) using the same stack in a controlled study. But p1bot represents something different: a commercial, subscription-based platform where ElevenLabs is embedded as a first-class feature, not bolted on by individual operators. The integration is polished, the voice catalog is curated, and the workflow is designed to lower the barrier for anyone willing to pay.

After we shared our findings, the ElevenLabs security team moved quickly to investigate and take action. This kind of rapid, collaborative response between security researchers and AI vendors is critical, and it's exactly the model the industry needs as commercial AI tools increasingly appear in the criminal supply chain.

DTMF Capture

The platform captures victim keypad presses in real-time via Socket.io events (call:dtmf). Digits accumulate in a per-call map and display live in the dashboard. There's no automated IVR or call tree. The operator manually controls the call flow, playing audio clips and watching digits come in.

The DTMF send API supports precise timing control with beforeMs, betweenMs, durationMs, and afterMs parameters, useful for interacting with real automated systems on behalf of the operator.

The Tech Stack

Frontend

The dashboard is a Next.js 14.2.35 application using the App Router, built with:

React 18 with shadcn/ui components (completely unmodified default Slate theme)
Radix UI primitives, Tailwind CSS, Lucide icons
Zustand for state management, TanStack Query for server state
Sonner for toast notifications

Backend

Express.js origin server (confirmed via x-powered-by: Express header)
Prisma ORM with CUID v1 identifiers
Cookie-based authentication with silent token refresh
Socket.io for real-time events (call status, DTMF capture)
Sits behind Cloudflare (CDN, DNS, registrar, and likely Tunnel)

VoIP Infrastructure

Self-hosted Asterisk PBX running in a VM at 192.168.64.3:8088
The backend proxies WebSocket connections to Asterisk through localhost:4001/asterisk/ws
JsSIP library handles SIP registration and WebRTC call setup in the browser
Google STUN servers (stun.l.google.com:19302, stun1.l.google.com:19302) for NAT traversal
A console.log reference to an "Anveo bridge" strongly suggests Anveo Direct as the SIP trunk provider connecting to the PSTN

Payment and Registration

Registration: Via Telegram bot (@p1botstagebot in the staging environment we captured). Usernames follow the pattern u_{random} with @telegram.local email suffixes.
Pricing: $399/month (MONTHLY_399 plan)
Payment: Cryptocurrency only via OxaPay gateway: BTC (Bitcoin Mainnet), LTC (Litecoin Mainnet), XMR (Monero Mainnet)
Number provisioning: The platform offers phone number purchasing through its own API (/phone-numbers/available, /phone-numbers)

The 60+ Endpoint API Surface

The client-side code reveals a comprehensive REST API with over 60 endpoints spanning:

Auth: Login, logout, refresh, API key management
Calls: Create, list, delete, record, play audio, send DTMF, debug info
SIP: Credential provisioning, rotation, external trunk configuration
TTS: Voice listing, generation, preview, save
Audio: Upload (WAV/MP3/OGG/FLAC/AAC/M4A, 10MB max), library management, streaming
Membership: Checkout, payment status polling, recheck
P1 (Top-up): Summary, region selection, link requests, crypto checkout
Phone Numbers: Search available, purchase, list, delete
Wallet: Balance, transactions
Support: Ticket system with replies
Webhooks: CRUD with secret rotation
Organizations: Current org info and stats
Messages: Listing and retrieval

Infrastructure and OPSEC

The operator has taken some care to obscure the origin infrastructure:

Cloudflare everything: Domain registered through Cloudflare Registrar (2026-01-24), DNS through Cloudflare, CDN proxying all traffic. Likely using Cloudflare Tunnel, as no origin IP is exposed via Shodan or Censys.
No origin leak: All IPs in the HAR resolve to Cloudflare edge nodes. The Asterisk PBX sits on a private RFC 1918 address (192.168.64.3), unreachable from the internet.
Separate certificates: A Sectigo origin certificate (serial d83df698393d70fa78d5a6793131df2d) exists alongside the Cloudflare edge certificate (Google Trust Services), confirming a distinct origin server behind the proxy.

However, the client-side code leaks significant intelligence:

The full API surface and business logic
Internal infrastructure details (Asterisk IP, proxy port, SIP configuration)
Third-party service integrations (ElevenLabs voice IDs, OxaPay provider check, Anveo reference)
Debug logging that was never stripped for production

Vibecoded?

Strong indicators suggest the platform was built using AI-assisted coding (likely Claude via Cursor or a similar tool):

139 emoji-prefixed console.log statements throughout the codebase
132 box-drawing character debug blocks (a signature pattern of Claude-generated code)
Unmodified default shadcn/ui Slate theme where every CSS variable matches the defaults exactly
Overly verbose error messages that read like AI-generated diagnostic guides

The developer appears to have prompted an LLM to build the application and shipped it without cleaning up the debug output, which is precisely why we can extract so much intelligence from the client-side code. There's an irony here: AI was used to build the platform, AI powers its voice generation, and the AI-generated artifacts are what allowed us to tear it apart.

Intelligence Summary

Indicator	Value
Domain	p1bot.io / otp.p1bot.io
Registered	2026-01-24 via Cloudflare Registrar
Framework	Next.js 14.2.35, Express.js backend
PBX	Asterisk @ 192.168.64.3:8088
SIP Trunk	Likely Anveo Direct
TTS Provider	ElevenLabs (23 voices hardcoded)
Payment Gateway	OxaPay (BTC/LTC/XMR)
Pricing	$399/month
Registration	Telegram bot
Target Regions	US, Canada, UK
Origin Cert	Sectigo, serial d83df698393d70fa78d5a6793131df2d
Cloudflare Edge IPs	2606:4700:3032::ac43:8773, 2606:4700:3030::6815:6ea

The Bigger Picture: AI in the Criminal Supply Chain

p1bot isn't an anomaly. It's a signal. The same commercial AI services that enable legitimate products are being absorbed into criminal infrastructure. OpenAI's threat reports document LLMs being used to scale social engineering. Google's Mandiant team has demonstrated AI voice cloning in offensive operations. And now we're seeing ElevenLabs' voices integrated directly into a subscription vishing platform.

The pattern is consistent across every major AI vendor's threat reporting: attackers aren't developing novel AI capabilities. They're subscribing to the same commercial services as everyone else and bolting them onto established criminal playbooks. The barrier to entry drops with every new API.

This is why vendor collaboration matters. When we reported our findings to ElevenLabs, their security team responded quickly, investigated the abusive accounts, and took action. That kind of responsiveness, combined with the traceability features ElevenLabs has built into their platform, demonstrates that disruption is possible when researchers and vendors work together.

This research was conducted by the Mirage Security team. If you're interested in how your organization would fare against AI-powered vishing, smishing, and social engineering simulations, get in touch.