Contact centers under siege: when 1 in 127 calls is a deepfake

One in every 127 calls reaching a contact center today is fraudulent. That sobering figure, drawn from recent industry research, captures how dramatically the economics of voice fraud have shifted. What once required professional impersonators and hours of preparation can now be assembled in minutes from a three-second audio clip lifted from a podcast, voicemail, or social post. Contact centers — long treated as the human checkpoint for high-stakes account changes — have become the favorite point of entry for AI-enabled fraudsters, and most organizations remain badly unprepared.
The voice channel has become the path of least resistance
Over the past decade, fraud teams have hardened their digital channels with device fingerprinting, behavioral biometrics, multi-factor authentication, and machine-learning-based transaction monitoring. As those defenses tightened, criminals took the path of least resistance and pivoted to the voice channel — where controls remain softer and the human element is exploitable. Voice phishing (vishing) attacks surged 442% in 2025 alone, and contemporary industry data shows 86% of contact center leaders now identify deepfake voice fraud as a top concern, while 66% lack confidence in their organization’s ability to detect it. The result is a yawning gap between threat awareness and operational capability.
Why IVR, KBA, and agent judgment can no longer carry the load
Most contact centers still rely on a familiar authentication stack: an interactive voice response (IVR) system, knowledge-based authentication (KBA) using stored personal data, voiceprints established at enrollment, and finally a live agent applying judgment to suspicious calls. Every layer of that stack is now under measurable pressure.
- IVR speaker verification can be defeated consistently by a high-quality voice clone. Modern generative models replicate vocal tract characteristics convincingly enough to pass legacy spectral checks.
- KBA challenge questions are undermined by the volume of personal data already exposed in past breaches and sold for cents on the dollar in dark-web marketplaces.
- Live agent judgment is the most pressured layer of all. Independent testing has measured human accuracy at identifying high-quality synthetic speech as low as 24.5% — well below the 50% expected from random guessing on a binary decision.
In most reported cases, attackers use the cloned voice to slip past IVR authentication, then convince agents to update the email address, phone number, or shipping address attached to the account — quietly seizing control without ever triggering an account-takeover alert.
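The failure mode described above is structural: the stack is a serial gate, so once a cloned voice clears the voiceprint check, every downstream layer inherits that trust. A minimal sketch of the pattern (the layer names, dict keys, and pass/fail logic here are illustrative, not drawn from any specific vendor's system):

```python
# Hypothetical illustration of a serial authentication stack.
# Each layer yields True (pass) or False (fail); a call that
# passes every layer is treated as fully authenticated.

def authenticate(call: dict) -> str:
    layers = [
        ("ivr_voiceprint", call["voiceprint_match"]),            # defeated by a clone
        ("kba_questions", call["kba_correct"]),                  # answers bought from breach data
        ("agent_judgment", not call["agent_suspicious"]),        # humans detect clones poorly
    ]
    for name, passed in layers:
        if not passed:
            return f"rejected at {name}"
    return "authenticated"

# A cloned-voice attack that clears each layer in turn:
attack = {"voiceprint_match": True, "kba_correct": True, "agent_suspicious": False}
print(authenticate(attack))  # → authenticated
```

The point of the sketch is that no layer re-checks the signal itself: each one only gates on its own test, so a single convincing clone compromises the whole chain.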
Where the losses are landing
The financial picture confirms how widespread these attacks have become. Call center fraud generated 53,369 complaints and $1.9 billion in losses in 2024, and the trajectory has steepened sharply since. Synthetic voice fraud in insurance alone rose 475% year over year in 2025, while total deepfake-enabled fraud losses in the US are forecast by Deloitte to reach $40 billion by 2027.
The verticals taking the hardest hits are the ones whose business model depends on the voice channel:
- Banks and credit unions, where voice channels still process disputes, wires, and account servicing
- Insurance carriers, where claims fraud has industrialized rapidly
- Telecoms, where SIM-swap-related social engineering enables broader account takeover chains
- Healthcare providers, where medical record manipulation supports prescription and benefit fraud
The economics work decisively in the attacker’s favor. Deepfake-as-a-service offerings on dark-web marketplaces are now priced at $5–$15 per attempt — a price point that turns even modest payouts into highly profitable operations and lowers the technical bar to near zero.
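At those prices the arithmetic is stark. A rough expected-value sketch — only the $5–$15 per-attempt cost comes from the figures above; the success rate and average payout are hypothetical round numbers for illustration:

```python
# Illustrative attacker economics: expected profit per attempt.
# Cost is the midpoint of the $5-$15 dark-web price range cited above;
# success_rate and avg_payout are assumed, not sourced.

cost_per_attempt = 10.0   # midpoint of $5-$15 deepfake-as-a-service pricing
success_rate = 0.02       # assumed: 1 in 50 attempts results in account takeover
avg_payout = 2_500.0      # assumed: average value extracted per takeover

expected_profit = success_rate * avg_payout - cost_per_attempt
print(f"Expected profit per attempt: ${expected_profit:.2f}")  # → $40.00
```

Even under these conservative assumptions, every attempt returns several times its cost — which is why attack volume scales until something in the pipeline stops it.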
What real-time defense actually looks like
Defending the contact center is not a matter of bolting on another KBA question or capturing a longer voiceprint sample. It requires verifying that the audio signal itself is authentically human at the moment the call arrives. Effective real-time detection works on several fronts simultaneously:
- Acoustic and biological signature analysis that distinguishes an organic vocal tract from one synthesized by a generative model — including telltale artifacts that survive even in high-quality clones.
- Liveness scoring that flags replayed or partially synthesized audio before the call is routed to an agent.
- Cross-modal verification that pairs voice with face, where video is available, defeating attackers who rely on cloning a single modality.
- Continuous evaluation rather than one-time gating, so suspicious signals can interrupt a session even after the caller has cleared initial authentication.
Deployed correctly, these capabilities sit transparently in front of the IVR. Legitimate customers experience no added friction, while synthesized callers are flagged for additional review or rejected outright. Industry benchmarks indicate that real-time detection reduces voice fraud success rates by more than 45% when deployed at the front of contact center workflows, and combined cross-modal approaches push that figure higher still.
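In pseudocode terms, the front-of-workflow pattern described above looks something like the sketch below: per-detector suspicion scores are fused into a session risk score, re-evaluated on every audio window, and mapped to a routing decision. The signal names, weights, and thresholds are illustrative assumptions, not product defaults; real deployments tune them against live traffic.

```python
# Illustrative sketch of continuous, multi-signal deepfake screening
# in front of an IVR. Scores are in [0, 1]; higher = more suspicious.
# Weights and thresholds below are hypothetical.

REVIEW_THRESHOLD = 0.5   # route to step-up verification / human review
REJECT_THRESHOLD = 0.8   # terminate or quarantine the call

WEIGHTS = {"acoustic": 0.4, "liveness": 0.4, "cross_modal": 0.2}

def risk_score(signals: dict) -> float:
    """Weighted fusion of per-detector suspicion scores."""
    return sum(WEIGHTS[k] * signals.get(k, 0.0) for k in WEIGHTS)

def route(signals: dict) -> str:
    score = risk_score(signals)
    if score >= REJECT_THRESHOLD:
        return "reject"
    if score >= REVIEW_THRESHOLD:
        return "review"
    return "pass"

# Continuous evaluation: re-score on every audio window, so a session
# can still be interrupted after the caller clears initial checks.
windows = [
    {"acoustic": 0.1, "liveness": 0.1, "cross_modal": 0.0},  # clean start
    {"acoustic": 0.9, "liveness": 0.8, "cross_modal": 0.7},  # synthesis artifacts appear
]
for w in windows:
    print(route(w))  # → pass, then reject
```

The design choice worth noting is the two-threshold split: legitimate callers who score low see no friction at all, while the ambiguous middle band gets routed to additional review rather than hard rejection.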
The bottom line for fraud and CX leaders
The voice channel is not going away. Customers still want to call, and the trust signals embedded in conversation — tone, hesitation, familiarity — remain valuable when they are real. The challenge for contact center and fraud leaders is to verify that they are real on every call, every time, without inserting friction that pushes customers to a competitor. Treating deepfake detection as a contact-center-specific problem — not a generic fraud control — is the shift that separates organizations getting ahead of this trend from those still relying on a 2022 stack.
Corsound AI’s Deepfake Detect provides real-time detection of synthesized voice and video built for contact-center scale, with no enrollment required and continuous updates as generative models evolve. Talk to our team about hardening your voice channel before the next attack lands.
See Corsound AI Voice Intelligence In Action

