
Deepfake and Impersonation Fraud Security for Executives

Deepfake audio and video are now being used to defraud companies and impersonate executives. Security professionals explain the threat and the protective measures available.

7 May 2026

Written by James Whitfield, Senior Security Consultant

In February 2024, a finance officer at a multinational firm in Hong Kong transferred HKD 200 million – approximately USD 25 million – after participating in what appeared to be a video conference with the company’s CFO and several senior colleagues. All of them were deepfakes. The officer had initially been suspicious of a phishing email but was reassured by the video call. He made the transfer. By the time the fraud was confirmed, the money was gone.

This was not an outlier. Deepfake fraud has been documented since 2019, when the chief executive of a UK energy company transferred EUR 220,000 after receiving a phone call from what he believed was the chief executive of the firm’s German parent company. The voice was an AI clone. The fraud took minutes.

The technology is not exotic. Voice cloning tools capable of producing convincing output from as little as sixty seconds of sample audio are commercially available and, in some cases, free to use. Video deepfake quality has improved dramatically with diffusion model technology. The barrier to conducting a deepfake fraud is no longer technical capability – it is the targeting intelligence to make the deception convincing.

This article addresses the threat model, the technical and procedural defences available, and the executive protection implications of a threat that is growing in frequency and sophistication.

How the Fraud Works

Deepfake fraud against companies follows a consistent methodology. The first phase is reconnaissance. The attacker identifies a target company with sufficient public information about its leadership team – company website, LinkedIn, Companies House filings, press coverage – to make an impersonation convincing. They identify the target employee: typically a finance officer with payment authority, someone senior enough to authorise a transfer but junior enough that they might not question a direct instruction from the CEO or CFO.

The second phase is material acquisition. Voice cloning requires audio samples of the person being impersonated. For a senior executive who has spoken at conferences, participated in earnings calls, or appeared in video interviews, this material is often freely available in quantity. Sixty to ninety seconds is a workable minimum for current voice cloning tools; several minutes of training audio produces substantially better output. Video deepfakes require a larger image and video dataset, but the same principle applies: public appearances are the source.

The third phase is the attack itself. The fraudster contacts the target employee by phone or video call, using the cloned voice and, in more sophisticated cases, real-time video synthesis to present an image of the impersonated executive. The communication is designed to be urgent, confidential, and outside normal process – “we have a confidential acquisition, this must be processed today, do not discuss with others.” The social engineering pressure is as important as the technical deception.

The fourth phase, if the fraud succeeds, is rapid movement of the transferred funds through multiple accounts and jurisdictions to obstruct recovery. The window between transfer and detection is typically several hours to a day; during this time, banks can sometimes execute a recall. Once the money has been moved through a cryptocurrency exchange or a jurisdiction with limited enforcement cooperation, recovery is near-impossible.

The Scale of the Problem

The FBI’s Internet Crime Complaint Center (IC3) Annual Report for 2023 recorded USD 2.9 billion in business email compromise (BEC) losses in the United States, one of the highest-value cybercrime categories by financial loss. BEC covers traditional email fraud and increasingly includes voice and video impersonation. The FCA has issued specific warnings to UK regulated firms about voice cloning attacks on authentication systems.

The NCSC’s 2024 Annual Review cited AI-enabled fraud – including voice synthesis and deepfake media – as a growing threat category. Europol’s 2024 Internet Organised Crime Threat Assessment (IOCTA) identified deepfake fraud as an emerging priority.

These figures do not describe rare, highly sophisticated attacks confined to large corporations. Voice cloning tools that produce convincing audio are available as commercial services with no meaningful access controls. The attack requires social engineering capability and targeting research – skills that criminal groups already possess from years of BEC and vishing operations. Deepfake technology adds a layer of credibility to the attack, not a qualitative change in criminal method.

Why Executives Are the Primary Target

Deepfake fraud targets executives for two reasons. First, impersonating a senior executive creates authority that overrides normal financial controls. An instruction that appears to come from the CEO or CFO carries a presumption of legitimacy that a junior employee will typically not challenge – particularly when the instruction is framed as urgent and confidential. Second, senior executives have the highest volume of public audio and video available for training.

An executive who has spoken at industry conferences, given media interviews, participated in recorded earnings calls, appeared on company video content, or joined podcasts has created a training dataset entirely in the public domain. Security awareness of this dynamic does not require executives to stop all public communication. It does require that the communications profile be factored into the threat assessment, and that specific channels – those used for financial authorisations, board decisions, and HR matters – have robust verification controls that do not rely solely on voice or face recognition.

LinkedIn profiles and company website biographies also contribute. They tell the attacker the names of colleagues, reporting lines, and organisational context – information that makes the impersonation conversation more convincing because the attacker can reference known relationships and ongoing business matters.

Technical Defences

The primary technical defence against deepfake audio is liveness detection. Standard voice authentication systems that match against a stored voiceprint are vulnerable to a good voice clone. Advanced systems use dynamic challenge-response: they ask the user to repeat a random phrase they could not have pre-recorded, or they use microexpression analysis and 3D depth detection to confirm a live human presence. Providers including iProov, ID R&D, and Pindrop have deployed these capabilities.
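
As a rough illustration of the dynamic challenge-response principle – a sketch, not any vendor’s actual implementation – the following generates a random phrase the caller could not have pre-recorded and accepts the session only if the live transcript matches. The word list is illustrative and the speech-to-text step is left out.

    import secrets

    # Illustrative word list; real systems draw from larger, curated vocabularies.
    WORDS = ["harbour", "granite", "velvet", "orchid", "falcon",
             "copper", "meadow", "lantern", "summit", "willow"]

    def issue_challenge(n_words: int = 4) -> str:
        """Generate a random phrase the caller could not have pre-recorded."""
        return " ".join(secrets.choice(WORDS) for _ in range(n_words))

    def verify_response(challenge: str, spoken_transcript: str) -> bool:
        """Accept the session only if the live response matches the challenge.
        A real system would also analyse the raw audio for synthesis and
        replay artefacts, not just compare text."""
        normalise = lambda s: " ".join(s.lower().split())
        return normalise(spoken_transcript) == normalise(challenge)

    # Usage: issue the challenge, capture the caller's audio, run it through
    # a speech-to-text step (out of scope here), then verify the transcript.
    print(issue_challenge())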

Any authentication system that is reachable by a synthetic voice or image should be assessed for deepfake resilience: banking voice authentication, automated financial approval systems triggered by voice command, and video-based identity verification. The assessment should not be an internal review – it should be a penetration test using currently available deepfake tools.

Detection tools for audio and video deepfakes – Sensity, Truepic, Resemble Detect – use technical analysis to identify markers of AI synthesis. These tools are useful for post-incident forensic analysis and for validating suspicious media, but they are not real-time defences against live fraud calls. The reliability of detection tools varies with the quality of the synthetic media; high-quality deepfakes are more difficult to detect algorithmically.

Encrypted communications platforms with identity verification (Signal, Wickr Enterprise, Microsoft Teams with verified identity) reduce the attack surface compared to unencrypted calls and open video conferencing. The attacker needs an unencrypted channel that allows a synthetic voice or video to be presented convincingly; secure platforms with session validation make this harder.

Procedural Controls: The Most Reliable Layer

The most reliable defence against deepfake fraud is not technical – it is procedural. A mandatory out-of-band verification protocol for any unusual financial instruction removes the attack vector regardless of how convincing the deepfake is.

The protocol works as follows. Any request for a financial transfer above a defined threshold that arrives outside normal authorisation channels – or any request that is framed as unusual, urgent, and confidential – must be verified by calling the requester on a known, pre-established direct number. Not a number provided in the suspicious communication. Not a callback to the inbound call. A number from the corporate directory that is known to be correct.
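
A minimal sketch of how this rule might be encoded in a payment workflow follows; the threshold value, field names, and directory contents are illustrative assumptions, not production values.

    from dataclasses import dataclass

    OUT_OF_BAND_THRESHOLD = 25_000  # illustrative threshold, not a recommendation

    @dataclass
    class TransferRequest:
        amount: float
        claimed_requester: str  # the executive the instruction claims to come from
        channel: str            # "email", "phone", "video_call", "payment_system"
        framed_as_urgent: bool

    def requires_out_of_band_check(req: TransferRequest) -> bool:
        """Flag any request over threshold, or any urgent request arriving
        outside the normal authorisation channel, for callback verification."""
        outside_normal_channel = req.channel != "payment_system"
        return req.amount >= OUT_OF_BAND_THRESHOLD or (
            outside_normal_channel and req.framed_as_urgent)

    def directory_number(name: str) -> str:
        """Look up the claimed requester's pre-established direct number from
        the corporate directory, never from the suspicious communication."""
        corporate_directory = {"CFO": "+44 20 7946 0958"}  # placeholder data
        return corporate_directory[name]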

This protocol must be documented, communicated to all finance and operations staff with payment authority, and tested. Periodic simulated deepfake attempts – a staff training exercise using voice cloning tools, with known test parameters and debrief – are a practical way to validate whether the protocol is actually being followed.

The second control is dual authorisation: any transfer above a material threshold requires a second approver who is not the person who received the instruction. This is already best practice for standard financial controls. It applies equally to deepfake scenarios, because it is significantly harder to impersonate two individuals convincingly in the same operation.

The third control is a cooling-off provision for urgent requests: a mandatory delay of a specified time (commonly fifteen to thirty minutes) for any transfer framed as urgent by an inbound communication. This creates space for verification. Most deepfake frauds exploit urgency as a social engineering mechanism – a procedural delay breaks that mechanism.
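
To make the dual-authorisation and cooling-off controls concrete, here is an illustrative release gate; the thresholds and field names are assumptions for the sketch, not recommended values.

    import time

    DUAL_AUTH_THRESHOLD = 100_000   # illustrative "material threshold"
    COOLING_OFF_SECONDS = 30 * 60   # thirty-minute delay for urgent requests

    def may_release(amount: float, received_at: float, framed_as_urgent: bool,
                    approvers: set[str], instruction_recipient: str) -> bool:
        """Release funds only when both procedural controls are satisfied."""
        # Dual authorisation: at least one approver who is not the person
        # who received the instruction.
        independent_approvers = approvers - {instruction_recipient}
        if amount >= DUAL_AUTH_THRESHOLD and not independent_approvers:
            return False
        # Cooling-off: urgent inbound requests must wait out the mandatory
        # delay, creating the space in which verification happens.
        if framed_as_urgent and time.time() - received_at < COOLING_OFF_SECONDS:
            return False
        return True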

The Social Engineering Dimension

The technical sophistication of deepfake audio and video is only part of the attack. The other part is social engineering. The attacker designs the conversation to make the target feel unable to question, verify, or delay. Standard social engineering techniques documented in BEC attacks include: creating a sense of personal accountability (“you are the only person I can trust with this”), invoking confidentiality to prevent consultation (“do not discuss this with anyone else yet”), and applying time pressure (“this must be done in the next thirty minutes”).

These techniques work independently of whether the voice is synthetic or genuine. Training finance and operations staff to recognise and resist social engineering pressure is as important as technical authentication upgrades. The NCSC’s guidance on social engineering, and the FBI’s guidance on BEC, both contain practical training frameworks.

An employee who refuses to transfer funds without following the verification protocol – even when the voice sounds exactly like the CEO and the instruction sounds entirely plausible – is the most effective defence in the chain. Building a culture where that refusal is not just permitted but explicitly expected and supported is a management responsibility.

Post-Incident Response

If a deepfake fraud occurs – whether the transfer was made or not – the immediate response matters.

Preserve all evidence before taking any remediation action. The audio recording of the call, the video recording of the conference, the email chain that preceded it, and the bank transfer records must be secured and must not be altered. Report the incident to Action Fraud, the UK’s national reporting centre for fraud and cybercrime, and contact the relevant bank immediately. Banks can sometimes execute a recall if the receiving institution is notified within hours. Contact your cyber insurance provider before forensic steps are taken – insurers have specific preservation requirements.
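
One simple way to support the preservation requirement is to record a cryptographic hash of every evidence file at the moment of collection, so that any later alteration can be demonstrated. The sketch below uses only the Python standard library; it is an illustration, not a substitute for professional forensic handling.

    import hashlib, json, time
    from pathlib import Path

    def preserve_manifest(evidence_dir: str, manifest_path: str) -> None:
        """Record a SHA-256 hash and collection timestamp for each evidence
        file, so any later alteration can be demonstrated."""
        collected_at = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
        manifest = [
            {"file": str(p), "sha256": hashlib.sha256(p.read_bytes()).hexdigest(),
             "collected_at": collected_at}
            for p in sorted(Path(evidence_dir).rglob("*")) if p.is_file()
        ]
        Path(manifest_path).write_text(json.dumps(manifest, indent=2))

    # Usage (illustrative paths):
    # preserve_manifest("evidence/deepfake_call", "evidence/manifest.json")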

The forensic analysis of deepfake media can identify technical markers of AI synthesis that support the criminal investigation. These markers include specific artefacts in audio waveforms associated with text-to-speech synthesis, visual inconsistencies in video around hairlines, teeth, and background edges, and metadata anomalies in video files. A specialist digital forensics firm should conduct this analysis.

For the broader digital security context for executives, see the guide to executive digital security on international travel. The insider threat dimension – employees who may assist fraud by providing insider information about financial processes or personnel – is addressed in insider threat and corporate security.

Key Takeaways

1. Out-of-band verification is the single most effective defence

No technical control prevents a convincing deepfake from being created. The operational defence is procedural: require independent verification of any unusual financial instruction through a separate, pre-established communications channel. This means calling a known direct number – not a number provided in the suspicious communication – to confirm the instruction. This protocol must be mandatory, documented, and tested regularly. A surprised finance officer acting on a plausible CEO video call is the fraud’s attack vector.

2. Public audio and video exposure is the attacker’s source material

Every recorded conference talk, press interview, earnings call, and investor day appearance that an executive has made is potential training data for a voice or face clone. This does not mean executives should never speak publicly. It means their public communications profile should be factored into their threat assessment, and that particularly sensitive communications – board calls, financial approval instructions, HR decisions – should occur in channels with strong verification protocols, not open video conferencing platforms.

3. Authentication systems must be deepfake-resilient

Any authentication system that could be reached by a synthetic voice or image – voice banking authentication, video-based identity verification, automated approval systems triggered by voice command – should be assessed for deepfake resilience. The key capability to look for is liveness detection: does the system challenge the user in a way that distinguishes a live person from a playback? NCSC guidance on authentication hardening is a useful reference. Biometric-only systems without liveness detection should be supplemented with additional factors.

4. The fraud uses social engineering, not just technology

Deepfake audio and video are most effective when combined with context that makes the instruction plausible. Criminals research the company before the call: they may know the names of senior staff, the current financial situation, and even specifics from publicly available information. The video or call is designed to seem urgent, unusual, and confidential – classic social engineering pressure tactics. Training that teaches finance teams to recognise social engineering pressure, independent of the technical deepfake element, addresses the human dimension of the attack.

5. Post-incident response must include forensic evidence preservation

If a deepfake fraud attempt occurs – successful or not – the audio or video recording should be preserved as forensic evidence. Do not delete. Report to Action Fraud (UK) and notify the relevant bank immediately. Contact your cyber insurer before taking any remediation action that might alter evidence. The forensic analysis of deepfake media can establish technical markers of AI generation that support the criminal investigation and, critically, the cyber insurance claim.

Frequently Asked Questions

How does deepfake executive fraud work?

In the most significant documented cases, criminals use AI-generated audio or video to impersonate a senior executive during a phone call or video conference, instructing a finance employee to transfer funds. In the Hong Kong case of February 2024, a finance officer was directed via a deepfake video conference to transfer HKD 200 million (approximately USD 25 million) after seeing and hearing what appeared to be their CFO and other company officials. In a UK case in 2019, a CEO’s voice was cloned to direct a EUR 220,000 transfer. The fraud combines social engineering with synthetic media technology.

Can deepfakes defeat biometric authentication?

Basic biometric systems – those relying solely on voice or facial matching without liveness detection – can be vulnerable to high-quality deepfake attacks. Advanced liveness detection systems (used by providers including iProov and ID R&D) use random challenge-response mechanisms, 3D depth sensing, and microexpression analysis to detect synthetic media. Voice authentication systems that rely on a static voiceprint are more vulnerable than those using dynamic challenge phrases. Any authentication system that could potentially be reached by an AI-generated voice or image should be assessed for deepfake resilience.

What is the most effective defence against deepfake fraud?

The most reliable control is an out-of-band verification protocol: any request for financial transfer above a set threshold must be independently verified via a separate pre-established channel – ideally a known direct phone number, not an inbound call. This verification must be done regardless of how convincing the original instruction appeared. The protocol should be documented, trained, and tested with periodic simulated attempts. A second authoriser for large transfers provides an additional layer.

Is deepfake fraud a criminal offence in the UK?

Yes. Using deepfake technology to commit fraud falls under the Fraud Act 2006 (fraud by false representation, section 2), which carries up to ten years’ custody. The sharing of non-consensual intimate deepfakes was separately criminalised under the Online Safety Act 2023. The use of deepfakes in financial fraud cases is prosecuted under the Fraud Act; the NCSC and NCA have both published guidance on AI-enabled fraud, and the FCA has issued warnings to regulated firms about voice cloning attacks on authentication systems.

How do criminals obtain the material to create a deepfake of an executive?

Open-source material is the primary source. A senior executive who has spoken at industry conferences, given press interviews, appeared on YouTube recordings, or participated in podcast recordings has provided a substantial voice and video training dataset, entirely in the public domain. Sixty to ninety seconds of audio is generally sufficient for current voice cloning tools to produce convincing output. LinkedIn profiles, company websites, and social media accounts provide facial image datasets. Executives who have a high public profile are the most at risk.