
Security Intelligence
Deepfake audio and video are now being used to defraud companies and impersonate executives. Security professionals explain the threat and the protective measures available.
Written by James Whitfield, Senior Security Consultant
Deepfake and Impersonation Fraud Security for Executives
In February 2024, a finance officer at a multinational firm in Hong Kong transferred HKD 200 million – approximately USD 25 million – after participating in what appeared to be a video conference with the company’s CFO and several senior colleagues. All of them were deepfakes. The officer had initially been suspicious of a phishing email but was reassured by the video call. He made the transfer. By the time the fraud was confirmed, the money was gone.
This was not an outlier. Deepfake fraud has been documented since 2019, when a UK energy company CEO transferred EUR 220,000 after receiving what he believed was a phone call from his German parent company’s chief executive. The voice was an AI clone. The fraud took minutes.
The technology is not exotic. Voice cloning tools capable of producing convincing output from as little as sixty seconds of sample audio are commercially available and, in some cases, free to use. Video deepfake quality has improved dramatically with diffusion model technology. The barrier to conducting a deepfake fraud is no longer technical capability – it is the targeting intelligence to make the deception convincing.
This article addresses the threat model, the technical and procedural defences available, and the executive protection implications of a threat that is growing in frequency and sophistication.
How the Fraud Works
Deepfake fraud against companies follows a consistent methodology. The first phase is reconnaissance. The attacker identifies a target company with sufficient public information about its leadership team – company website, LinkedIn, Companies House filings, press coverage – to make an impersonation convincing. They identify the target employee: typically a finance officer with payment authority, someone senior enough to authorise a transfer but junior enough that they might not question a direct instruction from the CEO or CFO.
The second phase is material acquisition. Voice cloning requires audio samples of the person being impersonated. For a senior executive who has spoken at conferences, participated in earnings calls, or appeared in video interviews, this material is often freely available in quantity. Sixty to ninety seconds is a workable minimum for current voice cloning tools; several minutes of training audio produces substantially better output. Video deepfakes require a larger image and video dataset but the same principle applies: public appearances are the source.
The third phase is the attack itself. The fraudster contacts the target employee by phone or video call, using the cloned voice and, in more sophisticated cases, real-time video synthesis to present an image of the impersonated executive. The communication is designed to be urgent, confidential, and outside normal process – “we have a confidential acquisition, this must be processed today, do not discuss with others.” The social engineering pressure is as important as the technical deception.
The fourth phase, if the fraud succeeds, is rapid movement of the transferred funds through multiple accounts and jurisdictions to obstruct recovery. The window between transfer and detection is typically several hours to a day; during this time, banks can sometimes execute a recall. Once the money has been moved through a cryptocurrency exchange or a jurisdiction with limited enforcement cooperation, recovery is near-impossible.
The Scale of the Problem
The FBI’s Internet Crime Complaint Center (IC3) Annual Report for 2023 recorded USD 2.9 billion in business email compromise (BEC) losses in the United States, placing BEC among the highest-value cybercrime categories by financial loss. BEC began as traditional email fraud and increasingly incorporates voice and video impersonation. The FCA has issued specific warnings to UK regulated firms about voice cloning attacks on authentication systems.
The NCSC’s 2024 Annual Review cited AI-enabled fraud – including voice synthesis and deepfake media – as a growing threat category. Europol’s 2024 Internet Organised Crime Threat Assessment (IOCTA) identified deepfake fraud as an emerging priority.
These figures do not describe only rare, highly sophisticated attacks on large corporations. Voice cloning tools that produce convincing audio are available as commercial services with no meaningful access controls. The attack requires social engineering capability and targeting research – skills that criminal groups already possess from years of BEC and vishing operations. Deepfake technology adds a layer of credibility to the attack, not a qualitative change in criminal method.
Why Executives Are the Primary Target
Deepfake fraud targets executives for two reasons. First, impersonating a senior executive creates authority that overrides normal financial controls. An instruction that appears to come from the CEO or CFO carries a presumption of legitimacy that a junior employee will typically not challenge – particularly when the instruction is framed as urgent and confidential. Second, senior executives have the highest volume of public audio and video available for training.
An executive who has spoken at industry conferences, given media interviews, participated in recorded earnings calls, appeared on company video content, or joined podcasts has created a training dataset entirely in the public domain. Security awareness of this dynamic does not require executives to stop all public communication. It does require that the communications profile be factored into the threat assessment, and that specific channels – those used for financial authorisations, board decisions, and HR matters – have robust verification controls that do not rely solely on voice or face recognition.
LinkedIn profiles and company website biographies also contribute. They tell the attacker the names of colleagues, reporting lines, and organisational context – information that makes the impersonation conversation more convincing because the attacker can reference known relationships and ongoing business matters.
Technical Defences
The primary technical defence against deepfake audio is liveness detection. Standard voice authentication systems that match against a stored voiceprint are vulnerable to a good voice clone. Advanced systems use dynamic challenge-response: they ask the user to repeat a random phrase they could not have pre-recorded, or they use microexpression analysis and 3D depth detection to confirm a live human presence. Providers including iProov, ID R&D, and Pindrop have deployed these capabilities.
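To illustrate the challenge-response principle described above, the sketch below generates a random phrase a caller could not have pre-recorded and checks a transcribed response against it. The wordlist, function names, and matching logic are illustrative assumptions, not any vendor's API; a production liveness system would also analyse acoustics, timing, and presentation artefacts, which text matching alone cannot do.

```python
import secrets

# Illustrative wordlist; a real system would draw from a much larger pool
# so that challenge phrases cannot be anticipated or pre-recorded.
WORDLIST = [
    "harbour", "granite", "velvet", "compass", "lantern",
    "orchard", "quarry", "saffron", "timber", "willow",
]


def generate_challenge(num_words: int = 4) -> str:
    """Generate a random phrase the caller could not have pre-recorded."""
    return " ".join(secrets.choice(WORDLIST) for _ in range(num_words))


def _normalise(s: str) -> list[str]:
    # Case- and whitespace-insensitive comparison of spoken words.
    return s.lower().split()


def verify_response(challenge: str, transcript: str) -> bool:
    """Check that the transcribed spoken response matches the challenge.
    A real liveness check would also analyse response latency and audio
    artefacts; text agreement alone is necessary but not sufficient."""
    return _normalise(transcript) == _normalise(challenge)


challenge = generate_challenge()
print(f"Please repeat: {challenge}")
```

The point of the random phrase is that a pre-rendered deepfake clip cannot contain it; an attacker would need real-time synthesis, which raises the technical bar and introduces detectable latency.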
Any authentication system that is reachable by a synthetic voice or image should be assessed for deepfake resilience: banking voice authentication, automated financial approval systems triggered by voice command, and video-based identity verification. The assessment should not be an internal review – it should be a penetration test using currently available deepfake tools.
Detection tools for audio and video deepfakes – Sensity AI (now Clarity), Truepic, Resemble Detect – use technical analysis to identify markers of AI synthesis. These tools are useful for post-incident forensic analysis and for validating suspicious media, but they are not real-time defences against live fraud calls. The reliability of detection tools varies with the quality of the synthetic media; high-quality deepfakes are more difficult to detect algorithmically.
Encrypted communications platforms with identity verification (Signal, Wickr Enterprise, Microsoft Teams with verified identity) reduce the attack surface compared to unencrypted calls and open video conferencing. The attacker needs an unencrypted channel that allows a synthetic voice or video to be presented convincingly; secure platforms with session validation make this harder.
Procedural Controls: The Most Reliable Layer
The most reliable defence against deepfake fraud is not technical – it is procedural. A mandatory out-of-band verification protocol for any unusual financial instruction removes the attack vector regardless of how convincing the deepfake is.
The protocol works as follows. Any request for a financial transfer above a defined threshold that arrives outside normal authorisation channels – or any request that is framed as unusual, urgent, and confidential – must be verified by calling the requester on a known, pre-established direct number. Not a number provided in the suspicious communication. Not a callback to the inbound call. A number from the corporate directory that is known to be correct.
This protocol must be documented, communicated to all finance and operations staff with payment authority, and tested. Periodic simulated deepfake attempts – a staff training exercise using voice cloning tools, with known test parameters and debrief – are a practical way to validate whether the protocol is actually being followed.
The second control is dual authorisation: any transfer above a material threshold requires a second approver who is not the person who received the instruction. This is already best practice for standard financial controls. It applies equally to deepfake scenarios, because it is significantly harder to impersonate two individuals convincingly in the same operation.
The third control is a cooling-off provision for urgent requests: a mandatory delay of a specified time (commonly fifteen to thirty minutes) for any transfer framed as urgent by an inbound communication. This creates space for verification. Most deepfake frauds exploit urgency as a social engineering mechanism – a procedural delay breaks that mechanism.
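The three procedural controls above can be sketched as a single payment-approval check. This is a minimal illustration, not a payments system: the thresholds, the thirty-minute delay, and all field names are assumptions standing in for values that should come from your own financial control policy.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Illustrative policy values; set these from your own financial controls.
VERIFICATION_THRESHOLD = 10_000   # out-of-band verification required above this
DUAL_AUTH_THRESHOLD = 50_000      # second, independent approver required above this
COOLING_OFF = timedelta(minutes=30)


@dataclass
class TransferRequest:
    amount: float
    received_at: datetime
    flagged_urgent: bool
    verified_out_of_band: bool = False   # callback to a known directory number
    approvers: set = field(default_factory=set)
    recipient_of_instruction: str = ""   # the staff member who took the call


def may_execute(req: TransferRequest, now: datetime) -> tuple[bool, str]:
    """Apply the three controls in order; return (allowed, reason)."""
    # Control 1: out-of-band verification for any material transfer.
    if req.amount >= VERIFICATION_THRESHOLD and not req.verified_out_of_band:
        return False, "awaiting out-of-band verification"
    # Control 2: dual authorisation by someone who did not take the instruction.
    if req.amount >= DUAL_AUTH_THRESHOLD:
        independent = req.approvers - {req.recipient_of_instruction}
        if not independent:
            return False, "requires a second, independent approver"
    # Control 3: cooling-off delay for anything framed as urgent.
    if req.flagged_urgent and now - req.received_at < COOLING_OFF:
        return False, "cooling-off period for urgent requests not elapsed"
    return True, "controls satisfied"


now = datetime(2024, 5, 1, 9, 0)
req = TransferRequest(amount=75_000, received_at=now, flagged_urgent=True)
print(may_execute(req, now))  # blocked until verified out-of-band
```

Note that the dual-authorisation check deliberately excludes the person who received the instruction: the fraud's pressure is applied to one individual, and the control forces a second person outside that conversation into the loop.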
The Social Engineering Dimension
The technical sophistication of deepfake audio and video is only part of the attack. The other part is social engineering. The attacker designs the conversation to make the target feel unable to question, verify, or delay. Standard social engineering techniques documented in BEC attacks include: creating a sense of personal accountability (“you are the only person I can trust with this”), invoking confidentiality to prevent consultation (“do not discuss this with anyone else yet”), and applying time pressure (“this must be done in the next thirty minutes”).
These techniques work independently of whether the voice is synthetic or genuine. Training finance and operations staff to recognise and resist social engineering pressure is as important as technical authentication upgrades. The NCSC’s guidance on social engineering, and the FBI’s guidance on BEC, both contain practical training frameworks.
An employee who refuses to transfer funds without following the verification protocol – even when the voice sounds exactly like the CEO and the instruction sounds entirely plausible – is the most effective defence in the chain. Building a culture where that refusal is not just permitted but explicitly expected and supported is a management responsibility.
Post-Incident Response
If a deepfake fraud occurs – whether the transfer was made or not – the immediate response matters.
Preserve all evidence before taking any remediation action. The audio recording of the call, the video recording of the conference, the email chain that preceded it, and the bank transfer records must be secured and must not be altered. Report the fraud to Action Fraud, the UK's national fraud and cybercrime reporting centre, and contact the relevant bank immediately. Banks can sometimes execute a recall if the receiving institution is notified within hours. Contact your cyber insurance provider before forensic steps are taken – insurers have specific preservation requirements.
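One practical way to demonstrate that preserved evidence has not been altered is to record cryptographic hashes of the files at the moment of preservation. The sketch below is a minimal illustration of that idea, under the assumption that a hash manifest stored separately from the evidence is acceptable to your insurer and investigators; it is not a substitute for a specialist forensics firm's chain-of-custody process.

```python
import hashlib
import os
from datetime import datetime, timezone


def hash_file(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large recordings fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(paths: list[str]) -> dict:
    """Record hashes and sizes at preservation time. Store the manifest
    separately from the evidence; any later change to a file will show
    as a hash mismatch against this record."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "files": [
            {"path": p, "sha256": hash_file(p), "bytes": os.path.getsize(p)}
            for p in paths
        ],
    }
```

The manifest itself should be timestamped and held somewhere the incident team cannot modify, so it can later corroborate that the recordings handed to forensics are the ones captured on the day.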
The forensic analysis of deepfake media can identify technical markers of AI synthesis that support the criminal investigation. These markers include specific artefacts in audio waveforms associated with text-to-speech synthesis, visual inconsistencies in video around hairlines, teeth, and background edges, and metadata anomalies in video files. A specialist digital forensics firm should conduct this analysis.
For the broader digital security context for executives, see the guide to executive digital security on international travel. The insider threat dimension – employees who may assist fraud by providing insider information about financial processes or personnel – is addressed in insider threat and corporate security.
Key takeaways
Out-of-band verification is the single most effective defence
No technical control prevents a convincing deepfake from being created. The operational defence is procedural: require independent verification of any unusual financial instruction through a separate, pre-established communications channel. This means calling a known direct number – not a number provided in the suspicious communication – to confirm the instruction. This protocol must be mandatory, documented, and tested regularly. A surprised finance officer acting on a plausible CEO video call is the fraud's attack vector.
Public audio and video exposure is the attacker's source material
Every recorded conference talk, press interview, earnings call, and investor day appearance that an executive has made is potential training data for a voice or face clone. This does not mean executives should never speak publicly. It means their public communications profile should be factored into their threat assessment, and that particularly sensitive communications – board calls, financial approval instructions, HR decisions – should occur in channels with strong verification protocols, not open video conferencing platforms.
Authentication systems must be deepfake-resilient
Any authentication system that could be reached by a synthetic voice or image – voice banking authentication, video-based identity verification, automated approval systems triggered by voice command – should be assessed for deepfake resilience. The key capability to look for is liveness detection: does the system challenge the user in a way that distinguishes a live person from a playback? NCSC guidance on authentication hardening is a useful reference. Biometric-only systems without liveness detection should be supplemented with additional factors.
The fraud uses social engineering, not just technology
Deepfake audio and video are most effective when combined with context that makes the instruction plausible. Criminals research the company before the call: they may know the names of senior staff, the current financial situation, and even specifics from publicly available information. The video or call is designed to seem urgent, unusual, and confidential – classic social engineering pressure tactics. Training that teaches finance teams to recognise social engineering pressure, independent of the technical deepfake element, addresses the human dimension of the attack.
Post-incident response must include forensic evidence preservation
If a deepfake fraud attempt occurs – successful or not – the audio or video recording should be preserved as forensic evidence. Do not delete it. Notify Action Fraud (UK) and the relevant bank immediately. Contact your cyber insurer before taking any remediation action that might alter evidence. The forensic analysis of deepfake media can establish technical markers of AI generation that support the criminal investigation and, critically, the cyber insurance claim.
