Blog — NoxVerify

What Are Injection Attacks in KYC

An injection attack in the context of identity verification occurs when an attacker intercepts the normal capture pipeline and substitutes fraudulent data for genuine camera input. Rather than presenting a real identity document and their own face to the camera, the attacker injects pre-prepared images or video into the verification flow. These attacks bypass the physical reality that KYC verification is designed to confirm.

Injection attacks are a fast-growing category of identity fraud. As liveness detection and document verification improve against physical presentation attacks (such as holding up a printed photo), attackers are shifting to digital injection methods that operate at the software layer. The barrier to entry is low: virtual camera software is freely available, and deepfake generation tools have become accessible to non-technical users.

Attack Vectors

Virtual Camera Spoofing

Virtual camera software such as OBS Virtual Camera, ManyCam, or custom drivers presents itself to the operating system as a legitimate camera device. Any application requesting camera access, including a KYC verification SDK, receives the virtual camera feed instead of physical camera input. The attacker can play pre-recorded video, screen-share another application, or feed synthetic content through the virtual camera.
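
As a rough illustration of the weakest form of virtual camera detection, the sketch below matches reported device labels against a list of known virtual camera names. The marker list and function names are hypothetical, and the approach is easy to defeat: a custom driver can report any name it likes, so a match is at most one signal among many.

```python
# Hypothetical marker list; a real deployment would maintain and update its own.
KNOWN_VIRTUAL_CAMERA_MARKERS = ("obs virtual camera", "manycam", "virtual")


def looks_like_virtual_camera(device_label: str) -> bool:
    """Return True if the label contains a known virtual-camera marker."""
    label = device_label.casefold()
    return any(marker in label for marker in KNOWN_VIRTUAL_CAMERA_MARKERS)


def suspicious_devices(device_labels: list[str]) -> list[str]:
    """Return the subset of labels that match a marker, in input order."""
    return [label for label in device_labels if looks_like_virtual_camera(label)]
```

An empty result does not prove that the camera is physical; it only means that no known name was seen.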

Document Replay Attacks

In a document replay attack, the attacker uses a high-resolution image of a stolen or fraudulently obtained identity document and presents it to the camera. Advanced variants display the document image on a high-DPI screen, which can fool basic document capture systems that do not test for screen recapture artifacts.

Deepfake Selfies

Deepfake technology allows attackers to generate realistic facial videos of another person. Using the photo on the identity document as a reference, an attacker can create a synthetic video that matches it, then inject that video via a virtual camera. Advanced deepfakes can respond to liveness challenges in real time, performing head turns and blinks on demand.

Screen Recapture

The attacker displays a document or selfie on one screen and captures it with the device camera pointed at that screen. This is a hybrid physical-digital attack that can defeat virtual camera detection while still using fraudulent content.

Injection attacks can be combined in sophisticated multi-stage attacks. An attacker might use a physical document replica for the document capture step and a deepfake injected via virtual camera for the selfie step, requiring your defenses to detect different attack types at different stages of the flow.

Defense Layers

Effective defense against injection attacks requires multiple independent detection layers. No single technique is sufficient because attackers will find and exploit weaknesses in any individual defense. The principle of defense-in-depth applies: each layer is designed to catch attacks that slip through the previous layer.

Device Integrity Verification

The first defense layer verifies that the capture environment is trustworthy. On mobile platforms, this includes checking for rooting or jailbreaking (which enables camera API hooking), verifying the application's code integrity against tampering, detecting instrumentation frameworks such as Frida or Xposed that can intercept SDK function calls, and confirming that the camera API returns data from a physical hardware sensor rather than a virtual source.

Document Liveness

Document liveness verification confirms that the identity document is a physical object present in front of the camera, not a screen display or a printed copy of a digital image. Techniques include: analyzing the moiré interference patterns that appear when a camera captures a screen; detecting the characteristic pixel grid of LCD/OLED displays; analyzing light reflection patterns, which differ between plastic/polycarbonate documents and flat screens; and asking the user to tilt the document, which produces different hologram and security feature responses on genuine documents.

Face Liveness and Anti-Spoofing

Face liveness detection, as discussed in our biometric verification article, must be robust against both physical and digital presentation attacks. Illumination-based challenges where the device screen projects colored light patterns onto the face provide strong defense because the reflected light pattern is extremely difficult to reproduce in a deepfake without knowledge of the specific challenge sequence. Server-side liveness evaluation prevents client-side bypass of liveness checks.
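
The server side of an illumination challenge could look roughly like the sketch below. It assumes a separate image-analysis step (not shown) has already classified the dominant reflected color for each step of the sequence. The palette, function names, and the 0.8 match threshold are illustrative assumptions, not values from any particular product.

```python
import secrets

# Hypothetical palette of colors the device screen can flash.
COLORS = ("red", "green", "blue", "yellow")


def new_challenge(length: int = 6) -> list[str]:
    """Generate an unpredictable color sequence for one verification session."""
    return [secrets.choice(COLORS) for _ in range(length)]


def challenge_matches(expected: list[str], observed: list[str],
                      min_ratio: float = 0.8) -> bool:
    """Compare the issued sequence with the colors observed on the face.

    A length mismatch or an empty challenge fails outright; otherwise the
    share of matching steps must reach min_ratio (an assumed threshold).
    """
    if not expected or len(expected) != len(observed):
        return False
    hits = sum(e == o for e, o in zip(expected, observed))
    return hits / len(expected) >= min_ratio
```

Because the sequence comes from `secrets` and is generated per session, a pre-rendered video cannot know it in advance, which is the property the paragraph above relies on.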

Frame Hash Chains

A frame hash chain links each captured video frame to its predecessor through cryptographic hashing, creating a tamper-evident sequence. If an attacker injects frames from a different source mid-stream, the hash chain breaks. The hash chain is initialized with a server-provided nonce, binding the capture session to a specific verification request and preventing replay of previously captured sessions.
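
A minimal sketch of such a chain, assuming SHA-256 and frames available as raw bytes, is shown below. The function names and the exact link construction are assumptions made for illustration. Note that an unkeyed chain computed on a device the attacker controls can simply be recomputed over injected frames, so a production design would likely pair it with a key or attestation that the attacker cannot reach.

```python
import hashlib
import hmac


def chain_frames(nonce: bytes, frames: list[bytes]) -> list[bytes]:
    """Return one link per frame.

    link_0 = SHA-256(SHA-256(nonce) || SHA-256(frame_0))
    link_i = SHA-256(link_{i-1}     || SHA-256(frame_i))
    """
    prev = hashlib.sha256(nonce).digest()
    links = []
    for frame in frames:
        prev = hashlib.sha256(prev + hashlib.sha256(frame).digest()).digest()
        links.append(prev)
    return links


def verify_chain(nonce: bytes, frames: list[bytes], links: list[bytes]) -> bool:
    """Recompute the chain from the nonce and compare it link by link."""
    expected = chain_frames(nonce, frames)
    if len(expected) != len(links):
        return False
    return all(hmac.compare_digest(a, b) for a, b in zip(expected, links))
```

Changing any frame, reordering frames, or reusing the links under a different nonce makes `verify_chain` return `False`, which is the tamper-evidence and session-binding behavior described above.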

Stream Integrity Verification

Beyond individual frame integrity, the entire capture stream must be verified for consistency. This includes analyzing temporal characteristics such as frame rate stability, exposure consistency, and sensor noise patterns that differ between physical cameras and synthetic sources. Motion parallax analysis can detect flat surfaces (screens) by checking whether foreground and background elements shift appropriately with device movement.
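
One of the temporal signals mentioned above, frame rate stability, can be summarized as the relative spread of inter-frame intervals. The sketch below computes that value from capture timestamps in milliseconds. The function name is hypothetical, and what counts as a suspicious value would need calibration against real device data.

```python
import statistics


def frame_interval_jitter(timestamps_ms: list[float]) -> float:
    """Return the coefficient of variation of inter-frame intervals.

    0.0 means perfectly regular spacing; larger values mean more jitter.
    """
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three timestamps")
    mean = statistics.fmean(intervals)
    if mean <= 0:
        raise ValueError("timestamps must be increasing overall")
    return statistics.pstdev(intervals) / mean
```

The value is only meaningful alongside the other signals in this section, such as exposure consistency and sensor noise.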

Metadata analysis provides additional signals: still images from physical cameras typically carry EXIF metadata, including focal length, aperture, and camera model, that can be cross-referenced against known device profiles. Virtual cameras and injected streams may omit or fabricate this metadata in detectable ways.
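
A cross-reference of this kind might be sketched as follows. The profile table, field names, and 5% tolerance are invented for illustration; they do not describe a real device database. The function returns a list of flags, not a verdict, since metadata is easy to forge and should only feed a broader decision.

```python
# Hypothetical device profiles; values are placeholders, not real specifications.
KNOWN_PROFILES = {
    "ExamplePhone X": {"focal_length_mm": 4.2, "f_number": 1.8},
}


def metadata_flags(meta: dict, profiles: dict = KNOWN_PROFILES,
                   tol: float = 0.05) -> list[str]:
    """Compare reported capture metadata with the profile for its model."""
    model = meta.get("model")
    if not model:
        return ["missing_model"]
    profile = profiles.get(model)
    if profile is None:
        return ["unknown_model"]
    flags = []
    for field, expected in profile.items():
        value = meta.get(field)
        if value is None:
            flags.append(f"missing_{field}")
        elif abs(value - expected) > tol * expected:
            flags.append(f"mismatch_{field}")
    return flags
```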

Server-Side Validation

Never rely solely on client-side validation. Any check performed on the user's device can be bypassed by a sufficiently motivated attacker who controls the device. All critical integrity and liveness decisions must be validated server-side against the raw captured data.

Server-side validation is the final arbiter of verification integrity. The server receives the captured frames, metadata, hash chain, and liveness challenge responses, then independently validates: hash chain integrity, liveness challenge response correctness, biometric comparison results, document authenticity signals, and cross-consistency between the document capture and selfie capture sessions.
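
The independent checks listed above can be organized as named validators that run over the received submission. The sketch below assumes each validator is a function that returns a boolean; the names and the shape of the submission are hypothetical. A validator that raises an exception is treated as a failure, so missing or malformed data fails closed.

```python
from typing import Callable, Mapping

Check = Callable[[Mapping], bool]


def run_server_checks(submission: Mapping,
                      checks: Mapping[str, Check]) -> dict[str, bool]:
    """Run every check independently and record pass/fail per check name."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check(submission))
        except Exception:
            # Fail closed: a check that cannot run counts as a failure.
            results[name] = False
    return results


def all_passed(results: Mapping[str, bool]) -> bool:
    """True only if at least one check ran and none failed."""
    return bool(results) and all(results.values())
```

For example, a `checks` mapping might contain entries such as `"hash_chain"` and `"liveness"`, each wrapping the corresponding server-side verifier.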

Defense-in-Depth Architecture

A complete defense-in-depth architecture for KYC verification layers these defenses so that an attacker must simultaneously defeat all layers to succeed:

Layer 1 (Device): Root/jailbreak detection, app integrity verification, virtual camera detection.
Layer 2 (Capture): Document liveness, face liveness with illumination challenge, frame hash chain initialization.
Layer 3 (Transport): Encrypted, certificate-pinned channel with session binding and replay prevention.
Layer 4 (Server): Independent liveness evaluation, biometric comparison, document authenticity analysis, hash chain verification.
Layer 5 (Analytics): Cross-session anomaly detection, device fingerprint clustering, velocity checks on repeated submission attempts.

Each layer operates independently and reports its own confidence signal. The final verification decision considers all layers, and a failure at any single layer is sufficient to flag a submission for review or rejection. This architecture ensures that even as attackers find ways to bypass individual defenses, the overall system maintains its integrity.
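
The decision rule described above, where the weakest layer drives the outcome, can be sketched as follows. It assumes each layer reports a confidence score between 0 and 1; the thresholds and names are illustrative assumptions that a real system would tune.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REVIEW = "review"
    REJECT = "reject"


def decide(layer_scores: dict[str, float],
           reject_below: float = 0.3,
           review_below: float = 0.7) -> Decision:
    """Base the decision on the lowest-scoring layer.

    A single weak layer is enough to trigger review or rejection,
    no matter how confident the other layers are.
    """
    if not layer_scores:
        return Decision.REVIEW  # no signals at all: send to manual review
    worst = min(layer_scores.values())
    if worst < reject_below:
        return Decision.REJECT
    if worst < review_below:
        return Decision.REVIEW
    return Decision.APPROVE
```

Taking the minimum instead of an average means that strong scores in some layers cannot mask a failure in another.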

Securing Your KYC Pipeline Against Injection Attacks