Fake Voice Detection Service
Detect Fake Voices
with Scientific Precision.
BR SYSTEMS' VoiceGuard Analytics combines machine learning and deep learning in a multi-stage approach to detect AI-synthesized voices with high accuracy. Simply send us your audio files by email to receive a detailed analysis report.
Threats from AI Voice Synthesis
Advanced TTS systems such as XTTS v2 and VALL-E enable anyone to create highly realistic synthetic voices that convincingly imitate real humans.
Voice Impersonation
Attacks using synthesized voices of specific individuals to bypass identity verification and authentication systems are becoming a real threat.
Audio Deepfakes
Fake audio of politicians and public figures is spreading, causing serious damage to social credibility and reputation.
Phone & Business Fraud
Fraudulent calls impersonating family members or supervisors are increasing, making it difficult to distinguish them from genuine voices.
Content Authenticity
Verifying the authenticity of interviews, testimonies, and recordings has become increasingly difficult, undermining legal and journalistic trust.
Two Analysis Services
Choose between a universal model and a speaker-specific model based on your use case and accuracy requirements.
Universal Fake Voice Detection
No speaker information required — immediate analysis available. A general-purpose model supporting multiple speakers and TTS engines analyzes whether submitted audio is synthesized.
- No speaker registration — same-day analysis available
- 197-dimensional acoustic features + GradientBoosting
- Dual judgment with RawNet2 deep learning model
- Detailed report including ROC-AUC, EER, and confidence scores
- Batch analysis (multiple files at once) supported
Personalized Fake Voice Detection
Pre-register voice samples of the target speaker to build a speaker-specific high-accuracy model. Particularly effective for impersonation detection.
- Pre-register authentic voice samples (Real)
- Achieve extremely high accuracy with speaker-specific model
- Speaker verification via ECAPA-TDNN embeddings
- Includes Threshold Analysis detailed report
- Continuous model update option available
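The speaker-verification step in this service can be illustrated as cosine scoring over fixed-size speaker embeddings. The sketch below is illustrative only: random vectors stand in for real ECAPA-TDNN embeddings, and the function names (`enroll_speaker`, `verify`) and the 0.5 threshold are hypothetical, not our production API.

```python
import numpy as np

def enroll_speaker(embeddings: list) -> np.ndarray:
    """Average the enrollment embeddings into one speaker profile.

    In the real pipeline these vectors would come from an ECAPA-TDNN
    encoder; here they are just fixed-size NumPy arrays.
    """
    profile = np.mean(np.stack(embeddings), axis=0)
    return profile / np.linalg.norm(profile)

def verify(profile: np.ndarray, test_embedding: np.ndarray,
           threshold: float = 0.5):
    """Cosine similarity between the profile and a test-utterance embedding."""
    e = test_embedding / np.linalg.norm(test_embedding)
    score = float(np.dot(profile, e))
    return score, score >= threshold

# Toy demonstration with random 192-dim vectors (a typical ECAPA-TDNN size).
rng = np.random.default_rng(0)
enroll = [rng.normal(size=192) for _ in range(3)]
profile = enroll_speaker(enroll)
score_same, accepted = verify(profile, enroll[0])
```

In the real service the accept/reject threshold is calibrated per speaker from the registered samples rather than fixed.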
Completed in 4 Steps
Simply send your audio files by email to receive a comprehensive analysis report.
Inquiry
Tell us about the audio to be analyzed, the number of files, and the purpose of the analysis. We will provide a quote the same day.

Send Audio Files
Send the target audio files (WAV recommended) by email.
Multi-Stage Analysis
Precision analysis using 197-dimensional acoustic features and RawNet2 deep learning.
Report Delivery
Comprehensive report including ROC curves, AUC, feature importance, and judgment rationale.
Analysis Technology Overview
A multi-stage approach combining machine learning and deep learning achieves high-accuracy judgments.
Feature-Based Analysis
Extracts 197-dimensional acoustic features and classifies using GradientBoosting. Multi-dimensional feature engineering combining MFCC, LFCC, CQCC, Group Delay, and Mel statistics achieves high explainability.
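As an illustration of this stage, here is a minimal sketch of feature-based classification with scikit-learn's GradientBoostingClassifier. The synthetic 197-dimensional vectors stand in for the real MFCC/LFCC/CQCC/Group Delay/Mel statistics; the sample counts and hyperparameters are illustrative, not those of the production model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Stand-in for 197-dimensional acoustic feature vectors. A small
# class-dependent shift gives the classifier something to learn.
rng = np.random.default_rng(42)
n, dim = 400, 197
X_real = rng.normal(loc=0.0, size=(n, dim))
X_fake = rng.normal(loc=0.3, size=(n, dim))
X = np.vstack([X_real, X_fake])
y = np.array([0] * n + [1] * n)  # 0 = real, 1 = synthetic

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
clf = GradientBoostingClassifier(n_estimators=50, random_state=0)
clf.fit(X_tr, y_tr)

# Synthetic-probability score per file, plus per-feature importances of
# the kind that back the "Feature Importance" section of the report.
proba = clf.predict_proba(X_te)[:, 1]
importances = clf.feature_importances_
```

The per-feature importance vector is what makes this stage highly explainable: each judgment can be traced back to specific acoustic measurements.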
Deep Learning Model (RawNet2 Official)
An end-to-end neural network that takes raw waveforms directly as input. The SincConv + Channel Attention + ResBlocks + GRU architecture learns subtle voice characteristics that feature-based approaches cannot capture. Occlusion Sensitivity visualization shows which time-frequency regions contributed to the judgment.
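SincConv's core idea is a first convolutional layer whose kernels are windowed sinc band-pass filters, each parameterized by only two learnable cutoff frequencies. A minimal NumPy illustration of such a filterbank applied to a raw waveform (the filter length, window, and cutoff values below are illustrative, not RawNet2's actual parameters):

```python
import numpy as np

def sinc_bandpass(f_low, f_high, length=101, sr=16000):
    """Band-pass FIR kernel built as the difference of two windowed
    sinc low-pass filters; only f_low and f_high would be learned."""
    t = np.arange(length) - (length - 1) / 2
    def lowpass(fc):
        return 2 * fc / sr * np.sinc(2 * fc / sr * t)
    return (lowpass(f_high) - lowpass(f_low)) * np.hamming(length)

# A tiny three-filter bank applied directly to a raw waveform,
# mimicking RawNet2's first layer (the real model uses more filters).
sr = 16000
wave = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)  # 1 s of a 440 Hz tone
bank = [sinc_bandpass(lo, lo + 400) for lo in (100, 500, 2000)]
outputs = [np.convolve(wave, h, mode="same") for h in bank]
energies = [float(np.mean(o ** 2)) for o in outputs]
```

The 100-400 Hz filter passes the 440 Hz tone almost unchanged, while the 2000-2400 Hz filter suppresses it, which is exactly the frequency selectivity the learned layer exploits on speech.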
Research & Publication
Peer-Reviewed Paper in Preparation
This system has been developed as academic research, with continuous performance validation on the world-standard benchmark ASVspoof 2019 LA. Using the official RawNet2 implementation (Tak et al., ICASSP 2021), we achieved EER = 4.487%, min t-DCF = 0.12352 (ASVspoof 2019 LA evaluation set, 71,237 samples), significantly outperforming the official baseline (LFCC+GMM EER≈8%). We are currently preparing a manuscript for submission to an international peer-reviewed journal (IEEE Access). This is a unique research program jointly developing a Japanese neural TTS system (BR-TTS NNW) and its corresponding fake voice detector (BR-FVD) within a unified framework.
Validation highlights: 7-speaker validation; EER = 4.487% (Equal Error Rate, ASVspoof 2019 LA); RawNet2 official implementation.
Analysis Report Contents
We provide detailed reports that scientifically visualize the basis for judgments, not just a simple yes/no answer.
ROC Curve & AUC
Receiver Operating Characteristic curve. Quantifies model discrimination performance using AUC and EER.
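Both report metrics can be computed from per-file scores in a few lines. The sketch below uses synthetic Gaussian score distributions purely for illustration; the EER is read off as the ROC operating point where the false-positive and false-negative rates meet.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Toy scores: higher = more likely synthetic. Labels: 1 = synthetic.
rng = np.random.default_rng(1)
scores_real = rng.normal(loc=-1.0, size=500)
scores_fake = rng.normal(loc=1.0, size=500)
scores = np.concatenate([scores_real, scores_fake])
labels = np.concatenate([np.zeros(500), np.ones(500)])

auc = roc_auc_score(labels, scores)

# EER: the threshold where false-positive rate == false-negative rate.
fpr, tpr, _ = roc_curve(labels, scores)
fnr = 1 - tpr
idx = np.nanargmin(np.abs(fpr - fnr))
eer = (fpr[idx] + fnr[idx]) / 2
```

In the delivered report the same curve is plotted, so the trade-off between missed fakes and false alarms is visible at every threshold, not just at the EER point.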
Score Distribution
Visualization of Real/Synthetic score distributions. Intuitively shows the degree of separation between the two classes.
Feature Importance
Importance ranking of 197-dimensional features. Shows which acoustic features served as the basis for the judgment.
Threshold Analysis
Detailed threshold-based classification. Includes individual indicator evaluations for Jitter, Shimmer, Spectral features, etc.
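Jitter and shimmer, two of the indicators evaluated here, measure cycle-to-cycle variation in pitch period and amplitude. A minimal sketch using their common "local" definitions (the toy pitch periods and amplitudes below are synthetic):

```python
import numpy as np

def jitter_local(periods: np.ndarray) -> float:
    """Local jitter: mean absolute difference between consecutive
    pitch periods, normalized by the mean period."""
    return float(np.mean(np.abs(np.diff(periods))) / np.mean(periods))

def shimmer_local(amplitudes: np.ndarray) -> float:
    """Local shimmer: the same measure applied to peak amplitudes."""
    return float(np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes))

# Natural voices show small cycle-to-cycle variation; overly "clean"
# synthetic speech often shows less. Toy pitch periods around 5 ms.
rng = np.random.default_rng(7)
periods = 0.005 + rng.normal(scale=0.00005, size=200)  # ~1% perturbation
amps = 1.0 + rng.normal(scale=0.03, size=200)

j = jitter_local(periods)
s = shimmer_local(amps)
```

Per-indicator values like these are compared against thresholds in the report, so a judgment can be attributed to, say, implausibly low jitter rather than to an opaque model score alone.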
RawNet2 Deep Score + Occlusion
The deep learning model's synthetic-probability score, accompanied by a four-panel Mel Spectrogram × Occlusion Sensitivity visualization showing which time-frequency regions contributed to the judgment.
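The occlusion-sensitivity idea itself is simple to sketch: mask one time-frequency patch of the spectrogram at a time and record how much the model score drops. The `toy_score` function below is a stand-in for the real RawNet2 scorer, and the patch size and spectrogram shape are illustrative.

```python
import numpy as np

def occlusion_map(spec, score_fn, patch=(8, 8), fill=0.0):
    """Slide a masking patch over a (freq, time) spectrogram and record
    the score drop when each region is hidden."""
    base = score_fn(spec)
    H, W = spec.shape
    ph, pw = patch
    heat = np.zeros((H // ph, W // pw))
    for i in range(H // ph):
        for j in range(W // pw):
            occluded = spec.copy()
            occluded[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw] = fill
            heat[i, j] = base - score_fn(occluded)
    return heat

# Toy "model": the score is just the energy in the upper frequency
# bands, so occluding those bands should dominate the heat map.
def toy_score(spec):
    return float(spec[40:, :].mean())

spec = np.abs(np.random.default_rng(3).normal(size=(80, 64)))
heat = occlusion_map(spec, toy_score, patch=(8, 8))
```

Regions with large heat-map values are the ones the model relied on, which is what the four-panel visualization in the report overlays on the Mel spectrogram.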
Summary CSV
A CSV report listing synthetic probability scores, predicted labels, and confidence levels for each file.
Simple Pricing
We offer flexible plans tailored to the number of files and use case. Please feel free to contact us for a consultation.
- Universal FVD analysis
- ROC & Score Distribution report
- Summary CSV
- Delivery: within 3 business days
- Universal FVD batch analysis
- RawNet2 deep learning judgment
- Full report set (6 types)
- Feature Importance analysis
- Delivery: within 5 business days
- Speaker-specific model construction
- Personalized FVD analysis
- Full report set (6 types)
- Continuous model update option
- Delivery: upon consultation
Frequently Asked Questions
Contact Us
For service inquiries or quote requests, please feel free to reach out. We typically respond within one business day.
info@brsystems.jp