Detecting AI Voice Clones: Spectral Analysis Techniques
2023-11-02 · 8 min read
Modern voice cloning tools produce highly realistic speech, but they are not perfect. They often leave behind subtle spectral artifacts that the right analysis tools can detect.
Spectral Artifacts
When analyzing the spectrograms of synthetic audio, we often observe:
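- High-frequency cutoff: Many models struggle to generate realistic high-frequency content, so energy can drop off sharply above some frequency.
- Checkerboard artifacts: Regular, grid-like patterns resulting from transposed-convolution (deconvolution) upsampling layers in neural vocoders.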
- Phase inconsistencies: The phase information in generated audio is often less coherent than in natural speech.
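One simple way to look for a high-frequency cutoff is to estimate the spectral rolloff: the frequency below which most of the signal's energy lies. The sketch below is a minimal illustration using only NumPy, not a method taken from this article. The function name `spectral_rolloff_hz`, the 99% energy fraction, and the synthetic test signals are all assumptions chosen for demonstration.

```python
import numpy as np

def spectral_rolloff_hz(signal, sample_rate, energy_fraction=0.99):
    """Return the frequency (Hz) below which `energy_fraction` of the energy lies."""
    # Apply a Hann window to reduce spectral leakage before the real FFT.
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    cumulative = np.cumsum(power)
    threshold = energy_fraction * cumulative[-1]
    # First bin at which the cumulative energy reaches the threshold.
    idx = int(np.searchsorted(cumulative, threshold))
    return float(freqs[min(idx, len(freqs) - 1)])

# Synthetic illustration (assumed data, not real recordings):
# white noise versus the same noise low-passed at 8 kHz.
sample_rate = 44100
rng = np.random.default_rng(0)
noise = rng.standard_normal(sample_rate)      # one second of white noise
spectrum = np.fft.rfft(noise)
freqs = np.fft.rfftfreq(len(noise), d=1.0 / sample_rate)
spectrum[freqs > 8000.0] = 0.0                # crude brick-wall low-pass
band_limited = np.fft.irfft(spectrum, n=len(noise))

full_rolloff = spectral_rolloff_hz(noise, sample_rate)
limited_rolloff = spectral_rolloff_hz(band_limited, sample_rate)
print(f"full-band rolloff:    {full_rolloff:.0f} Hz")
print(f"band-limited rolloff: {limited_rolloff:.0f} Hz")
```

A rolloff well below the Nyquist frequency is only a hint. Lossy codecs and telephone-band recordings also remove high frequencies, so this measure is best compared against genuine recordings of similar quality.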
Tools for Analysis
- Mel-spectrograms: Visualize frequency content over time on the perceptually motivated mel scale.
- Constant-Q Transform (CQT): Uses logarithmically spaced frequency bins, which makes it useful for analyzing musical or tonal content.
- Bispectral analysis: Detects non-linearities introduced by the generation process.
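To make the first of these tools concrete, here is a minimal sketch of a log-mel spectrogram built from a short-time Fourier transform and a triangular mel filterbank, using only NumPy. The helper names, the frame size of 1024, the hop of 256, and the 64 mel bands are illustrative assumptions. In practice a dedicated audio library would normally be used, and its filterbank normalization may differ from this one.

```python
import numpy as np

def hz_to_mel(hz):
    # Common HTK-style mel formula.
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced in mel, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, len(fft_freqs)))
    for i in range(n_mels):
        lo, center, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        # Triangle: the smaller of the two slopes, clipped at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def log_mel_spectrogram(signal, sample_rate, n_fft=1024, hop=256, n_mels=64):
    """Return a (n_mels, n_frames) log-mel power spectrogram in dB."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_fft // 2 + 1)
    mel_power = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    return 10.0 * np.log10(mel_power.T + 1e-10)       # small offset avoids log(0)

# Synthetic illustration: one second of a 1 kHz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
mel_db = log_mel_spectrogram(np.sin(2 * np.pi * 1000.0 * t), sr)
print(mel_db.shape)
```

The resulting array can be passed to any image plotting routine. Checkerboard artifacts, where present, would tend to appear as regular, repeating patterns in such a plot.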
Case Study: ElevenLabs
We analyzed 100 samples generated by ElevenLabs and found distinct patterns in the 8 kHz to 10 kHz range that are absent from real human speech recordings of similar quality.
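A band-focused comparison like this can be approximated by measuring what fraction of a clip's energy falls between 8 kHz and 10 kHz. The sketch below is a hypothetical illustration, not the procedure used for the case study. The function name `band_energy_ratio` and the synthetic tones are assumptions, and real clips would need a sample rate above 20 kHz for this band to exist at all.

```python
import numpy as np

def band_energy_ratio(signal, sample_rate, low_hz=8000.0, high_hz=10000.0):
    """Return the fraction of spectral energy between low_hz and high_hz."""
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    in_band = (freqs >= low_hz) & (freqs <= high_hz)
    total = power.sum()
    return float(power[in_band].sum() / total) if total > 0 else 0.0

# Synthetic illustration (assumed signals, not ElevenLabs output):
# a 1 kHz tone alone, and the same tone plus an equal-amplitude 9 kHz tone.
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
low_only = np.sin(2 * np.pi * 1000.0 * t)
with_band = low_only + np.sin(2 * np.pi * 9000.0 * t)

ratio_low_only = band_energy_ratio(low_only, sample_rate)
ratio_with_band = band_energy_ratio(with_band, sample_rate)
print(f"1 kHz only:    {ratio_low_only:.4f}")
print(f"1 kHz + 9 kHz: {ratio_with_band:.4f}")
```

Computing this ratio for a set of synthetic clips and a matched set of genuine recordings gives two distributions that can be compared directly. A single clip's value on its own says little.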