Acoustic fingerprinting via Meyda.js + immutable ownership on Polygon. Your voice, cryptographically yours.
Upload or record audio. Meyda.js extracts MFCC features locally and generates a SHA-256 voice fingerprint.
→ Connect MetaMask and register your hash on Polygon Amoy. Ownership is permanently recorded in the smart contract.
→ Create on-chain licenses with revenue share and expiry. Anyone can verify usage rights against the ledger for free.
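A license check of this kind can be sketched off-chain in a few lines. The license shape (`shareBps`, `expiry`) and both function names are hypothetical, mirroring a plausible on-chain struct rather than this project's actual contract:

```javascript
// Hypothetical license: revenue share in basis points (10000 bps = 100%)
// and a Unix-seconds expiry timestamp.
function isLicenseValid(license, nowSeconds) {
  return license.expiry > nowSeconds;
}

// Split a payment (in wei, as BigInt) between voice owner and licensee.
function splitRevenue(license, amountWei) {
  const owner = (amountWei * BigInt(license.shareBps)) / 10000n;
  return { owner, licensee: amountWei - owner };
}

const license = { shareBps: 1500, expiry: 1893456000 }; // 15% share, expires 2030-01-01
const { owner, licensee } = splitRevenue(license, 1000000000000000000n); // 1 MATIC in wei
console.log(owner); // 150000000000000000n (0.15 MATIC to the voice owner)
```

Basis points plus BigInt arithmetic is the conventional way to express percentage splits in EVM-adjacent code, since Solidity has no floating point.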
MFCC ×13 + spectral centroid + RMS energy. Your voice's acoustic fingerprint, not just a file hash.
Real contract at 0xeb848f213F... on Polygon Amoy. Every transaction is verifiable.
Raw audio never leaves your device. Only the cryptographic fingerprint is stored on-chain.
Compare two recordings and get a cosine similarity score. Detect if audio matches a registered voiceprint.
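The comparison step reduces to cosine similarity between two feature vectors: 1.0 means identical direction, 0 means unrelated. A minimal sketch (the toy vectors are illustrative, truncated from the real 15-dimensional features):

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|)
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const take1 = [12.1, -3.4, 5.0, 1.2];
const take2 = [11.9, -3.6, 5.2, 1.1]; // same speaker, slightly different take
console.log(cosineSimilarity(take1, take2).toFixed(3)); // close to 1.000
```

Cosine similarity is a good fit here because it compares the *shape* of the feature vector while being insensitive to overall magnitude, which varies with recording level.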
This is a proof-of-concept demo for educational purposes only. It runs on Polygon Amoy testnet (not mainnet) using test tokens with no real monetary value. It should not be used for actual voice rights management or legal ownership claims.
What MFCC can do: Two recordings of the same person saying the same phrase produce similar MFCC vectors, enabling meaningful similarity comparison. The feature vector captures the acoustic shape of a voice across 13 dimensions.
What MFCC cannot do: The SHA-256 hash of MFCC features is sensitive to recording conditions. A different microphone, background noise, or speaking distance will produce a different hash for the same speaker. Our hash is closer to a "session fingerprint" than a universal voice identity.
Production-grade voice biometrics would require a deep speaker embedding model (x-vector or d-vector architecture) trained on thousands of speakers — technically feasible via TensorFlow.js but beyond this demo's scope.
As AI voice cloning becomes trivially accessible, creators need a trustless way to prove voice ownership. Voiceprint Auth uses acoustic feature extraction and blockchain immutability to establish verifiable voice identity without storing any raw audio.
MFCC (Mel-Frequency Cepstral Coefficients) captures the acoustic shape of a voice by mapping its frequency spectrum to the Mel scale — mimicking human hearing perception — and extracting 13 coefficients that describe the spectral character. These 13 numbers are a compact mathematical signature of what your voice sounds like.
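The Mel mapping itself is one line. This is the standard O'Shaughnessy formula used by most MFCC implementations, mel = 2595 · log₁₀(1 + f / 700):

```javascript
// Convert a frequency in Hz to the Mel scale.
const hzToMel = f => 2595 * Math.log10(1 + f / 700);

// The scale is roughly linear below ~1 kHz and logarithmic above,
// compressing high frequencies the way human pitch perception does:
console.log(Math.round(hzToMel(1000))); // ≈ 1000 mel
console.log(Math.round(hzToMel(8000))); // ≈ 2840 mel -- 8x the Hz, under 3x the mel
```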
A raw SHA-256 of an audio file changes if even one bit changes — useless for voice comparison. Meyda.js extracts acoustic features (MFCC ×13, spectral centroid, RMS energy) that capture the character of a voice, not the exact bytes. We analyze hundreds of frames, average them, then hash the result.
Audio is processed entirely in the browser via Web Audio API + Meyda.js. No audio data is ever transmitted to any server. Only the cryptographic fingerprint — a 64-character hex string — is written to the blockchain.