Voice Activity Detection

๐ฏ Voice Activity Detection (VAD), or voice endpoint detection, identifies time segments in an audio signal containing speech. This is a critical preprocessing step for automatic speech recognition (ASR) and voice wake-up systems. This project lays the groundwork for my upcoming ASR project ๐คญ.
๐ Workflow Overview: The VAD pipeline processes a speech signal as follows:Preprocessing, Framing, Windowing, Feature Extraction, Binary Classification, Time-Domain Restoration
๐ป Project Highlights: I conducted extensive experiments comparing frame division methods (frame length and shift) and model performances, with rich visualizations. For details, see the report in ‘vad/latex/’. If you’re interested in voice technologies, let’s connect!
๐ For more details, please visit my blog VAD