Real-time Speech Recognition with Vosk and Zig

This project implements a minimal real-time speech-to-text application using Vosk and Zig.

Setup

Prerequisites

  • Zig 0.15.1 (configured via mise)
  • Nix development environment with C compilation tools, ALSA, and audio libraries
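A quick way to confirm the Zig prerequisite is to compare the active compiler against the version listed above. The following is a minimal sketch, not part of the project: `check_zig` and `REQUIRED_ZIG` are hypothetical names, and the only external command it relies on is `zig version`.

```shell
# Hypothetical helper: verify that the Zig on PATH matches the version this README names.
REQUIRED_ZIG="0.15.1"

check_zig() {
    # Fail early if no zig command is available at all.
    if ! command -v zig >/dev/null 2>&1; then
        echo "zig not found on PATH"
        return 1
    fi
    # `zig version` prints the bare version string.
    actual=$(zig version)
    if [ "$actual" != "$REQUIRED_ZIG" ]; then
        echo "expected Zig $REQUIRED_ZIG, found $actual"
        return 1
    fi
    echo "Zig $actual OK"
}
```

Calling `check_zig` before building prints a one-line result and returns non-zero on a mismatch, so it can gate later steps with `&&`.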

Vosk Model Download

The application uses the Vosk small English model (vosk-model-small-en-us-0.15) for speech recognition. Download and extract it as part of the installation steps.

Installation Steps

  1. Enter the Nix development environment: nix develop
  2. Download the Vosk model: wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
  3. Extract the model: unzip vosk-model-small-en-us-0.15.zip
  4. Build the application: zig build
  5. Run the application: ./zig-out/bin/stt
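The download and extract steps can be made repeatable with a small helper that skips the download when the model is already present. This is a sketch under assumptions: `fetch_model`, `MODEL_NAME`, and `MODEL_URL` are hypothetical names, and the archive is assumed to unpack into a directory named after the zip file.

```shell
# Hypothetical helper for steps 2 and 3: fetch the Vosk model only if it is missing.
MODEL_NAME="vosk-model-small-en-us-0.15"
MODEL_URL="https://alphacephei.com/vosk/models/${MODEL_NAME}.zip"

fetch_model() {
    # Assumption: the zip extracts to a directory with the same base name.
    if [ -d "$MODEL_NAME" ]; then
        echo "model already present: $MODEL_NAME"
        return 0
    fi
    # Download, then extract; stop if the download fails.
    wget "$MODEL_URL" && unzip "${MODEL_NAME}.zip"
}
```

Inside the shell started by `nix develop`, the remaining steps then chain as `fetch_model && zig build && ./zig-out/bin/stt`.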

Usage

The application will:

  • Initialize audio capture from the default microphone
  • Load the Vosk speech recognition model
  • Process audio in real time
  • Print recognized text to the terminal
  • Exit on Ctrl+C

Dependencies

  • Vosk C API library
  • ALSA for audio capture
  • Standard C libraries for audio processing