nixpkgs/docs/packages/stt-ptt.md
2026-01-02 12:24:48 +01:00


stt-ptt

Push to Talk Speech to Text using Whisper.

Description

stt-ptt is a simple push-to-talk speech-to-text tool that uses whisper.cpp for transcription. It records audio via PipeWire, transcribes it using a local Whisper model, and types the result using wtype (Wayland).

Features

  • Push to Talk: Start/stop recording with simple commands
  • Local Processing: Uses whisper.cpp for fast, offline transcription
  • Wayland Native: Types transcribed text using wtype
  • Configurable: Model path and notification timeout via environment variables
  • Lightweight: Minimal dependencies, no cloud services

Installation

See stt-ptt Home Manager Module for the recommended setup with automatic model download.

Via Overlay

{pkgs, ...}: {
  home.packages = [pkgs.stt-ptt];
}

Direct Reference

{pkgs, inputs, ...}: {
  home.packages = [
    # `inputs` must be passed to your modules (e.g. via extraSpecialArgs)
    inputs.m3ta-nixpkgs.packages.${pkgs.stdenv.hostPlatform.system}.stt-ptt
  ];
}

Usage

Basic Commands

# Start recording
stt-ptt start

# Stop recording and transcribe
stt-ptt stop
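Where a separate release binding is not available, the two commands above can be combined into a single toggle. The following is a minimal sketch and not part of stt-ptt: the function name `stt_toggle`, the `STT_CMD` override, and the marker file `STT_MARKER` are all hypothetical names chosen for illustration. The marker only records what this wrapper did; it does not inspect stt-ptt's own state.

```shell
#!/bin/sh
# Hypothetical toggle wrapper around the two documented commands.
# STT_CMD and STT_MARKER are illustrative names, not stt-ptt settings.
STT_CMD="${STT_CMD:-stt-ptt}"
STT_MARKER="${STT_MARKER:-${XDG_RUNTIME_DIR:-/tmp}/stt-ptt-toggle.marker}"

stt_toggle() {
  if [ -e "$STT_MARKER" ]; then
    # A previous call started a recording: stop and transcribe.
    rm -f "$STT_MARKER"
    "$STT_CMD" stop
  else
    # No recording started by this wrapper yet: create the marker, then start.
    : > "$STT_MARKER"
    "$STT_CMD" start || { rm -f "$STT_MARKER"; return 1; }
  fi
}
```

Calling `stt_toggle` once starts a recording and calling it again stops and transcribes, so a script that ends with `stt_toggle` can be bound to a single key press.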

Keybinding Setup

stt-ptt is designed to be bound to a single key: press and hold to record, release to transcribe.

Hyprland

# In your Home Manager configuration
wayland.windowManager.hyprland.settings = {
  bind = [
    # Press Super+V to start recording
    "SUPER, V, exec, stt-ptt start"
  ];
  bindr = [
    # Release Super+V to stop and transcribe
    "SUPER, V, exec, stt-ptt stop"
  ];
};

Or in hyprland.conf:

bind = SUPER, V, exec, stt-ptt start
bindr = SUPER, V, exec, stt-ptt stop

Sway

# Hold to record, release to transcribe
bindsym --no-repeat $mod+v exec stt-ptt start
bindsym --release $mod+v exec stt-ptt stop

i3 (X11 - requires xdotool instead of wtype)

Note: stt-ptt types text with wtype, which is Wayland-only, so it does not work under i3 out of the box. To use it on X11, you would need to modify the script to call xdotool instead.

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| STT_MODEL | Path to Whisper model file | ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin |
| STT_NOTIFY_TIMEOUT | Notification timeout in ms | 3000 |
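The fallback behaviour described in the table can be sketched with ordinary shell parameter expansion. This is an illustration only, not the actual stt-ptt source: the function names `stt_model_path` and `stt_notify_timeout` are hypothetical, and the sketch assumes the `~` in the default path refers to `$HOME`.

```shell
#!/bin/sh
# Illustrative only: resolve each setting to the environment value,
# or to the documented default when the variable is unset or empty.
stt_model_path() {
  printf '%s\n' "${STT_MODEL:-$HOME/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin}"
}

stt_notify_timeout() {
  printf '%s\n' "${STT_NOTIFY_TIMEOUT:-3000}"
}
```

To override a setting, export it in the environment your keybindings inherit, for example `export STT_MODEL="$HOME/.local/share/stt-ptt/models/ggml-base.en.bin"`.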

Requirements

  • whisper-cpp: Speech recognition engine
  • wtype: Wayland text input (Wayland compositor required)
  • libnotify: Desktop notifications
  • pipewire: Audio recording
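When running the script outside the Nix package, a quick check that the required programs are on `PATH` can save some debugging. The helper below is a small sketch; `missing_tools` is a hypothetical name, and which executables stt-ptt actually invokes is an assumption you should verify against the script.

```shell
#!/bin/sh
# Print the name of every given command that is not found on PATH,
# one per line. Prints nothing when all commands are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

A plausible invocation is `missing_tools wtype notify-send pw-record whisper-cli`, assuming the script uses `notify-send` from libnotify, `pw-record` from PipeWire, and the `whisper-cli` binary from whisper.cpp.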

Model Setup

Download a Whisper model from Hugging Face:

# Create model directory
mkdir -p ~/.local/share/stt-ptt/models

# Download model (example: large-v3-turbo)
curl -L -o ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin

Or use the Home Manager module which handles this automatically.
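The manual download above can be generalised to other model names. The sketch below reuses the URL pattern from the command above and assumes that other models in the same repository follow it; `stt_model_url`, `stt_fetch_model`, and `STT_MODEL_DIR` are hypothetical names, not part of stt-ptt.

```shell
#!/bin/sh
# Base URL taken from the documented download command.
STT_MODEL_BASE_URL="https://huggingface.co/ggerganov/whisper.cpp/resolve/main"
# Hypothetical variable for this sketch; stt-ptt itself reads STT_MODEL.
STT_MODEL_DIR="${STT_MODEL_DIR:-$HOME/.local/share/stt-ptt/models}"

# Build the download URL for a model name such as "ggml-base.en".
stt_model_url() {
  printf '%s/%s.bin\n' "$STT_MODEL_BASE_URL" "$1"
}

# Download a model unless the file is already present.
stt_fetch_model() {
  dest="$STT_MODEL_DIR/$1.bin"
  [ -f "$dest" ] && return 0
  mkdir -p "$STT_MODEL_DIR" || return 1
  # -f makes curl fail on HTTP errors; the rename happens only on success.
  tmp="$dest.part"
  curl -fL -o "$tmp" "$(stt_model_url "$1")" && mv "$tmp" "$dest"
}
```

For example, `stt_fetch_model ggml-base.en` followed by pointing `STT_MODEL` at the downloaded file switches to a smaller model.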

Available Models

| Model | Size | Quality | Speed |
|-------|------|---------|-------|
| ggml-tiny / ggml-tiny.en | 75 MB | Basic | Fastest |
| ggml-base / ggml-base.en | 142 MB | Good | Fast |
| ggml-small / ggml-small.en | 466 MB | Better | Medium |
| ggml-medium / ggml-medium.en | 1.5 GB | High | Slower |
| ggml-large-v3-turbo | 1.6 GB | High | Fast |
| ggml-large-v3 | 2.9 GB | Highest | Slowest |

Models ending in .en are English-only and tend to be more accurate on English speech than the multilingual model of the same size, especially at the tiny and base sizes.

Platform Support

  • Linux with Wayland (primary)
  • Requires PipeWire for audio
  • X11 not supported (wtype is Wayland-only)

Build Information

  • Version: 0.1.0
  • Type: Shell script wrapper
  • License: MIT

Troubleshooting

Model Not Found

Error: Error: Model not found at /path/to/model

Solution: Download a model or use the Home Manager module:

mkdir -p ~/.local/share/stt-ptt/models
curl -L -o ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin

No Audio Recorded

Solution: Ensure PipeWire is running:

systemctl --user status pipewire

Text Not Typed

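Solution: Ensure you are in a Wayland session and that your compositor supports the virtual-keyboard protocol that wtype relies on (Sway and Hyprland do):

# Check if running on Wayland
echo "$XDG_SESSION_TYPE"  # Should print "wayland"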
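The same check can be wrapped in a small function for use in scripts. This is a heuristic sketch with a hypothetical name, `stt_check_wayland`. It only inspects environment variables and cannot confirm that the compositor accepts input from wtype. Note also that Wayland clients typically fall back to the socket `wayland-0` when `WAYLAND_DISPLAY` is unset, so the second check is a hint, not proof of a problem.

```shell
#!/bin/sh
# Heuristic environment check. Prints "ok" and returns 0 when the
# session looks like Wayland; otherwise prints a reason and returns 1.
stt_check_wayland() {
  if [ "${XDG_SESSION_TYPE:-}" != "wayland" ]; then
    echo "not a Wayland session (XDG_SESSION_TYPE=${XDG_SESSION_TYPE:-unset})"
    return 1
  fi
  if [ -z "${WAYLAND_DISPLAY:-}" ]; then
    echo "WAYLAND_DISPLAY is not set in this environment"
    return 1
  fi
  echo "ok"
}
```

Running `stt_check_wayland` from the same environment as the keybinding (for example, via a temporary `exec` binding that writes its output to a file) shows whether the variables reach the process that launches stt-ptt.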

Slow Transcription

Solution: Use a smaller model or enable GPU acceleration. With the Home Manager module, select a smaller model:

cli.stt-ptt = {
  enable = true;
  model = "ggml-base.en";  # Smaller, faster model
};

Or with GPU acceleration:

cli.stt-ptt = {
  enable = true;
  # Choose one:
  whisperPackage = pkgs.whisper-cpp-vulkan;  # Vulkan (pre-built)
  # whisperPackage = pkgs.whisper-cpp.override { cudaSupport = true; };  # NVIDIA
  # whisperPackage = pkgs.whisper-cpp.override { rocmSupport = true; };  # AMD
};