stt-ptt Home Manager Module
Push-to-talk speech-to-text for Home Manager.
Overview
This module configures stt-ptt, a push-to-talk speech-to-text tool using whisper.cpp. It handles model downloads, environment configuration, and package installation.
Quick Start
{config, ...}: {
  imports = [m3ta-nixpkgs.homeManagerModules.default];
  cli.stt-ptt = {
    enable = true;
  };
}
This will:
- Install stt-ptt with default whisper-cpp
- Download the `ggml-large-v3-turbo` model on first activation
- Set environment variables for the model path, language, and notification timeout
Module Options
cli.stt-ptt.enable
Enable the stt-ptt module.
- Type: `boolean`
- Default: `false`
cli.stt-ptt.whisperPackage
The whisper-cpp package to use for transcription.
- Type: `package`
- Default: `pkgs.whisper-cpp`
Pre-built variants:
# CPU (default)
whisperPackage = pkgs.whisper-cpp;
# Vulkan GPU acceleration (pre-built)
whisperPackage = pkgs.whisper-cpp-vulkan;
Override options (can be combined):
| Option | Description |
|---|---|
| `cudaSupport` | NVIDIA CUDA acceleration |
| `rocmSupport` | AMD ROCm acceleration |
| `vulkanSupport` | Vulkan GPU acceleration |
| `coreMLSupport` | Apple CoreML (macOS only) |
| `metalSupport` | Apple Metal (macOS ARM only) |
# NVIDIA CUDA support
whisperPackage = pkgs.whisper-cpp.override { cudaSupport = true; };
# AMD ROCm support
whisperPackage = pkgs.whisper-cpp.override { rocmSupport = true; };
# Vulkan support (manual override)
whisperPackage = pkgs.whisper-cpp.override { vulkanSupport = true; };
cli.stt-ptt.model
The Whisper model to use. Models are automatically downloaded from HuggingFace on first activation.
- Type: `string`
- Default: `"ggml-large-v3-turbo"`
Available models (sorted by size):
| Model | Size | Notes |
|---|---|---|
| `ggml-tiny` | 75MB | Fastest, lowest quality |
| `ggml-tiny.en` | 75MB | English-only, slightly faster |
| `ggml-base` | 142MB | Fast, basic quality |
| `ggml-base.en` | 142MB | English-only |
| `ggml-small` | 466MB | Balanced speed/quality |
| `ggml-small.en` | 466MB | English-only |
| `ggml-medium` | 1.5GB | Good quality |
| `ggml-medium.en` | 1.5GB | English-only |
| `ggml-large-v1` | 2.9GB | High quality (original) |
| `ggml-large-v2` | 2.9GB | High quality (improved) |
| `ggml-large-v3` | 2.9GB | Highest quality |
| `ggml-large-v3-turbo` | 1.6GB | High quality, optimized speed (recommended) |
Quantized versions (q5_0, q5_1, q8_0) are also available for reduced size.
cli.stt-ptt.notifyTimeout
Notification timeout in milliseconds for the recording indicator.
- Type: `integer`
- Default: `3000`
- Example: `5000` (5 seconds), `0` (persistent)
cli.stt-ptt.language
Language for speech recognition. Use "auto" for automatic language detection, or specify a language code for better accuracy.
- Type: `enum ["auto", "en", "es", "fr", "de", "it", "pt", "ru", "zh", "ja", "ko", "ar", "hi", "tr", "pl", "nl", "sv", "da", "fi", "no", "vi", "th", "id", "uk", "cs"]`
- Default: `"auto"`
Auto-detection: When set to "auto", whisper.cpp analyzes the audio to determine the spoken language automatically.
Language specification: Specifying a language code improves transcription accuracy if you know the language in advance.
# Automatic language detection (default)
language = "auto";
# Force English transcription
language = "en";
# Spanish transcription
language = "es";
Common language codes:
| Code | Language |
|---|---|
| `en` | English |
| `es` | Spanish |
| `fr` | French |
| `de` | German |
| `zh` | Chinese |
| `ja` | Japanese |
| `ko` | Korean |
whisper.cpp itself supports roughly 100 languages, but this option only accepts the codes listed in its type above. See the whisper.cpp documentation for the full list.
Usage
After enabling the module, bind `stt-ptt start` to a key press and `stt-ptt stop` to the release of the same key:
# Start recording
stt-ptt start
# Stop recording and transcribe (types result)
stt-ptt stop
Keybinding Examples
Hyprland
wayland.windowManager.hyprland.settings = {
  bind = [
    "SUPER, V, exec, stt-ptt start"
  ];
  bindr = [
    "SUPER, V, exec, stt-ptt stop"
  ];
};
Or in hyprland.conf:
# Press to start recording, release to transcribe
bind = SUPER, V, exec, stt-ptt start
bindr = SUPER, V, exec, stt-ptt stop
Sway
bindsym --no-repeat $mod+v exec stt-ptt start
bindsym --release $mod+v exec stt-ptt stop
Configuration Examples
Basic Setup
cli.stt-ptt = {
  enable = true;
};
Fast English Transcription
cli.stt-ptt = {
  enable = true;
  model = "ggml-base.en";
  notifyTimeout = 2000;
};
Language-Specific Transcription
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo";
  language = "es"; # Force Spanish transcription
};
High Quality with NVIDIA GPU
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3";
  whisperPackage = pkgs.whisper-cpp.override { cudaSupport = true; };
};
Vulkan GPU Acceleration
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo";
  whisperPackage = pkgs.whisper-cpp-vulkan;
};
AMD GPU with ROCm
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo";
  whisperPackage = pkgs.whisper-cpp.override { rocmSupport = true; };
};
Balanced Setup
cli.stt-ptt = {
  enable = true;
  model = "ggml-small";
  notifyTimeout = 3000;
};
File Locations
| Path | Description |
|---|---|
| `~/.local/share/stt-ptt/models/` | Downloaded Whisper models |
| `~/.cache/stt-ptt/stt.wav` | Temporary audio recording |
| `~/.cache/stt-ptt/stt.pid` | PID file for recording process |
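The PID file listed above can serve as a rough indicator of whether a recording is in progress. The sketch below assumes that `stt.pid` holds the PID of the recorder launched by `stt-ptt start` and that this process exits on `stt-ptt stop`; neither detail is confirmed by this page. The names `stt_is_recording` and `stt_toggle` are hypothetical helpers, not part of stt-ptt.

```bash
# Return 0 if a recording appears to be in progress, judged by the PID file.
# Assumption: stt.pid contains the PID of the running recorder process.
stt_is_recording() {
  local pidfile="$HOME/.cache/stt-ptt/stt.pid"
  local pid
  [ -r "$pidfile" ] || return 1
  pid=$(cat "$pidfile")
  # Reject empty or non-numeric content before handing it to kill.
  case "$pid" in
    '' | *[!0-9]*) return 1 ;;
  esac
  # kill -0 sends no signal; it only tests whether the process exists.
  kill -0 "$pid" 2>/dev/null
}

# Start or stop recording depending on the current state.
stt_toggle() {
  if stt_is_recording; then
    stt-ptt stop
  else
    stt-ptt start
  fi
}
```

A toggle like this may be useful where a compositor cannot bind key-release events, at the cost of pressing the key twice instead of holding it.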
Environment Variables
The module sets these automatically:
| Variable | Value |
|---|---|
| `STT_MODEL` | `~/.local/share/stt-ptt/models/<model>.bin` |
| `STT_LANGUAGE` | Configured language (`"auto"` by default) |
| `STT_NOTIFY_TIMEOUT` | Configured timeout in ms |
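To confirm that the variables reached your session after activation, a small check along the following lines may help. It is a sketch, not part of the module: `stt_env_check` is a hypothetical helper name, and it assumes the three variables are exported into the shell you run it from (a new login may be needed after `home-manager switch`).

```bash
# Print the stt-ptt variables as seen by the current shell and verify
# that STT_MODEL points at an existing file.
# Returns non-zero if the model path is unset or the file is missing.
stt_env_check() {
  printf 'STT_MODEL=%s\n' "${STT_MODEL:-<unset>}"
  printf 'STT_LANGUAGE=%s\n' "${STT_LANGUAGE:-<unset>}"
  printf 'STT_NOTIFY_TIMEOUT=%s\n' "${STT_NOTIFY_TIMEOUT:-<unset>}"
  if [ -z "${STT_MODEL:-}" ] || [ ! -f "$STT_MODEL" ]; then
    echo "model file missing: ${STT_MODEL:-<unset>}" >&2
    return 1
  fi
  return 0
}
```

A non-zero exit status with all three variables set usually means the model download did not complete.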
Requirements
- Wayland compositor (wtype is Wayland-only)
- PipeWire for audio recording
- Desktop notification daemon
Troubleshooting
Model Download Failed
The model downloads on first home-manager switch. If it fails:
# Manual download
mkdir -p ~/.local/share/stt-ptt/models
curl -L -o ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin
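If you use a model other than the default, the manual download can be generalized. The sketch below reuses the URL pattern from the command above and assumes it holds for every model name in the table, which this page does not state explicitly. `stt_model_url` and `stt_fetch_model` are hypothetical helper names.

```bash
# Build the download URL for a model name. The pattern is copied from the
# manual-download command; that it holds for every model is an assumption.
stt_model_url() {
  printf 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/%s.bin\n' "$1"
}

# Download a model into the module's model directory unless a non-empty
# file with that name is already present.
stt_fetch_model() {
  local model="$1"
  local dest="$HOME/.local/share/stt-ptt/models/$model.bin"
  if [ -s "$dest" ]; then
    echo "already present: $dest"
    return 0
  fi
  mkdir -p "${dest%/*}"
  # --fail: exit non-zero on HTTP errors instead of saving an error page.
  # The .part suffix keeps an interrupted transfer from looking complete.
  curl --fail -L -o "$dest.part" "$(stt_model_url "$model")" &&
    mv "$dest.part" "$dest"
}
```

For example, `stt_fetch_model ggml-small` would fetch the file that `model = "ggml-small";` expects to find.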
Transcription Too Slow
Use a smaller model or enable GPU acceleration:
cli.stt-ptt = {
  enable = true;
  model = "ggml-tiny.en"; # Much faster
};
Text Not Appearing
- Ensure you're on Wayland: `echo $XDG_SESSION_TYPE`
- Check if wtype works: `wtype "test"`
- Some apps may need focus; try clicking the text field first
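The first two checks can be scripted without typing anything into the focused window. This is a minimal sketch: `stt_typing_check` is a hypothetical helper, and it assumes `wtype` is expected on your `PATH`. If stt-ptt ships its own wrapped copy, `wtype` may be absent from `PATH` even though typing works.

```bash
# Report the two most common reasons for missing text output.
# Returns non-zero if the session is not Wayland or wtype is not on PATH.
stt_typing_check() {
  local problems=0
  if [ "${XDG_SESSION_TYPE:-}" != "wayland" ]; then
    echo "session type is '${XDG_SESSION_TYPE:-unset}', expected 'wayland'"
    problems=1
  fi
  if ! command -v wtype >/dev/null 2>&1; then
    echo "wtype not found on PATH"
    problems=1
  fi
  if [ "$problems" -eq 0 ]; then
    echo "Wayland session and wtype both found"
  fi
  return "$problems"
}
```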
Related
- stt-ptt Package - Package documentation
- Using Modules Guide - Module usage patterns