2026-01-02 12:24:48 +01:00
# stt-ptt Home Manager Module
Push-to-talk speech-to-text for Home Manager.
## Overview
This module configures stt-ptt, a push-to-talk speech-to-text tool using whisper.cpp. It handles model downloads, environment configuration, and package installation.
## Quick Start
``` nix
{ config, ... }: {
  imports = [ m3ta-nixpkgs.homeManagerModules.default ];
  cli.stt-ptt = {
    enable = true;
  };
}
```
This will:
- Install stt-ptt with default whisper-cpp
- Download the `ggml-large-v3-turbo` model on first activation
- Set environment variables for model path and notification timeout
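
The Quick Start snippet refers to `m3ta-nixpkgs` without showing where it comes from. One common way to make it available is to declare it as a flake input and pass it to Home Manager through `extraSpecialArgs`. The following is a minimal sketch under that assumption; the input URL (`OWNER`), the `"user"` configuration name, the system, and the `./home.nix` path are placeholders, not values taken from this project.

```nix
# flake.nix (sketch; placeholder values are marked below)
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    home-manager.url = "github:nix-community/home-manager";
    home-manager.inputs.nixpkgs.follows = "nixpkgs";
    # Placeholder: point this at wherever m3ta-nixpkgs is actually hosted.
    m3ta-nixpkgs.url = "github:OWNER/m3ta-nixpkgs";
  };

  outputs = { nixpkgs, home-manager, m3ta-nixpkgs, ... }: {
    # "user" and the system are placeholders for your own setup.
    homeConfigurations."user" = home-manager.lib.homeManagerConfiguration {
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
      # Exposes `m3ta-nixpkgs` as a module argument, so home.nix can start
      # with `{ m3ta-nixpkgs, ... }:` and use it in `imports`.
      extraSpecialArgs = { inherit m3ta-nixpkgs; };
      modules = [ ./home.nix ];
    };
  };
}
```

With this layout, the module in `./home.nix` would list `m3ta-nixpkgs` among its arguments before importing `m3ta-nixpkgs.homeManagerModules.default`.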
## Module Options
### `cli.stt-ptt.enable`
Enable the stt-ptt module.
- Type: `boolean`
- Default: `false`
### `cli.stt-ptt.whisperPackage`
The whisper-cpp package to use for transcription.
- Type: `package`
- Default: `pkgs.whisper-cpp`
**Pre-built variants:**
``` nix
# CPU (default)
whisperPackage = pkgs.whisper-cpp;
# Vulkan GPU acceleration (pre-built)
whisperPackage = pkgs.whisper-cpp-vulkan;
```
**Override options** (can be combined):
| Option | Description |
|--------|-------------|
| `cudaSupport` | NVIDIA CUDA acceleration |
| `rocmSupport` | AMD ROCm acceleration |
| `vulkanSupport` | Vulkan GPU acceleration |
| `coreMLSupport` | Apple CoreML (macOS only) |
| `metalSupport` | Apple Metal (macOS ARM only) |
``` nix
# NVIDIA CUDA support
whisperPackage = pkgs.whisper-cpp.override { cudaSupport = true; };
# AMD ROCm support
whisperPackage = pkgs.whisper-cpp.override { rocmSupport = true; };
# Vulkan support (manual override)
whisperPackage = pkgs.whisper-cpp.override { vulkanSupport = true; };
```
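
Since the override options can be combined, several flags can go into a single `override` call. The sketch below enables CUDA and Vulkan together; whether a particular combination builds depends on your nixpkgs revision and hardware, so treat it as an illustration rather than a tested configuration. CUDA packages are unfree, so `allowUnfree` is assumed to be set here.

```nix
{ pkgs, ... }:
let
  # Assumption: this flag combination is accepted by your nixpkgs revision.
  whisperGpu = pkgs.whisper-cpp.override {
    cudaSupport = true;
    vulkanSupport = true;
  };
in
{
  # CUDA is unfree; without this, evaluation of the CUDA build is refused.
  nixpkgs.config.allowUnfree = true;

  cli.stt-ptt = {
    enable = true;
    whisperPackage = whisperGpu;
  };
}
```

Overriding the package usually means building whisper-cpp locally instead of fetching it from a binary cache, so expect a longer first activation.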
### `cli.stt-ptt.model`
The Whisper model to use. Models are automatically downloaded from HuggingFace on first activation.
- Type: `string`
- Default: `"ggml-large-v3-turbo"`
Available models (sorted by size):
| Model | Size | Notes |
|-------|------|-------|
| `ggml-tiny` | 75MB | Fastest, lowest quality |
| `ggml-tiny.en` | 75MB | English-only, slightly faster |
| `ggml-base` | 142MB | Fast, basic quality |
| `ggml-base.en` | 142MB | English-only |
| `ggml-small` | 466MB | Balanced speed/quality |
| `ggml-small.en` | 466MB | English-only |
| `ggml-medium` | 1.5GB | Good quality |
| `ggml-medium.en` | 1.5GB | English-only |
| `ggml-large-v1` | 2.9GB | High quality (original) |
| `ggml-large-v2` | 2.9GB | High quality (improved) |
| `ggml-large-v3` | 2.9GB | Highest quality |
| `ggml-large-v3-turbo` | 1.6GB | High quality, optimized speed (recommended) |
Quantized versions (`q5_0`, `q5_1`, `q8_0`) are also available for reduced size.
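
As a rough sketch of selecting a quantized model: the upstream files append the quantization suffix to the model name, for example `ggml-large-v3-turbo-q5_0`. This assumes the module passes the `model` string through unchanged when building the download URL, which this document does not confirm, and not every model is published with every suffix.

```nix
cli.stt-ptt = {
  enable = true;
  # Assumption: the name maps directly to "<model>.bin" in the upstream repository.
  model = "ggml-large-v3-turbo-q5_0";
};
```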
### `cli.stt-ptt.notifyTimeout`
Notification timeout in milliseconds for the recording indicator.
- Type: `integer`
- Default: `3000`
- Example: `5000` (5 seconds), `0` (persistent)
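
For instance, a configuration that keeps the recording indicator persistent, using the `0` value from the example above, might look like this:

```nix
cli.stt-ptt = {
  enable = true;
  # 0 makes the recording indicator persistent; any other value is a timeout in ms.
  notifyTimeout = 0;
};
```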
### `cli.stt-ptt.language`
Language for speech recognition. Use "auto" for automatic language detection, or specify a language code for better accuracy.
- Type: `enum ["auto", "en", "es", "fr", "de", "it", "pt", "ru", "zh", "ja", "ko", "ar", "hi", "tr", "pl", "nl", "sv", "da", "fi", "no", "vi", "th", "id", "uk", "cs"]`
- Default: `"auto"`
**Auto-detection**: When set to "auto", whisper.cpp analyzes the audio to determine the spoken language automatically.
**Language specification**: Specifying a language code improves transcription accuracy if you know the language in advance.
``` nix
# Automatic language detection (default)
language = "auto";
# Force English transcription
language = "en";
# Spanish transcription
language = "es";
```
**Common language codes:**
| Code | Language |
|------|----------|
| `en` | English |
| `es` | Spanish |
| `fr` | French |
| `de` | German |
| `zh` | Chinese |
| `ja` | Japanese |
| `ko` | Korean |
whisper.cpp supports 100+ languages. See the whisper.cpp documentation for the full list.
## Usage
After enabling the module, bind `stt-ptt start` to a key press and `stt-ptt stop` to the release of the same key:
``` bash
# Start recording
stt-ptt start
# Stop recording and transcribe (types result)
stt-ptt stop
```
### Keybinding Examples
#### Hyprland
``` nix
wayland.windowManager.hyprland.settings = {
  bind = [
    "SUPER, V, exec, stt-ptt start"
  ];
  bindr = [
    "SUPER, V, exec, stt-ptt stop"
  ];
};
```
Or in `hyprland.conf`:
``` conf
# Press to start recording, release to transcribe
bind = SUPER, V, exec, stt-ptt start
bindr = SUPER, V, exec, stt-ptt stop
```
#### Sway
``` conf
bindsym --no-repeat $mod+v exec stt-ptt start
bindsym --release $mod+v exec stt-ptt stop
```
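
If Sway is managed through Home Manager, the same bindings can be written in Nix. The sketch below uses `wayland.windowManager.sway.extraConfig` and reads the modifier from the Home Manager Sway options instead of relying on a `$mod` variable, which a generated config may not define. It assumes Sway is already enabled via `wayland.windowManager.sway.enable`.

```nix
{ config, ... }:
let
  # Modifier key as configured in the Home Manager Sway module (e.g. "Mod4").
  mod = config.wayland.windowManager.sway.config.modifier;
in
{
  wayland.windowManager.sway.extraConfig = ''
    bindsym --no-repeat ${mod}+v exec stt-ptt start
    bindsym --release ${mod}+v exec stt-ptt stop
  '';
}
```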
## Configuration Examples
### Basic Setup
``` nix
cli.stt-ptt = {
  enable = true;
};
```
### Fast English Transcription
``` nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-base.en";
  notifyTimeout = 2000;
};
```
### Language-Specific Transcription
``` nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo";
  language = "es"; # Force Spanish transcription
};
```
### High Quality with NVIDIA GPU
``` nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3";
  whisperPackage = pkgs.whisper-cpp.override { cudaSupport = true; };
};
```
### Vulkan GPU Acceleration
``` nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo";
  whisperPackage = pkgs.whisper-cpp-vulkan;
};
```
### AMD GPU with ROCm
``` nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo";
  whisperPackage = pkgs.whisper-cpp.override { rocmSupport = true; };
};
```
### Balanced Setup
``` nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-small";
  notifyTimeout = 3000;
};
```
## File Locations
| Path | Description |
|------|-------------|
| `~/.local/share/stt-ptt/models/` | Downloaded Whisper models |
| `~/.cache/stt-ptt/stt.wav` | Temporary audio recording |
| `~/.cache/stt-ptt/stt.pid` | PID file for recording process |
## Environment Variables
The module sets these automatically:
| Variable | Value |
|----------|-------|
| `STT_MODEL` | `~/.local/share/stt-ptt/models/<model>.bin` |
| `STT_LANGUAGE` | Configured language ("auto" by default) |
| `STT_NOTIFY_TIMEOUT` | Configured timeout in ms |
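
To make the table concrete, the sketch below shows roughly what these variables amount to with the default options, written as plain Home Manager session variables. It is for illustration only: how the module actually sets them is not documented here, and this fragment should not be added alongside the module, since the definitions would conflict.

```nix
{ config, ... }: {
  # Illustration only; the module's real implementation may differ.
  home.sessionVariables = {
    # config.xdg.dataHome defaults to ~/.local/share
    STT_MODEL = "${config.xdg.dataHome}/stt-ptt/models/ggml-large-v3-turbo.bin";
    STT_LANGUAGE = "auto";
    STT_NOTIFY_TIMEOUT = "3000";
  };
}
```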
## Requirements
- Wayland compositor (wtype is Wayland-only)
- PipeWire for audio recording
- Desktop notification daemon
## Troubleshooting
### Model Download Failed
The model downloads on first `home-manager switch`. If it fails:
``` bash
# Manual download
mkdir -p ~/.local/share/stt-ptt/models
curl -L -o ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin
```
### Transcription Too Slow
Use a smaller model or enable GPU acceleration:
``` nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-tiny.en"; # Much faster
};
```
### Text Not Appearing
1. Ensure you're on Wayland: `echo $XDG_SESSION_TYPE`
2. Check if wtype works: `wtype "test"`
3. Some apps may need focus; try clicking the text field first
## Related
- [stt-ptt Package](../../../packages/stt-ptt.md) - Package documentation
- [Using Modules Guide](../../../guides/using-modules.md) - Module usage patterns