# stt-ptt Home Manager Module

Push-to-talk speech-to-text for Home Manager.

## Overview

This module configures stt-ptt, a push-to-talk speech-to-text tool using whisper.cpp. It handles model downloads, environment configuration, and package installation.

## Quick Start

```nix
# Assumes the m3ta-nixpkgs flake input is in scope here (for example via extraSpecialArgs).
{config, ...}: {
  imports = [m3ta-nixpkgs.homeManagerModules.default];

  cli.stt-ptt = {
    enable = true;
  };
}
```

This will:

- Install stt-ptt with the default whisper-cpp package
- Download the `ggml-large-v3-turbo` model on first activation
- Set environment variables for the model path, language, and notification timeout
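
For reference, enabling the module with no other options should behave like the fragment below, which spells out the defaults documented under Module Options. This is a sketch that assumes `pkgs` is in scope and that the documented defaults match the module version you have installed.

```nix
# Sketch: the documented defaults written out explicitly.
# Assumption: these values mirror the defaults listed in this document.
cli.stt-ptt = {
  enable = true;
  whisperPackage = pkgs.whisper-cpp; # CPU build
  model = "ggml-large-v3-turbo";
  notifyTimeout = 3000; # milliseconds
  language = "auto";
};
```

Writing the defaults out is optional; it can be a convenient starting point when you expect to change one of them later.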

## Module Options

### `cli.stt-ptt.enable`

Enable the stt-ptt module.

- Type: `boolean`
- Default: `false`

### `cli.stt-ptt.whisperPackage`

The whisper-cpp package to use for transcription.

- Type: `package`
- Default: `pkgs.whisper-cpp`

**Pre-built variants:**

```nix
# CPU (default)
whisperPackage = pkgs.whisper-cpp;

# Vulkan GPU acceleration (pre-built)
whisperPackage = pkgs.whisper-cpp-vulkan;
```

**Override options** (can be combined):

| Option | Description |
|--------|-------------|
| `cudaSupport` | NVIDIA CUDA acceleration |
| `rocmSupport` | AMD ROCm acceleration |
| `vulkanSupport` | Vulkan GPU acceleration |
| `coreMLSupport` | Apple CoreML (macOS only) |
| `metalSupport` | Apple Metal (macOS ARM only) |

```nix
# NVIDIA CUDA support
whisperPackage = pkgs.whisper-cpp.override { cudaSupport = true; };

# AMD ROCm support
whisperPackage = pkgs.whisper-cpp.override { rocmSupport = true; };

# Vulkan support (manual override)
whisperPackage = pkgs.whisper-cpp.override { vulkanSupport = true; };
```
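
Because the override flags can be combined, a single package can be built with more than one backend. The fragment below is a sketch, assuming `pkgs` is in scope and that your nixpkgs revision builds this particular combination. CUDA packages are unfree in nixpkgs, so an `allowUnfree` setting (or an equivalent allow-list) is typically needed as well.

```nix
# Sketch: one whisper-cpp build with both CUDA and Vulkan backends enabled.
# Assumption: this flag combination builds on your nixpkgs revision.
cli.stt-ptt = {
  enable = true;
  whisperPackage = pkgs.whisper-cpp.override {
    cudaSupport = true;
    vulkanSupport = true;
  };
};
```

An overridden package is usually not available from a binary cache, so expect a local build the first time it is used.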

### `cli.stt-ptt.model`

The Whisper model to use. Models are automatically downloaded from HuggingFace on first activation.

- Type: `string`
- Default: `"ggml-large-v3-turbo"`

Available models (sorted by size):

| Model | Size | Notes |
|-------|------|-------|
| `ggml-tiny` | 75 MB | Fastest, lowest quality |
| `ggml-tiny.en` | 75 MB | English-only, slightly faster |
| `ggml-base` | 142 MB | Fast, basic quality |
| `ggml-base.en` | 142 MB | English-only |
| `ggml-small` | 466 MB | Balanced speed/quality |
| `ggml-small.en` | 466 MB | English-only |
| `ggml-medium` | 1.5 GB | Good quality |
| `ggml-medium.en` | 1.5 GB | English-only |
| `ggml-large-v1` | 2.9 GB | High quality (original) |
| `ggml-large-v2` | 2.9 GB | High quality (improved) |
| `ggml-large-v3` | 2.9 GB | Highest quality |
| `ggml-large-v3-turbo` | 1.6 GB | High quality, optimized speed (recommended) |

Quantized versions (`q5_0`, `q5_1`, `q8_0`) are also available for reduced size.
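
Selecting a quantized model would look roughly like the fragment below. This is a sketch under two assumptions: that the module resolves the option value to `<model>.bin` in the same HuggingFace repository (as the Environment Variables section suggests), and that the quantized file you name exists there, since not every model is published in every quantization.

```nix
# Sketch: use a quantized variant of the default model.
# Assumption: "ggml-large-v3-turbo-q5_0.bin" is published in the model repository.
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo-q5_0";
};
```

Quantized models trade some accuracy for a smaller download and lower memory use, which can help on machines without a GPU.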

### `cli.stt-ptt.notifyTimeout`

Notification timeout in milliseconds for the recording indicator.

- Type: `integer`
- Default: `3000`
- Example: `5000` (5 seconds), `0` (persistent)
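
As a short illustration of the values above, the following fragment keeps the recording indicator on screen until it is dismissed. It is a sketch; how a persistent notification is eventually cleared depends on your notification daemon.

```nix
# Sketch: persistent recording indicator (0 = no timeout, per the option description).
cli.stt-ptt = {
  enable = true;
  notifyTimeout = 0;
};
```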

### `cli.stt-ptt.language`

Language for speech recognition. Use `"auto"` for automatic language detection, or specify a language code for better accuracy.

- Type: `enum ["auto", "en", "es", "fr", "de", "it", "pt", "ru", "zh", "ja", "ko", "ar", "hi", "tr", "pl", "nl", "sv", "da", "fi", "no", "vi", "th", "id", "uk", "cs"]`
- Default: `"auto"`

**Auto-detection**: When set to `"auto"`, whisper.cpp analyzes the audio to determine the spoken language automatically.

**Language specification**: Specifying a language code improves transcription accuracy if you know the language in advance.

```nix
# Automatic language detection (default)
language = "auto";

# Force English transcription
language = "en";

# Spanish transcription
language = "es";
```

**Common language codes:**

| Code | Language |
|------|----------|
| `en` | English |
| `es` | Spanish |
| `fr` | French |
| `de` | German |
| `zh` | Chinese |
| `ja` | Japanese |
| `ko` | Korean |

The option accepts only the codes listed in its type above. whisper.cpp itself supports roughly 100 languages; see the whisper.cpp documentation for the full list.

## Usage

After enabling the module, bind `stt-ptt start` to a key press and `stt-ptt stop` to the release of the same key:

```bash
# Start recording
stt-ptt start

# Stop recording and transcribe (types result)
stt-ptt stop
```

### Keybinding Examples

#### Hyprland

```nix
wayland.windowManager.hyprland.settings = {
  bind = [
    "SUPER, V, exec, stt-ptt start"
  ];
  bindr = [
    "SUPER, V, exec, stt-ptt stop"
  ];
};
```

Or in `hyprland.conf`:

```conf
# Press to start recording, release to transcribe
bind = SUPER, V, exec, stt-ptt start
bindr = SUPER, V, exec, stt-ptt stop
```

#### Sway

```conf
bindsym --no-repeat $mod+v exec stt-ptt start
bindsym --release $mod+v exec stt-ptt stop
```
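
If Sway is managed through Home Manager, the same two bindings can be declared in Nix. The fragment below is a sketch that passes the lines through `extraConfig` verbatim. It uses `Mod4` (Super) directly instead of `$mod`, on the assumption that your generated configuration does not define a `$mod` variable; adjust the modifier to your setup.

```nix
# Sketch: Sway push-to-talk bindings declared through Home Manager.
# Assumption: Mod4 (Super) is your modifier key.
wayland.windowManager.sway = {
  enable = true;
  extraConfig = ''
    bindsym --no-repeat Mod4+v exec stt-ptt start
    bindsym --release Mod4+v exec stt-ptt stop
  '';
};
```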

## Configuration Examples

### Basic Setup

```nix
cli.stt-ptt = {
  enable = true;
};
```

### Fast English Transcription

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-base.en";
  notifyTimeout = 2000;
};
```

### Language-Specific Transcription

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo";
  language = "es"; # Force Spanish transcription
};
```

### High Quality with NVIDIA GPU

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3";
  whisperPackage = pkgs.whisper-cpp.override { cudaSupport = true; };
};
```

### Vulkan GPU Acceleration

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo";
  whisperPackage = pkgs.whisper-cpp-vulkan;
};
```

### AMD GPU with ROCm

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo";
  whisperPackage = pkgs.whisper-cpp.override { rocmSupport = true; };
};
```

### Balanced Setup

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-small";
  notifyTimeout = 3000;
};
```

## File Locations

| Path | Description |
|------|-------------|
| `~/.local/share/stt-ptt/models/` | Downloaded Whisper models |
| `~/.cache/stt-ptt/stt.wav` | Temporary audio recording |
| `~/.cache/stt-ptt/stt.pid` | PID file for recording process |

## Environment Variables

The module sets these automatically:

| Variable | Value |
|----------|-------|
| `STT_MODEL` | `~/.local/share/stt-ptt/models/<model>.bin` |
| `STT_LANGUAGE` | Configured language (`"auto"` by default) |
| `STT_NOTIFY_TIMEOUT` | Configured timeout in ms |
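
To make the mapping from options to variables concrete, the fragment below shows one way such variables could be derived in a Home Manager module. It is an illustration only, not the module's actual source: the use of `home.sessionVariables` and `config.xdg.dataHome` is an assumption, and you do not need to add this to your own configuration.

```nix
# Illustration only: how the three variables could be derived from the options.
# Assumption: the real module may set them differently (for example in a wrapper script).
{config, ...}: let
  cfg = config.cli.stt-ptt;
in {
  home.sessionVariables = {
    STT_MODEL = "${config.xdg.dataHome}/stt-ptt/models/${cfg.model}.bin";
    STT_LANGUAGE = cfg.language;
    STT_NOTIFY_TIMEOUT = toString cfg.notifyTimeout;
  };
}
```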

## Requirements

- Wayland compositor (wtype is Wayland-only)
- PipeWire for audio recording
- Desktop notification daemon
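
On NixOS with Home Manager, the last two requirements could be met roughly as follows. This is a sketch: the PipeWire options belong in the NixOS system configuration, the notification daemon in the Home Manager configuration, and mako is only one example of a Wayland notification daemon.

```nix
# NixOS system configuration (assumption: you are on NixOS).
services.pipewire = {
  enable = true;
  pulse.enable = true; # optional PulseAudio compatibility layer
};

# Home Manager configuration: mako is one example of a notification daemon.
services.mako.enable = true;
```

The two fragments live in different module systems, so they go into separate files unless Home Manager is used as a NixOS module.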

## Troubleshooting

### Model Download Failed

The model is downloaded during the first `home-manager switch`. If the download fails, fetch the file manually:

```bash
# Manual download
mkdir -p ~/.local/share/stt-ptt/models
curl -L -o ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin
```

### Transcription Too Slow

Use a smaller model or enable GPU acceleration (see `cli.stt-ptt.whisperPackage`):

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-tiny.en"; # Much faster
};
```

### Text Not Appearing

1. Ensure you're on Wayland: `echo $XDG_SESSION_TYPE` should print `wayland`
2. Check if wtype works: `wtype "test"`
3. Some apps only accept typed text while focused; click the text field first

## Related

- [stt-ptt Package](../../../packages/stt-ptt.md) - Package documentation
- [Using Modules Guide](../../../guides/using-modules.md) - Module usage patterns