feat: add stt-ptt package
@@ -46,6 +46,7 @@ nix run git+https://code.m3ta.dev/m3tam3re/nixpkgs#zellij-ps
| `mem0` | AI memory assistant with vector storage |
| `msty-studio` | Msty Studio application |
| `pomodoro-timer` | Pomodoro timer utility |
| `stt-ptt` | Push to Talk Speech to Text |
| `tuxedo-backlight` | Backlight control for Tuxedo laptops |
| `zellij-ps` | Project switcher for Zellij |

@@ -34,6 +34,7 @@ Documentation for all custom packages:
- [mem0](./packages/mem0.md) - AI memory assistant with vector storage
- [msty-studio](./packages/msty-studio.md) - Msty Studio application
- [pomodoro-timer](./packages/pomodoro-timer.md) - Pomodoro timer utility
- [stt-ptt](./packages/stt-ptt.md) - Push to Talk Speech to Text using Whisper
- [tuxedo-backlight](./packages/tuxedo-backlight.md) - Backlight control for Tuxedo laptops
- [zellij-ps](./packages/zellij-ps.md) - Project switcher for Zellij

@@ -49,6 +50,7 @@ Configuration modules for NixOS and Home Manager:
#### Home Manager Modules
- [Overview](./modules/home-manager/overview.md) - Home Manager modules overview
- [CLI Tools](./modules/home-manager/cli/) - CLI-related modules
  - [stt-ptt](./modules/home-manager/cli/stt-ptt.md) - Push to Talk Speech to Text
  - [zellij-ps](./modules/home-manager/cli/zellij-ps.md) - Zellij project switcher
- [Coding](./modules/home-manager/coding/) - Development-related modules
  - [editors](./modules/home-manager/coding/editors.md) - Editor configurations

265
docs/modules/home-manager/cli/stt-ptt.md
Normal file
@@ -0,0 +1,265 @@
# stt-ptt Home Manager Module

Push to Talk Speech to Text for Home Manager.

## Overview

This module configures stt-ptt, a push-to-talk speech-to-text tool using whisper.cpp. It handles model downloads, environment configuration, and package installation.

## Quick Start

```nix
{config, ...}: {
  imports = [m3ta-nixpkgs.homeManagerModules.default];

  cli.stt-ptt = {
    enable = true;
  };
}
```

This will:

- Install stt-ptt with default whisper-cpp
- Download the `ggml-large-v3-turbo` model on first activation
- Set environment variables for model path and notification timeout

## Module Options

### `cli.stt-ptt.enable`

Enable the stt-ptt module.

- Type: `boolean`
- Default: `false`

### `cli.stt-ptt.whisperPackage`

The whisper-cpp package to use for transcription.

- Type: `package`
- Default: `pkgs.whisper-cpp`

**Pre-built variants:**

```nix
# CPU (default)
whisperPackage = pkgs.whisper-cpp;

# Vulkan GPU acceleration (pre-built)
whisperPackage = pkgs.whisper-cpp-vulkan;
```

**Override options** (can be combined):

| Option | Description |
|--------|-------------|
| `cudaSupport` | NVIDIA CUDA acceleration |
| `rocmSupport` | AMD ROCm acceleration |
| `vulkanSupport` | Vulkan GPU acceleration |
| `coreMLSupport` | Apple CoreML (macOS only) |
| `metalSupport` | Apple Metal (macOS ARM only) |

```nix
# NVIDIA CUDA support
whisperPackage = pkgs.whisper-cpp.override { cudaSupport = true; };

# AMD ROCm support
whisperPackage = pkgs.whisper-cpp.override { rocmSupport = true; };

# Vulkan support (manual override)
whisperPackage = pkgs.whisper-cpp.override { vulkanSupport = true; };
```

### `cli.stt-ptt.model`

The Whisper model to use. Models are automatically downloaded from HuggingFace on first activation.

- Type: `string`
- Default: `"ggml-large-v3-turbo"`

Available models (sorted by size):

| Model | Size | Notes |
|-------|------|-------|
| `ggml-tiny` | 75MB | Fastest, lowest quality |
| `ggml-tiny.en` | 75MB | English-only, slightly faster |
| `ggml-base` | 142MB | Fast, basic quality |
| `ggml-base.en` | 142MB | English-only |
| `ggml-small` | 466MB | Balanced speed/quality |
| `ggml-small.en` | 466MB | English-only |
| `ggml-medium` | 1.5GB | Good quality |
| `ggml-medium.en` | 1.5GB | English-only |
| `ggml-large-v1` | 2.9GB | High quality (original) |
| `ggml-large-v2` | 2.9GB | High quality (improved) |
| `ggml-large-v3` | 2.9GB | Highest quality |
| `ggml-large-v3-turbo` | 1.6GB | High quality, optimized speed (recommended) |

Quantized versions (`q5_0`, `q5_1`, `q8_0`) are also available for reduced size.
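
As a rough illustration of how a model name maps to a download URL and a local file, the sketch below follows the HuggingFace URL pattern and the model directory used elsewhere in this document. The function names `model_url` and `model_path` are hypothetical helpers for illustration only; they are not part of stt-ptt or the module.

```bash
# Hypothetical helpers (not part of stt-ptt): derive the download URL and
# the local file path for a given model name.
model_url() {
  printf 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/%s.bin\n' "$1"
}

model_path() {
  # Falls back to ~/.local/share when XDG_DATA_HOME is unset.
  printf '%s/stt-ptt/models/%s.bin\n' "${XDG_DATA_HOME:-$HOME/.local/share}" "$1"
}

model_url ggml-base.en
model_path ggml-base.en
```

Running it prints the URL and the path that correspond to `model = "ggml-base.en"`, which can help when checking by hand whether a model name is spelled the way the download expects.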

### `cli.stt-ptt.notifyTimeout`

Notification timeout in milliseconds for the recording indicator.

- Type: `integer`
- Default: `3000`
- Example: `5000` (5 seconds), `0` (persistent)

## Usage

After enabling the module, bind `stt-ptt start` to a key press and `stt-ptt stop` to the release of the same key:

```bash
# Start recording
stt-ptt start

# Stop recording and transcribe (types result)
stt-ptt stop
```

### Keybinding Examples

#### Hyprland

```nix
wayland.windowManager.hyprland.settings = {
  bind = [
    "SUPER, V, exec, stt-ptt start"
  ];
  bindr = [
    "SUPER, V, exec, stt-ptt stop"
  ];
};
```

Or in `hyprland.conf`:

```conf
# Press to start recording, release to transcribe
bind = SUPER, V, exec, stt-ptt start
bindr = SUPER, V, exec, stt-ptt stop
```

#### Sway

```conf
bindsym --no-repeat $mod+v exec stt-ptt start
bindsym --release $mod+v exec stt-ptt stop
```

## Configuration Examples

### Basic Setup

```nix
cli.stt-ptt = {
  enable = true;
};
```

### Fast English Transcription

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-base.en";
  notifyTimeout = 2000;
};
```

### High Quality with NVIDIA GPU

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3";
  whisperPackage = pkgs.whisper-cpp.override { cudaSupport = true; };
};
```

### Vulkan GPU Acceleration

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo";
  whisperPackage = pkgs.whisper-cpp-vulkan;
};
```

### AMD GPU with ROCm

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-large-v3-turbo";
  whisperPackage = pkgs.whisper-cpp.override { rocmSupport = true; };
};
```

### Balanced Setup

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-small";
  notifyTimeout = 3000;
};
```

## File Locations

| Path | Description |
|------|-------------|
| `~/.local/share/stt-ptt/models/` | Downloaded Whisper models |
| `~/.cache/stt-ptt/stt.wav` | Temporary audio recording |
| `~/.cache/stt-ptt/stt.pid` | PID file for recording process |
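
The PID file is what lets `stt-ptt stop` find the recorder that `stt-ptt start` launched in the background. The sketch below shows that handoff in isolation. It is a simplified illustration, not the packaged script: `sleep` stands in for the recorder, and the names `start_bg` and `stop_bg` as well as the temporary directory are assumptions made for the example.

```bash
# Sketch of the PID-file handoff between "start" and "stop".
# `sleep` stands in for the real recorder; names and paths are illustrative.
CACHE_DIR="$(mktemp -d)"
PID_FILE="$CACHE_DIR/stt.pid"

start_bg() {
  rm -f "$PID_FILE"
  sleep 30 &                 # the real tool launches its recorder here
  echo $! > "$PID_FILE"      # remember the background process ID
}

stop_bg() {
  # Signal the recorded process if a PID file exists, then clean up.
  [ -f "$PID_FILE" ] && kill "$(cat "$PID_FILE")" 2>/dev/null
  rm -f "$PID_FILE"
}

start_bg
stop_bg
```

Because only one PID file exists, a second `start` before `stop` replaces the stored PID, which is one reason to bind `start` so that it does not repeat while the key is held.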

## Environment Variables

The module sets these automatically:

| Variable | Value |
|----------|-------|
| `STT_MODEL` | `~/.local/share/stt-ptt/models/<model>.bin` |
| `STT_NOTIFY_TIMEOUT` | Configured timeout in ms |

## Requirements

- Wayland compositor (wtype is Wayland-only)
- PipeWire for audio recording
- Desktop notification daemon

## Troubleshooting

### Model Download Failed

The model downloads on first `home-manager switch`. If it fails:

```bash
# Manual download
mkdir -p ~/.local/share/stt-ptt/models
curl -L -o ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin
```
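
An interrupted transfer can leave a partial file at the final path, and a file that already exists is easy to mistake for a finished download. One way to guard against that is to download to a temporary name and move the file into place only on success. The helper below is a minimal sketch of that idea; `atomic_fetch` and the `.part` suffix are hypothetical and are not something the module provides.

```bash
# Hypothetical helper (not part of the module): run a fetch command that
# writes to a temporary file, and move it into place only on success.
# Usage: atomic_fetch DEST CMD [ARGS...]   (the temp path is appended to CMD)
atomic_fetch() {
  local dest="$1"
  shift
  local tmp="$dest.part"
  if "$@" "$tmp"; then
    mv "$tmp" "$dest"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Example with curl; -f makes HTTP errors count as failures:
# atomic_fetch ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin \
#   curl -fL https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin -o
```

The fetch command is passed in as arguments so the same wrapper works with any downloader that takes the output path last.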

### Transcription Too Slow

Use a smaller model or enable GPU acceleration:

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-tiny.en";  # Much faster
};
```

### Text Not Appearing

1. Ensure you're on Wayland: `echo $XDG_SESSION_TYPE`
2. Check if wtype works: `wtype "test"`
3. Some apps may need focus; try clicking the text field first
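
The first check above can be scripted. The sketch below is an assumption-laden convenience, not part of stt-ptt: the function name `is_wayland_session` is made up for the example, and it treats either `XDG_SESSION_TYPE=wayland` or a non-empty `WAYLAND_DISPLAY` as evidence of a Wayland session.

```bash
# Hypothetical check (not part of stt-ptt): succeed only in a Wayland session.
is_wayland_session() {
  [ "${XDG_SESSION_TYPE:-}" = "wayland" ] || [ -n "${WAYLAND_DISPLAY:-}" ]
}

if is_wayland_session; then
  echo "Wayland session detected"
else
  echo "Not a Wayland session; wtype needs a Wayland compositor"
fi
```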

## Related

- [stt-ptt Package](../../../packages/stt-ptt.md) - Package documentation
- [Using Modules Guide](../../../guides/using-modules.md) - Module usage patterns
202
docs/packages/stt-ptt.md
Normal file
@@ -0,0 +1,202 @@
# stt-ptt

Push to Talk Speech to Text using Whisper.

## Description

stt-ptt is a simple push-to-talk speech-to-text tool that uses whisper.cpp for transcription. It records audio via PipeWire, transcribes it using a local Whisper model, and types the result using wtype (Wayland).
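
Between transcription and typing, the packaged script strips newlines and surrounding whitespace from the transcriber output. The sketch below isolates that cleanup step so it can be tried on its own; the function name `clean_transcript` is hypothetical and the input string is made up for the example.

```bash
# Hypothetical helper mirroring the cleanup step: drop newlines, then trim
# leading and trailing whitespace so only the bare text gets typed.
clean_transcript() {
  tr -d '\n' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'
}

printf ' Hello world.\n\n' | clean_transcript
```

Note that newlines are deleted rather than replaced, so text split across several output lines is joined without an added space.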

## Features

- **Push to Talk**: Start/stop recording with simple commands
- **Local Processing**: Uses whisper.cpp for fast, offline transcription
- **Wayland Native**: Types transcribed text using wtype
- **Configurable**: Model path and notification timeout via environment variables
- **Lightweight**: Minimal dependencies, no cloud services

## Installation

### Via Home Manager Module (Recommended)

See [stt-ptt Home Manager Module](../modules/home-manager/cli/stt-ptt.md) for the recommended setup with automatic model download.

### Via Overlay

```nix
{pkgs, ...}: {
  home.packages = [pkgs.stt-ptt];
}
```

### Direct Reference

```nix
# Assumes the flake inputs are passed to the module as `inputs`
{inputs, pkgs, ...}: {
  home.packages = [
    inputs.m3ta-nixpkgs.packages.${pkgs.system}.stt-ptt
  ];
}
```

## Usage

### Basic Commands

```bash
# Start recording
stt-ptt start

# Stop recording and transcribe
stt-ptt stop
```

### Keybinding Setup

The tool is designed to be bound to a key (e.g., hold to record, release to transcribe).

#### Hyprland

```nix
# In your Hyprland config
wayland.windowManager.hyprland.settings = {
  bind = [
    # Press Super+V to start, release to stop and transcribe
    "SUPER, V, exec, stt-ptt start"
  ];
  bindr = [
    # Release trigger
    "SUPER, V, exec, stt-ptt stop"
  ];
};
```

Or in `hyprland.conf`:

```conf
bind = SUPER, V, exec, stt-ptt start
bindr = SUPER, V, exec, stt-ptt stop
```

#### Sway

```conf
# Hold to record, release to transcribe
bindsym --no-repeat $mod+v exec stt-ptt start
bindsym --release $mod+v exec stt-ptt stop
```

#### i3 (X11 - requires xdotool instead of wtype)

Note: stt-ptt uses wtype, which is Wayland-only. For X11, you would need to modify the script to use xdotool.

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `STT_MODEL` | Path to Whisper model file | `~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin` |
| `STT_NOTIFY_TIMEOUT` | Notification timeout in ms | `3000` |
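
The defaults in the table come from shell parameter fallbacks: a variable's value is used when it is set, and the default applies otherwise. The sketch below reproduces that behaviour so the effective values can be inspected; the function names `effective_model` and `effective_timeout` are hypothetical and exist only for this illustration.

```bash
# Sketch of the fallback logic: each setting uses the environment value
# when present and the documented default otherwise.
effective_model() {
  local data_home="${XDG_DATA_HOME:-$HOME/.local/share}"
  printf '%s\n' "${STT_MODEL:-$data_home/stt-ptt/models/ggml-large-v3-turbo.bin}"
}

effective_timeout() {
  printf '%s\n' "${STT_NOTIFY_TIMEOUT:-3000}"
}

effective_model
effective_timeout
```

For a one-off override, the variables can also be set for a single invocation, for example `STT_NOTIFY_TIMEOUT=0 stt-ptt start`.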

## Requirements

- **whisper-cpp**: Speech recognition engine
- **wtype**: Wayland text input (Wayland compositor required)
- **libnotify**: Desktop notifications
- **pipewire**: Audio recording

## Model Setup

Download a Whisper model from [HuggingFace](https://huggingface.co/ggerganov/whisper.cpp/tree/main):

```bash
# Create model directory
mkdir -p ~/.local/share/stt-ptt/models

# Download model (example: large-v3-turbo)
curl -L -o ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin
```

Or use the Home Manager module which handles this automatically.

## Available Models

| Model | Size | Quality | Speed |
|-------|------|---------|-------|
| `ggml-tiny` / `ggml-tiny.en` | 75MB | Basic | Fastest |
| `ggml-base` / `ggml-base.en` | 142MB | Good | Fast |
| `ggml-small` / `ggml-small.en` | 466MB | Better | Medium |
| `ggml-medium` / `ggml-medium.en` | 1.5GB | High | Slower |
| `ggml-large-v3-turbo` | 1.6GB | High | Fast |
| `ggml-large-v3` | 2.9GB | Highest | Slowest |

Models ending in `.en` are English-only and slightly faster for English text.

## Platform Support

- Linux with Wayland (primary)
- Requires PipeWire for audio
- X11 not supported (wtype is Wayland-only)

## Build Information

- **Version**: 0.1.0
- **Type**: Shell script wrapper
- **License**: MIT

## Troubleshooting

### Model Not Found

Error: `Error: Model not found at /path/to/model`

**Solution**: Download a model or use the Home Manager module:

```bash
curl -L -o ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin
```
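
Before re-downloading, it can help to confirm what the tool will actually look for. The sketch below is a hypothetical pre-flight check, not part of stt-ptt: `model_ready` is a made-up name, and treating an empty file as missing is an assumption of this example rather than something the script does.

```bash
# Hypothetical pre-flight check (not part of stt-ptt): succeed only if the
# model file exists and is not empty.
model_ready() {
  [ -f "$1" ] && [ -s "$1" ]
}

MODEL="${STT_MODEL:-${XDG_DATA_HOME:-$HOME/.local/share}/stt-ptt/models/ggml-large-v3-turbo.bin}"
if model_ready "$MODEL"; then
  echo "Model found: $MODEL"
else
  echo "Model missing or empty: $MODEL"
fi
```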

### No Audio Recorded

**Solution**: Ensure PipeWire is running:

```bash
systemctl --user status pipewire
```

### Text Not Typed

**Solution**: Ensure you're on Wayland and wtype has access:

```bash
# Check if running on Wayland
echo $XDG_SESSION_TYPE  # Should print "wayland"
```

### Slow Transcription

**Solution**: Use a smaller model or enable GPU acceleration:

```nix
cli.stt-ptt = {
  enable = true;
  model = "ggml-base.en";  # Smaller, faster model
};
```

Or with GPU acceleration:

```nix
cli.stt-ptt = {
  enable = true;
  # Choose one:
  whisperPackage = pkgs.whisper-cpp-vulkan;  # Vulkan (pre-built)
  # whisperPackage = pkgs.whisper-cpp.override { cudaSupport = true; };  # NVIDIA
  # whisperPackage = pkgs.whisper-cpp.override { rocmSupport = true; };  # AMD
};
```

## Related

- [stt-ptt Home Manager Module](../modules/home-manager/cli/stt-ptt.md) - Module documentation
- [Adding Packages](../guides/adding-packages.md) - How to add new packages
@@ -1,6 +1,7 @@
# CLI/Terminal-related Home Manager modules
{
  imports = [
    ./stt-ptt.nix
    ./zellij-ps.nix
  ];
}
107
modules/home-manager/cli/stt-ptt.nix
Normal file
@@ -0,0 +1,107 @@
{
  config,
  lib,
  pkgs,
  ...
}:
with lib; let
  cfg = config.cli.stt-ptt;

  # Build stt-ptt package with the selected whisper package
  sttPttPackage = pkgs.stt-ptt.override {
    whisper-cpp = cfg.whisperPackage;
  };

  modelDir = "${config.xdg.dataHome}/stt-ptt/models";
  modelPath = "${modelDir}/${cfg.model}.bin";

  # HuggingFace URL for whisper.cpp models
  modelUrl = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/${cfg.model}.bin";
in {
  options.cli.stt-ptt = {
    enable = mkEnableOption "Push to Talk Speech to Text using Whisper";

    whisperPackage = mkOption {
      type = types.package;
      default = pkgs.whisper-cpp;
      description = ''
        The whisper-cpp package to use. Available options:

        Pre-built variants:
        - `pkgs.whisper-cpp` - CPU-based inference (default)
        - `pkgs.whisper-cpp-vulkan` - Vulkan GPU acceleration

        Override options (can be combined):
        - `cudaSupport` - NVIDIA CUDA support
        - `rocmSupport` - AMD ROCm support
        - `vulkanSupport` - Vulkan support
        - `coreMLSupport` - Apple CoreML (macOS only)
        - `metalSupport` - Apple Metal (macOS ARM only)

        Example overrides:
        - `pkgs.whisper-cpp.override { cudaSupport = true; }` - NVIDIA GPU
        - `pkgs.whisper-cpp.override { rocmSupport = true; }` - AMD GPU
        - `pkgs.whisper-cpp.override { vulkanSupport = true; }` - Vulkan
      '';
      example = literalExpression "pkgs.whisper-cpp.override { cudaSupport = true; }";
    };

    model = mkOption {
      type = types.str;
      default = "ggml-large-v3-turbo";
      description = ''
        The Whisper model to use. Models are downloaded from HuggingFace.

        Available models (sorted by size/quality):
        - `ggml-tiny` / `ggml-tiny.en` - 75MB, fastest, lowest quality
        - `ggml-base` / `ggml-base.en` - 142MB, fast, basic quality
        - `ggml-small` / `ggml-small.en` - 466MB, balanced
        - `ggml-medium` / `ggml-medium.en` - 1.5GB, good quality
        - `ggml-large-v1` - 2.9GB, high quality (original)
        - `ggml-large-v2` - 2.9GB, high quality (improved)
        - `ggml-large-v3` - 2.9GB, highest quality
        - `ggml-large-v3-turbo` - 1.6GB, high quality, optimized speed (recommended)

        Models ending in `.en` are English-only and slightly faster for English.
        Quantized versions (q5_0, q5_1, q8_0) are also available for reduced size.
      '';
      example = "ggml-base.en";
    };

    notifyTimeout = mkOption {
      type = types.int;
      default = 3000;
      description = ''
        Notification timeout in milliseconds for the recording indicator.
        Set to 0 for persistent notifications.
      '';
      example = 5000;
    };
  };

  config = mkIf cfg.enable {
    home.packages = [sttPttPackage];

    home.sessionVariables = {
      STT_MODEL = modelPath;
      STT_NOTIFY_TIMEOUT = toString cfg.notifyTimeout;
    };

    # Create model directory and download model if not present
    home.activation.downloadWhisperModel = lib.hm.dag.entryAfter ["writeBoundary"] ''
      MODEL_DIR="${modelDir}"
      MODEL_PATH="${modelPath}"
      MODEL_URL="${modelUrl}"

      $DRY_RUN_CMD mkdir -p "$MODEL_DIR"

      if [ ! -f "$MODEL_PATH" ]; then
        echo "Downloading Whisper model: ${cfg.model}..."
        $DRY_RUN_CMD ${pkgs.curl}/bin/curl -L -o "$MODEL_PATH" "$MODEL_URL" || {
          echo "Failed to download model from $MODEL_URL"
          echo "Please download manually and place at: $MODEL_PATH"
        }
      fi
    '';
  };
}
@@ -7,6 +7,7 @@
  mem0 = pkgs.callPackage ./mem0 {};
  msty-studio = pkgs.callPackage ./msty-studio {};
  pomodoro-timer = pkgs.callPackage ./pomodoro-timer {};
  stt-ptt = pkgs.callPackage ./stt-ptt {};
  tuxedo-backlight = pkgs.callPackage ./tuxedo-backlight {};
  zellij-ps = pkgs.callPackage ./zellij-ps {};
}
91
pkgs/stt-ptt/default.nix
Normal file
@@ -0,0 +1,91 @@
{
  lib,
  stdenv,
  writeShellScriptBin,
  whisper-cpp,
  wtype,
  libnotify,
  pipewire,
  busybox,
}: let
  script = writeShellScriptBin "stt-ptt" ''
    #!/usr/bin/env bash
    # stt-ptt - Push to Talk Speech to Text

    CACHE_DIR="''${XDG_CACHE_HOME:-$HOME/.cache}/stt-ptt"
    MODEL_DIR="''${XDG_DATA_HOME:-$HOME/.local/share}/stt-ptt/models"
    AUDIO="$CACHE_DIR/stt.wav"
    PID_FILE="$CACHE_DIR/stt.pid"

    # Configurable via environment
    STT_MODEL="''${STT_MODEL:-$MODEL_DIR/ggml-large-v3-turbo.bin}"
    STT_NOTIFY_TIMEOUT="''${STT_NOTIFY_TIMEOUT:-3000}"

    NOTIFY="${libnotify}/bin/notify-send"
    PW_RECORD="${pipewire}/bin/pw-record"
    WHISPER="${whisper-cpp}/bin/whisper-cli"
    WTYPE="${wtype}/bin/wtype"
    MKDIR="${busybox}/bin/mkdir"
    RM="${busybox}/bin/rm"
    CAT="${busybox}/bin/cat"
    KILL="${busybox}/bin/kill"
    TR="${busybox}/bin/tr"
    SED="${busybox}/bin/sed"

    # Ensure cache directory exists
    "$MKDIR" -p "$CACHE_DIR"

    case "''${1:-}" in
      start)
        "$RM" -f "$AUDIO" "$PID_FILE"
        "$NOTIFY" -t "$STT_NOTIFY_TIMEOUT" -a "stt-ptt" "Recording..."
        "$PW_RECORD" --rate=16000 --channels=1 "$AUDIO" &
        echo $! > "$PID_FILE"
        ;;
      stop)
        [[ -f "$PID_FILE" ]] && "$KILL" "$("$CAT" "$PID_FILE")" 2>/dev/null
        "$RM" -f "$PID_FILE"

        if [[ -f "$AUDIO" ]]; then
          if [[ ! -f "$STT_MODEL" ]]; then
            "$NOTIFY" -t "$STT_NOTIFY_TIMEOUT" -a "stt-ptt" "Error: Model not found at $STT_MODEL"
            "$RM" -f "$AUDIO"
            exit 1
          fi
          text=$("$WHISPER" -m "$STT_MODEL" -f "$AUDIO" -np -nt 2>/dev/null | "$TR" -d '\n' | "$SED" 's/^[[:space:]]*//;s/[[:space:]]*$//')
          "$RM" -f "$AUDIO"
          [[ -n "$text" ]] && "$WTYPE" -- "$text"
        fi
        ;;
      *)
        echo "Usage: stt-ptt {start|stop}"
        echo ""
        echo "Environment variables:"
        echo " STT_MODEL - Path to whisper model (default: \$XDG_DATA_HOME/stt-ptt/models/ggml-large-v3-turbo.bin)"
        echo " STT_NOTIFY_TIMEOUT - Notification timeout in ms (default: 3000)"
        exit 1
        ;;
    esac
  '';
in
  stdenv.mkDerivation {
    pname = "stt-ptt";
    version = "0.1.0";

    dontUnpack = true;

    # No buildInputs needed - all runtime deps are hardcoded with full nix store paths in the script

    installPhase = ''
      mkdir -p "$out/bin"
      ln -s ${script}/bin/stt-ptt "$out/bin/stt-ptt"
    '';

    meta = with lib; {
      description = "Push to Talk Speech to Text using Whisper";
      homepage = "https://code.m3ta.dev/m3tam3re/nixpkgs";
      license = licenses.mit;
      platforms = platforms.linux;
      mainProgram = "stt-ptt";
    };
  }