m3tam3re/nixpkgs

m3tm3re de1301e08d feat: add stt-ptt package

2026-01-02 12:24:48 +01:00

4.8 KiB

stt-ptt

Push to Talk Speech to Text using Whisper.

Description

stt-ptt is a simple push-to-talk speech-to-text tool that uses whisper.cpp for transcription. It records audio via PipeWire, transcribes it using a local Whisper model, and types the result using wtype (Wayland).
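The record, transcribe, type flow described above can be sketched roughly as follows. This is an illustration, not the actual stt-ptt script: the state directory, PID file, function names, and the `whisper-cli` binary name with its `-nt`/`-np` flags are assumptions based on a typical whisper.cpp install.

```shell
# Illustrative sketch only; NOT the real stt-ptt implementation.
# Assumed names: STATE_DIR, WAV, PIDFILE, and the whisper-cli binary.
STATE_DIR="${XDG_RUNTIME_DIR:-/tmp}/stt-ptt-sketch"
WAV="$STATE_DIR/recording.wav"
PIDFILE="$STATE_DIR/recorder.pid"
MODEL="${STT_MODEL:-$HOME/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin}"

start_recording() {
  mkdir -p "$STATE_DIR"
  # whisper.cpp works on 16 kHz mono audio, so record in that format.
  pw-record --rate 16000 --channels 1 "$WAV" &
  echo "$!" > "$PIDFILE"
}

stop_and_transcribe() {
  [ -f "$PIDFILE" ] || { echo "not recording" >&2; return 1; }
  pid="$(cat "$PIDFILE")"
  rm -f "$PIDFILE"
  kill "$pid" 2>/dev/null || true
  # Give the recorder up to ~2 s to exit and finalize the WAV file.
  tries=0
  while kill -0 "$pid" 2>/dev/null && [ "$tries" -lt 20 ]; do
    sleep 0.1
    tries=$((tries + 1))
  done
  # -nt drops timestamps, -np suppresses progress output (assumed flags).
  text="$(whisper-cli -m "$MODEL" -f "$WAV" -nt -np)"
  # Strip leading whitespace from the transcript.
  text="${text#"${text%%[![:space:]]*}"}"
  if [ -n "$text" ]; then
    # "wtype -" types whatever arrives on stdin.
    printf '%s' "$text" | wtype -
  fi
}
```

Keeping the recorder's PID in a file is what lets `start` and `stop` run as two separate invocations, which is the shape a press/release keybinding needs.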

Features

Push to Talk: Start/stop recording with simple commands
Local Processing: Uses whisper.cpp for fast, offline transcription
Wayland Native: Types transcribed text using wtype
Configurable: Model path and notification timeout via environment variables
Lightweight: Minimal dependencies, no cloud services

Installation

Via Home Manager Module (Recommended)

See stt-ptt Home Manager Module for the recommended setup with automatic model download.

Via Overlay

# Assumes the m3ta-nixpkgs overlay is already applied, so pkgs.stt-ptt exists
{pkgs, ...}: {
  home.packages = [pkgs.stt-ptt];
}

Direct Reference

# `inputs` must be passed to the module (e.g. via extraSpecialArgs)
{pkgs, inputs, ...}: {
  home.packages = [
    inputs.m3ta-nixpkgs.packages.${pkgs.stdenv.hostPlatform.system}.stt-ptt
  ];
}

Usage

Basic Commands

# Start recording
stt-ptt start

# Stop recording and transcribe
stt-ptt stop

Keybinding Setup

The tool is designed to be bound to a key (e.g., hold to record, release to transcribe).

Hyprland

# In your Home Manager configuration
wayland.windowManager.hyprland.settings = {
  bind = [
    # Press Super+V to start, release to stop and transcribe
    "SUPER, V, exec, stt-ptt start"
  ];
  bindr = [
    # Release trigger
    "SUPER, V, exec, stt-ptt stop"
  ];
};

Or in hyprland.conf:

bind = SUPER, V, exec, stt-ptt start
bindr = SUPER, V, exec, stt-ptt stop

Sway

# Hold to record, release to transcribe
bindsym --no-repeat $mod+v exec stt-ptt start
bindsym --release $mod+v exec stt-ptt stop

i3 (X11 - requires xdotool instead of wtype)

Note: stt-ptt uses wtype, which is Wayland-only. For X11, you would need to modify the script to use xdotool.
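One way such a modification might look is a small helper that picks the typing backend from the session type. This is a hypothetical sketch, not part of stt-ptt: the function name `type_text` is made up for illustration, and it assumes xdotool's `type --file -` form for reading text from stdin.

```shell
# Hypothetical helper (not part of stt-ptt): choose a typing backend
# based on the current session type.
type_text() {
  case "${XDG_SESSION_TYPE:-}" in
    wayland)
      # wtype reads the text to type from stdin when given "-".
      printf '%s' "$1" | wtype -
      ;;
    x11)
      # xdotool reads the text to type from stdin with "--file -".
      printf '%s' "$1" | xdotool type --file -
      ;;
    *)
      echo "unsupported session type: ${XDG_SESSION_TYPE:-unset}" >&2
      return 1
      ;;
  esac
}
```

Passing the text over stdin rather than as an argument avoids trouble with transcripts that begin with a dash.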

Environment Variables

| Variable | Description | Default |
|---|---|---|
| `STT_MODEL` | Path to Whisper model file | `~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin` |
| `STT_NOTIFY_TIMEOUT` | Notification timeout in ms | `3000` |
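A wrapper script would typically resolve these variables with shell default expansion. The sketch below shows that pattern; the function name `resolve_stt_config` is hypothetical, and the real script may resolve its settings differently.

```shell
# Hypothetical helper: print the effective model path and notification
# timeout, falling back to the documented defaults when unset.
resolve_stt_config() {
  # Use $HOME rather than "~": a tilde inside a quoted default is not expanded.
  model="${STT_MODEL:-$HOME/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin}"
  timeout_ms="${STT_NOTIFY_TIMEOUT:-3000}"
  printf '%s\n%s\n' "$model" "$timeout_ms"
}
```

To override a setting for a single invocation, prefix the command, for example `STT_MODEL=/path/to/ggml-base.en.bin stt-ptt stop`.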

Requirements

whisper-cpp: Speech recognition engine
wtype: Wayland text input (Wayland compositor required)
libnotify: Desktop notifications
pipewire: Audio recording
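When installing outside the Nix package, a quick preflight check can confirm that the required tools are on `PATH`. The helper below is a sketch; `missing_tools` is a made-up name, and the binary names in the usage comment (`whisper-cli`, `notify-send`, `pw-record`) are assumptions about what each package provides.

```shell
# Print each command from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (assumed binary names):
#   missing_tools whisper-cli wtype notify-send pw-record
# Empty output means every tool was found.
```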

Model Setup

Download a Whisper model from HuggingFace:

# Create model directory
mkdir -p ~/.local/share/stt-ptt/models

# Download model (example: large-v3-turbo)
curl -L -o ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin

Or use the Home Manager module which handles this automatically.

Available Models

| Model | Size | Quality | Speed |
|---|---|---|---|
| `ggml-tiny` / `ggml-tiny.en` | 75MB | Basic | Fastest |
| `ggml-base` / `ggml-base.en` | 142MB | Good | Fast |
| `ggml-small` / `ggml-small.en` | 466MB | Better | Medium |
| `ggml-medium` / `ggml-medium.en` | 1.5GB | High | Slower |
| `ggml-large-v3-turbo` | 1.6GB | High | Fast |
| `ggml-large-v3` | 2.9GB | Highest | Slowest |

Models ending in .en are English-only and slightly faster for English text.
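To fetch one of the other models by name, the download command from Model Setup can be generalized. This sketch assumes every model in the table is published under the same URL pattern as `ggml-large-v3-turbo.bin`; the function names and the `MODEL_DIR` variable are hypothetical.

```shell
# Assumption: all models share the URL pattern used for large-v3-turbo.
MODEL_BASE_URL="https://huggingface.co/ggerganov/whisper.cpp/resolve/main"
MODEL_DIR="${MODEL_DIR:-$HOME/.local/share/stt-ptt/models}"

# Build the download URL and the local path for a model name.
model_url()  { printf '%s/%s.bin\n' "$MODEL_BASE_URL" "$1"; }
model_path() { printf '%s/%s.bin\n' "$MODEL_DIR" "$1"; }

fetch_model() {
  mkdir -p "$MODEL_DIR"
  # --fail avoids saving an HTML error page as the model file.
  curl -L --fail -o "$(model_path "$1")" "$(model_url "$1")"
}

# Usage: fetch_model ggml-base.en
# Then point STT_MODEL at the path printed by: model_path ggml-base.en
```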

Platform Support

Linux with Wayland (primary)
Requires PipeWire for audio
X11 not supported (wtype is Wayland-only)

Build Information

Version: 0.1.0
Type: Shell script wrapper
License: MIT

Troubleshooting

Model Not Found

Error: `Error: Model not found at /path/to/model`

Solution: Download a model or use the Home Manager module:

mkdir -p ~/.local/share/stt-ptt/models
curl -L -o ~/.local/share/stt-ptt/models/ggml-large-v3-turbo.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin

No Audio Recorded

Solution: Ensure PipeWire is running:

systemctl --user status pipewire
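If PipeWire is running but recordings still come out empty, a short manual test recording can help narrow things down. The helper below is a rough sketch: `wav_has_audio` is a made-up name, and the 44-byte threshold assumes a canonical PCM WAV header, which may not match every recorder's output exactly.

```shell
# Hypothetical check: succeed if the file holds more than an (assumed)
# 44-byte WAV header, i.e. some audio data was actually written.
wav_has_audio() {
  [ -f "$1" ] || return 1
  size="$(wc -c < "$1")"
  [ "$size" -gt 44 ]
}

# Usage (records for about 3 seconds, then checks the result):
#   timeout 3 pw-record /tmp/stt-test.wav
#   wav_has_audio /tmp/stt-test.wav && echo "audio captured"
```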

Text Not Typed

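Solution: Ensure you're running a Wayland session and that your compositor accepts input from wtype (it relies on the virtual-keyboard protocol, which not every compositor implements):

# Check if running on Wayland
echo "$XDG_SESSION_TYPE"  # Should print "wayland"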

Slow Transcription

Solution: Use a smaller model or enable GPU acceleration. With the Home Manager module, select a smaller model:

cli.stt-ptt = {
  enable = true;
  model = "ggml-base.en";  # Smaller, faster model
};

Or with GPU acceleration:

cli.stt-ptt = {
  enable = true;
  # Choose one:
  whisperPackage = pkgs.whisper-cpp-vulkan;  # Vulkan (pre-built)
  # whisperPackage = pkgs.whisper-cpp.override { cudaSupport = true; };  # NVIDIA
  # whisperPackage = pkgs.whisper-cpp.override { rocmSupport = true; };  # AMD
};

stt-ptt Home Manager Module - Module documentation
Adding Packages - How to add new packages
