dicto

Offline voice dictation for Linux. Press a global hotkey, speak, and the recognized text lands on your clipboard, ready to paste anywhere with Ctrl+V.

Features

Offline. Uses whisper.cpp locally — no audio leaves the machine.
GPU-accelerated. Vulkan backend on any modern GPU (NVIDIA / AMD / Intel). Inference for short phrases finishes in 1–3 seconds.
Global hotkey. Registered via xdg-desktop-portal GlobalShortcuts (default: Ctrl+Alt+Space, rebindable in system settings).
Tray icon. Shows the current state: idle, recording, or processing.
Sound cues on each state transition (disable with --no-sounds).
Auto language detection between English and Russian (and 97 other whisper languages).
Clipboard-only output — no keystroke synthesis, no ydotool / uinput setup. You paste with Ctrl+V.
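
The sound cues are described under How it works as short sine-wave tones. As a rough illustration of what such a tone looks like as raw samples, here is a minimal std-only sketch. It does not use `rodio`, and the function name `sine_tone` and all of its parameters are hypothetical, not taken from the dicto source.

```rust
use std::f32::consts::TAU;

/// Generates a mono sine tone as f32 samples in [-amplitude, amplitude].
/// Hypothetical helper for illustration; dicto itself plays tones via `rodio`.
pub fn sine_tone(freq_hz: f32, duration_ms: u32, sample_rate: u32, amplitude: f32) -> Vec<f32> {
    // Number of samples needed to cover the requested duration.
    let total = (sample_rate as u64 * duration_ms as u64 / 1000) as usize;
    (0..total)
        .map(|n| {
            // Time of sample n in seconds.
            let t = n as f32 / sample_rate as f32;
            amplitude * (TAU * freq_hz * t).sin()
        })
        .collect()
}
```

Calling `sine_tone(880.0, 100, 48_000, 0.2)` would yield 100 ms of a quiet 880 Hz tone. Using a different frequency per transition is one way to make the cues distinguishable by ear; the actual frequencies dicto uses are not stated here.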

Platforms

Linux on Wayland: GNOME 45+ or KDE Plasma.
Tested on Ubuntu 25.10.
Windows port planned.

Install (Debian/Ubuntu)

Download the latest .deb from Releases and install it:

sudo apt install ./dicto_0.1.0-1_amd64.deb

The whisper large-v3-turbo model (~1.6 GB) is bundled inside the package — no extra download needed. System dependencies (libasound2, libvulkan1, systemd, xdg-desktop-portal, …) are pulled in by apt automatically.

Usage

Launch dicto from your application menu, or run:

dicto-daemon

Then press Ctrl+Alt+Space to start recording and press it again to stop. The transcript is copied to your clipboard, so you can paste it anywhere with Ctrl+V.
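
The toggle behaviour and the three tray states (idle, recording, processing) can be read as a small state machine. The sketch below is one plausible way to model it. The type and function names are hypothetical, and ignoring a hotkey press while processing is an assumption rather than documented behaviour.

```rust
/// The three states the tray icon reports.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Idle,
    Recording,
    Processing,
}

/// Events that move the daemon between states (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    /// The global hotkey was pressed.
    Hotkey,
    /// Transcription finished and the text was copied to the clipboard.
    TranscriptReady,
}

/// Returns the next state for a given event.
/// Events that do not apply to the current state leave it unchanged
/// (assumption: a hotkey press during processing is ignored).
pub fn next_state(state: State, event: Event) -> State {
    match (state, event) {
        (State::Idle, Event::Hotkey) => State::Recording,
        (State::Recording, Event::Hotkey) => State::Processing,
        (State::Processing, Event::TranscriptReady) => State::Idle,
        (unchanged, _) => unchanged,
    }
}
```

Keeping the transition logic in a pure function like this makes it easy to drive the tray icon and sound cues from a single place: whenever `next_state` returns a value different from its input, a transition has occurred.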

On GNOME, the shortcut can be rebound in Settings → Keyboard → View and Customize Shortcuts (look for the dicto entry).

Switching models

To switch models, fetch a different one system-wide (requires sudo):

sudo dicto-fetch-model small         # smaller, faster, lower quality (~466 MB)
sudo dicto-fetch-model medium        # balanced (~1.5 GB)
sudo dicto-fetch-model large-v3      # highest quality, slower (~2.9 GB)
sudo dicto-fetch-model large-v3-turbo  # default — quality close to large-v3, much faster

Or fetch a model for the current user only (no sudo; it downloads to ~/.local/share/dicto/models/):

dicto-fetch-model medium

The daemon picks the per-user model first, falling back to the system-wide one.
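
The lookup order above can be sketched as a short function. This is an illustration under assumptions, not dicto's actual code: the function name `resolve_model` is hypothetical, the `ggml-<name>.bin` file name follows the whisper.cpp convention and may not match what `dicto-fetch-model` writes, and the system-wide directory is passed in by the caller because its location is not stated here.

```rust
use std::path::{Path, PathBuf};

/// Returns the first existing model file, checking the per-user directory
/// before the system-wide one.
///
/// Assumption: models are stored as `ggml-<name>.bin` (whisper.cpp naming).
pub fn resolve_model(user_dir: &Path, system_dir: &Path, name: &str) -> Option<PathBuf> {
    let file_name = format!("ggml-{name}.bin");
    // Order matters: the per-user directory wins when both contain the model.
    [user_dir, system_dir]
        .iter()
        .map(|dir| dir.join(&file_name))
        .find(|candidate| candidate.is_file())
}
```

With this shape, a per-user download shadows the bundled model without touching system files, and removing the per-user file restores the system-wide one.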

Build from source

Required system packages (a Rust toolchain with cargo is also needed):

sudo apt install build-essential clang cmake libasound2-dev libvulkan-dev glslc

Then:

cargo build --release

Build a .deb:

cargo install cargo-deb
cargo deb --no-build   # uses target/release/dicto built above

How it works

| Concern | Solution |
| --- | --- |
| Audio capture | `cpal` on a dedicated OS thread |
| Speech recognition | `whisper-rs` (whisper.cpp) with `use_gpu = true` via Vulkan |
| Global hotkey | `ashpd` + `xdg-desktop-portal` GlobalShortcuts |
| Tray icon | `ksni` (StatusNotifierItem) |
| Sound cues | `rodio` generating short sine-wave tones |
| Clipboard | `arboard` |
| GNOME `app_id` resolution | Launcher wraps the binary in `systemd-run --scope` with a name the portal can parse from `/proc/PID/cgroup` |
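
To make the last row more concrete, the sketch below shows how an application ID could be recovered from the contents of `/proc/PID/cgroup`. It assumes cgroup v2 and a scope name of the form `app-<launcher>-<app_id>-<random>.scope`, following the systemd naming convention for desktop applications. The exact unit name dicto's launcher uses and the portal's real parsing rules are not given here and may differ; `app_id_from_cgroup` is a hypothetical name.

```rust
/// Extracts an application ID from the text of `/proc/<pid>/cgroup`.
///
/// Assumptions: cgroup v2 (a single `0::<path>` line) and a final path
/// component shaped like `app-<launcher>-<app_id>-<random>.scope`, where
/// the launcher and the random suffix contain no dashes.
pub fn app_id_from_cgroup(cgroup: &str) -> Option<String> {
    // cgroup v2 exposes one unified hierarchy, prefixed with "0::".
    let path = cgroup.lines().find_map(|line| line.strip_prefix("0::"))?;
    // The unit name is the last path component.
    let unit = path.rsplit('/').next()?;
    let inner = unit.strip_prefix("app-")?.strip_suffix(".scope")?;
    // Drop the leading launcher segment and the trailing random suffix.
    let (_launcher, rest) = inner.split_once('-')?;
    let (app_id, _random) = rest.rsplit_once('-')?;
    if app_id.is_empty() {
        None
    } else {
        Some(app_id.to_string())
    }
}
```

For a process in a scope named `app-dicto-org.example.Dicto-4242.scope` (a made-up example), this returns `org.example.Dicto`. A process started outside such a scope, for instance directly from a terminal, has no matching unit name, which is presumably why the launcher wrapper is needed.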

License

MIT
