How do I start dictating with Murmly?

Hold Ctrl+Shift+Space, speak, and release. Murmly inserts clean, punctuated text at your cursor in any app, usually in under 300 milliseconds.

How do I run a voice command?

Hold Ctrl+Shift+Enter and speak the instruction (for example, "reply to Sarah that it sounds good"). Murmly previews the action and runs it, with a 60-second undo.

How many voice commands does Murmly support?

Murmly supports 100+ voice command verbs across email, calendar, messaging, tasks, documents, spreadsheets, web search and more.

DOCUMENTATION

Everything Murmly can do.

Murmly is a voice-first work operating system for Windows (macOS in beta). Hold one hotkey to dictate; hold another to command. Speech is transcribed locally in under 300 milliseconds — your audio never leaves your machine. From there it can act across 50+ apps, run on a schedule, react to events, and remember what matters to you. This is the full surface.

<300ms · Speech to text, locally
100+ · Voice commands
50+ · Native integrations
5 · Languages

Install Murmly

Murmly is an ~8 MB desktop app for Windows 10 (1803+) and Windows 11; a macOS build is in beta. WebView2 is required and ships pre-installed on Windows 11. Download the installer, run it (about 90 seconds), then sign in and pick your hotkeys.

01 Download Murmly for Windows (no admin rights needed for a per-user install).
02 On first run Murmly downloads its local speech model (Parakeet / Whisper) once.
03 Sign in, grant microphone access, and confirm your hotkeys. Dictation then works fully offline.

macOS: the first launch walks you through Accessibility, Microphone and Automation permissions. Insertion uses the AX API with an NSPasteboard + Cmd+V fallback.

Plans, trial & limits

Every Murmly account starts with a 14-day free trial — no credit card. During the trial you can use the app at the Free limits below. When the 14 days are up, Murmly stops working until you upgrade to Pro — there is no ongoing free tier.

Free trial · 14 days

30 minutes of dictation
25 voice commands per day
On-device models (no cloud)
Stops after 14 days

Pro

Unlimited dictation & commands
Premium cloud models
Schedules, triggers & automation

Upgrade anytime at murmly.io/pricing or from Settings → Account in the app. Cloud commands and automation are Pro-only and stay off during the trial.

Your first dictation

Click into any text field — Slack, Gmail, your IDE, a Google Doc — then hold the dictation hotkey and speak. Release the key and the cleaned-up text appears at your cursor, usually in under 300 milliseconds. Punctuation, capitalization and filler-word removal happen for you.

hold Ctrl + Shift + Space

"hey sarah, friday works for the review — i'll send the deck tonight"

→ inserted, punctuated, sentence-cased

Hotkeys & repeat

Murmly uses global hold-to-talk hotkeys — press and hold while speaking, release to execute. Every one is rebindable in Settings.

Ctrl + Shift + Space: Dictation. Insert text at the cursor.

Ctrl + Shift + Enter: Commands. Act across your apps.

Ctrl + Shift + H: Hand Canvas. Summon your glanceable workspace.

Double-tap: Repeat your last action instantly.

Onboarding & your profile

A short wizard sets your appearance theme, privacy posture, hotkeys, voice model and accounts. Murmly keeps a local SQLite profile that learns your vocabulary, your frequent corrections and the contacts and actions you use most — so "Sara" stops becoming "Sarah" and your common commands get faster. Nothing in this profile leaves your device.

Meet your secretary

The wizard ends with a short, spoken interview. Your secretary asks about your role, your work and projects, your goals, the people who matter, how you like to communicate, your schedule preferences and even your hobbies — a real back-and-forth, not a form. After each answer it quietly extracts the durable facts and saves them to your local memory, so every later command, briefing and conversation already knows who you are.

It rides the same voice plumbing as everything else — speak your answers, interrupt mid-sentence, skip a topic. You can re-run it any time, and the dashboard nudges you about anything it still doesn't know.

How dictation works

Audio is captured via WASAPI and transcribed on your own CPU/GPU using NVIDIA Parakeet (with a faster-whisper fallback). The transcript is lightly cleaned — capitalization, punctuation, filler removal — then inserted with UI Automation, falling back to SendInput and clipboard paste so it works in any field.

Hotkey → WASAPI capture → local Whisper/Parakeet → cleanup → insert at cursor
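
The cleanup stage of this pipeline can be pictured with a minimal Python sketch. It is an illustration only: the filler list, the function name `clean_transcript` and the rules themselves are assumptions, not Murmly's actual cleanup logic, which this page does not specify.

```python
import re

# Hypothetical filler list; the real cleanup rules are not documented here.
FILLERS = ("um", "uh", "erm")

def clean_transcript(raw: str) -> str:
    """Drop filler words, collapse whitespace, capitalize, and end with punctuation."""
    pattern = r"\b(?:" + "|".join(FILLERS) + r")\b,?\s*"
    text = re.sub(pattern, "", raw, flags=re.IGNORECASE)
    text = re.sub(r"\s+", " ", text).strip()
    if not text:
        return ""
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."
    return text
```

Capture, transcription and insertion are left out; the sketch covers only the text-in, text-out step between the speech model and the cursor.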

App-aware cleanup

Murmly detects the focused app and field type, so the same words land differently where it matters: a casual tone in Slack, formal in Gmail, code-style in your editor, plain text in a terminal. "Um" and false starts never reach the screen.

Rewrite & translate in place

Select any text, hold the command hotkey, and tell Murmly what to do with it — rephrase, fix the grammar, make it more formal, or translate it. The selection is read, transformed, and pasted back over itself.

"make this more concise and professional"

→ selection rewritten in place · 60-second undo

The 60-second undo

Every insert and every mutating command can be rolled back for 60 seconds. Tap the hotkey again, or hit the undo toast, to reverse the last action — a sent reply, a created event, a moved deal. Nothing is one-way.

How commands work

In command mode you speak intent, not syntax. Murmly parses what you mean into a structured action, resolves contacts and references against your data, and executes through native integrations. Intent parsing runs on a cloud model by default or fully on-device with the local Murmly Voice model — your choice, set in the wizard.

"open Word" · "search the web for the Q3 CPI print" · "start a Teams call with Owais"

→ parsed → risk-checked → executed → 60-second undo

The agent loop

Compound requests run as a short multi-step agent loop — each step feeds the next, with a live indicator showing turn N of M. If a step fails mid-loop, the work already done is preserved and surfaced rather than silently lost.

"Reply to Usama's last email saying I'll send the proposal Monday."

→ find contact → search email → draft reply → preview → send
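
One way to picture the loop is a list of steps where each result feeds the next, and a failure returns whatever was already completed. The sketch below assumes hypothetical step functions and a made-up result shape; it is not Murmly's agent runtime.

```python
from typing import Any, Callable

Step = Callable[[Any], Any]

def run_agent_loop(steps: list[tuple[str, Step]], initial: Any = None) -> dict:
    """Run steps in order, feeding each result into the next step.

    On failure, stop and return the work completed so far instead of discarding it.
    """
    completed: list[tuple[str, Any]] = []
    value = initial
    for turn, (name, step) in enumerate(steps, start=1):
        print(f"turn {turn} of {len(steps)}: {name}")  # live progress indicator
        try:
            value = step(value)
        except Exception as exc:
            return {"ok": False, "failed_step": name,
                    "error": str(exc), "completed": completed}
        completed.append((name, value))
    return {"ok": True, "completed": completed}
```

If the second of five steps raises, the returned `completed` list still holds the first step's output, which is the "preserved and surfaced" behavior described above.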

Risk tiers & confirmation

Every command has a risk tier — a single source of truth that decides whether it runs immediately or waits for you.

LOW

Read-only or reversible — search, summarize, open app, read email. Auto-executes with a 5-second undo.

MEDIUM

Creates or changes things — calendar event, reply, Slack message, move a deal. Shows a preview card first.

HIGH

Sends or is hard to reverse — send email, merge a PR. Always confirmed, with a warning, before it happens.
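
As a rough sketch, the tiers behave like a single lookup table consulted before every command. The intent names below and the choice to treat unknown intents as HIGH are assumptions made for illustration, not Murmly's actual registry.

```python
from enum import Enum

class RiskTier(Enum):
    LOW = "low"        # read-only or reversible
    MEDIUM = "medium"  # creates or changes things
    HIGH = "high"      # sends or is hard to reverse

# Hypothetical intent-to-tier table; the intent names are illustrative.
TIER_BY_INTENT = {
    "search_web": RiskTier.LOW,
    "read_email": RiskTier.LOW,
    "create_event": RiskTier.MEDIUM,
    "move_deal": RiskTier.MEDIUM,
    "send_email": RiskTier.HIGH,
    "merge_pr": RiskTier.HIGH,
}

def confirmation_for(intent: str) -> str:
    """Return how a command is gated. Unknown intents default to HIGH (an assumption)."""
    tier = TIER_BY_INTENT.get(intent, RiskTier.HIGH)
    if tier is RiskTier.LOW:
        return "auto-execute"
    if tier is RiskTier.MEDIUM:
        return "preview"
    return "confirm-with-warning"
```

Keeping the mapping in one table is what makes it a single source of truth: the preview card, the warning dialog and any audit entry can all read the same tier.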

The full command catalog

Over a hundred intents across these families. Say them naturally — Murmly maps your phrasing to the right verb and fills in the details from context.

Email

send · reply · reply-all · forward · read · search · summarize inbox · summarize thread · triage

Calendar & meetings

create event · reschedule · cancel · RSVP · find time · join meeting · schedule with · meeting prep · what's next

Documents

create · edit · format · summarize · read aloud · export · share · from template

Spreadsheets & slides

read cell · update range · append row · run formula · create deck · add slide · edit slide

Files & web

find file · open app · search web · open web app · run web action · record · replay

Tasks & issues

create task · set status · comment · assign · add subtask · list my tasks

Code hosts

create issue · close issue · create PR · review PR · merge PR · summarize PR · check CI

CRM

find contact · summarize pipeline · create deal · move stage · add note · log activity

Wikis & Slack

search wiki · read page · create page · append · read channel · DM · set status

Memory & secretary

remember · recall · forget · review learnings · let's talk · weigh in · self-briefing

Schedules & triggers

create schedule · pause · resume · create trigger · list · remind me

System

open app · close app · start call · rewrite selection · insert text · clarify

Email: send, reply, forward

Gmail, Outlook and Hotmail behind one registry, with smart routing — work-domain recipients go out through Outlook, consumer domains through Gmail. Reply-all, forward with a note, cc/bcc, and attachments (by path, "the file I just downloaded", or from your clipboard) all work by voice. Outlook supports true scheduled send; Gmail falls back to a draft.

"Send the proposal PDF I just downloaded to Sarah with subject Q4 proposal."

→ resolve Sarah → attach most-recent file → preview → send
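
The smart-routing rule can be sketched as a lookup on the recipient's domain. The domain list is hypothetical, and treating every non-consumer domain as a work domain is an assumption; the page does not describe the real routing table.

```python
# Hypothetical consumer-domain list, for illustration only.
CONSUMER_DOMAINS = {"gmail.com", "googlemail.com", "yahoo.com", "icloud.com"}

def pick_provider(recipient: str) -> str:
    """Route consumer domains through Gmail and everything else through Outlook."""
    domain = recipient.rsplit("@", 1)[-1].strip().lower()
    return "gmail" if domain in CONSUMER_DOMAINS else "outlook"
```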

Read & summarize

"Summarize my inbox", "read me the last email from Chen", "what's the gist of this thread" — Murmly reads, summarizes and speaks results aloud with on-device-friendly text-to-speech. Search email by sender, subject or content, and chain results into a follow-up action.

Hands-free inbox triage

Say "triage my inbox" and Murmly enters triage mode: it reads each unread email aloud (sender, subject, body) and acts on your spoken decision (reply, delete, archive, snooze, next, back, or stop), then gives a tally at the end. No clicking; just talk through your inbox. Decisions are parsed locally, for a turnaround of about 200 ms.

reply … · delete · archive · snooze tomorrow · next · back · stop

Calendar & meetings

Microsoft and Google calendars under one interface. Create, reschedule and cancel events; RSVP to invitations; "join my next meeting" opens the Teams / Zoom / Meet link for whatever's soonest; "schedule 30 minutes with Sarah and Owais tomorrow" creates the event with an online meeting attached and invites everyone. Ask "what's next?" or for a meeting-prep briefing before you walk in.

Find time across calendars

"Find time with Chen this week" fans out free/busy lookups across every connected calendar, merges the busy intervals, respects your business hours and time-of-day preference, avoids back-to-back meetings where you ask it to, scores the gaps, and reads back the best three slots, ready to chain straight into scheduling.

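The merge-and-gap step described above is a classic interval problem. The sketch below is a simplified illustration with assumed function names: it merges busy intervals and returns the earliest gaps inside business hours, and it replaces Murmly's scoring (which is not documented here) with plain earliest-first ordering.

```python
from datetime import datetime, timedelta

Interval = tuple[datetime, datetime]

def merge_busy(busy: list[Interval]) -> list[Interval]:
    """Merge overlapping or touching busy intervals from all calendars."""
    merged: list[Interval] = []
    for start, end in sorted(busy):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def free_slots(busy: list[Interval], day_start: datetime, day_end: datetime,
               length: timedelta, limit: int = 3) -> list[Interval]:
    """Return up to `limit` earliest slots of `length` that fit between busy blocks."""
    slots: list[Interval] = []
    cursor = day_start
    # A zero-length sentinel at day_end closes the final gap of the day.
    for start, end in merge_busy(busy) + [(day_end, day_end)]:
        gap_end = min(start, day_end)
        if gap_end - cursor >= length:
            slots.append((cursor, cursor + length))
        cursor = max(cursor, end)
    return slots[:limit]
```

A fuller version would score each gap against time-of-day preferences and buffer rules before picking the top three.
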
Documents

Microsoft Word (local via COM, or OneDrive via Graph) and Google Docs behind one interface. Create, edit (insert / replace / add headings / tables), format, summarize, read aloud, export to PDF/HTML/TXT/Markdown, and share. Generate whole documents from a built-in template library — proposals, invoices, contracts, meeting notes, one-pagers — by dictating the slots to fill.

Spreadsheets

Local Excel and Google Sheets. Read a cell or range, update cells, append rows, run a formula and hear the result, and export to PDF. "Sum column C and email Sarah the total" chains the read into a send through the agent loop.

Slides

Generate a PowerPoint deck from scratch or a Google Slides presentation from a spoken outline — bullets, sections and speaker notes are parsed into real slides with the right layouts. Add or edit individual slides by voice afterward.

Find files

A background indexer keeps a private, on-device full-text index of your Downloads, Desktop, Documents and OneDrive. "Find the proposal I worked on yesterday", "where's the Q4 spreadsheet", "the PDF Sarah sent last week" — results return in milliseconds and can be attached straight to an email in the same breath.

Browser automation

A policy-gated, audited Playwright controller drives real browser sessions with pre-built scripts for Asana, Salesforce, HubSpot, Jira and LinkedIn. Record your own flow once, then replay it by voice — Murmly detects the variable fields (names, amounts, dates) and substitutes them on each run. Every navigation is audited as a hashed URL, never the destination in clear text.

Browser agent & deep research (beta)

Murmly has its own browser window. Sign in once to ChatGPT, Google and any other sites you want it to use; it stays signed in, so the agent picks up where you left off. (It runs in a separate browser from your everyday Chrome because Windows blocks apps from reusing your main browser's logins. An optional extension lets you lend your real Chrome session when you need it.)

Give it a free-form task and it works the page like you would: it reads the screen, decides the single next action, clicks or types, and repeats — streaming every step live so you can watch. Switch to other windows; it keeps going and you'll see the progress when you come back.

"Research the three best note-taking apps for teams and email me a comparison."

→ opens browser → visits sources → reads each → drafts the comparison → emails it to you

Deep research mode removes the early step limit and keeps going across many pages until the task is genuinely done — best for big research and planning jobs. It takes longer and uses more of your cloud quota, and a Stop button ends it whenever you like. Like everything else, it runs through your cloud/local choice, the policy gate and the audit log.

Tasks & issue trackers

Linear, Asana, ClickUp and Monday. Create tasks, change status, comment, assign, add subtasks and list what's assigned to you — "create a Linear ticket to fix the login bug and assign it to me", "move MUR-241 to In Progress".

Linear · Asana · ClickUp · Monday

Code hosts

GitHub and GitLab under one interface. Create / comment on / close issues, open and review pull requests, check CI status, summarize a PR or recent builds, and merge — with merge gated as a high-risk, always-confirmed action. "Summarize the open PR on the auth repo and tell me if CI is green."

CRM

HubSpot, Salesforce and Pipedrive. Find a contact, summarize your pipeline, create a deal, move a deal to a stage, log a call or meeting, and attach notes. "Find Sarah at Acme and create a 50K deal for her in the proposal stage."

Meeting platforms

Zoom and Webex. Schedule meetings, list upcoming, and — the useful part — fetch and search recording transcripts, summarize a meeting, and extract structured action items that chain straight into Linear tickets. A real-time meeting co-pilot with live suggestions and recaps is in the works.

Wikis

Notion and Confluence. Search across every connected workspace, read and summarize a page, create a page, and append to one. Dictated markdown is converted to each provider's native block or storage format. Undoing a create moves the new page to the trash, so nothing is truly destructive.

Slack

Beyond sending a message: read a channel, summarize a thread, search across channels and people, DM several people at once (a single undo unsends them all), and set your status with an auto-expiry.

WhatsApp

Murmly reads and summarizes your WhatsApp chats by voice: "summarize my chat with Rudi" pulls the recent messages and gives you the gist, and those summaries feed into your daily and weekly briefings. It can send messages too, typed through a real, human-paced keystroke path rather than a synthetic click. Connect it once in Settings; it runs through Murmly's own browser session.

Long-term memory

Murmly remembers facts about you — relationships, projects, preferences, goals and boundaries — in a strictly local knowledge base. "Remember Sarah is my engineering manager." "What do you know about me?" "Forget what you know about my old job." These memories are injected into the command parser, so Murmly can push back: tell it you never take meetings before 9am and it will object when you try to schedule a 7am call.

Fact · Preference · Project · Relationship · Goal · Boundary

Proactive learning

In the background, Murmly proposes things it has noticed — your top contacts, the folders you live in, the commands you repeat, patterns from your conversations. "Review what you've learned" lists the suggestions with a confidence score; "approve" promotes one to a confirmed memory, "dismiss" drops it. It only ever suggests — you stay in control of what it commits.

Talk mode

Say "let's talk" and Murmly opens a real conversation — your next utterances bypass the command parser and flow into a chat that knows your memory, your recent activity, and the last dozen turns. On speakers, acoustic echo cancellation lets you interrupt mid-sentence (barge-in), no headphones required. Safe read actions ("summarize that thread") still execute inside the conversation. "Stop", "thanks, that's all" or "talk later" ends it.

Decision support

"Should I take the 7am call with Sarah?" "Weigh in on whether to push the launch." Murmly pulls your memory, recent activity and cross-source synthesis into a reasoned take — it argues a position rather than punting back to you. Authority stays at suggest-only: it advises, you decide.

Briefings & reminders

Murmly emails you an organized day-ahead, week-ahead or week-recap briefing, pulling sent and received mail, WhatsApp, calendar, reminders and live data connections, then joining the dots ("someone who emailed you is on today's calendar"). Set voice reminders too: "remind me to call Owais in five minutes" fires with a spoken nudge, exactly once, even across restarts.

Scheduled tasks

Set work to run on a clock by voice — "every Monday at 9am", "daily at 8", "on the 15th of every month", "morning briefing between 8 and 9 on weekdays", or a raw cron expression. A background daemon runs them whether or not the main window is open, re-parses the command at fire time, and executes it.

"Every weekday at 8am, summarize my inbox and email me the digest."

→ schedule created · runs in the daemon · pause / resume / delete by voice
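
For readers unfamiliar with cron, the sketch below shows how a few spoken schedules could be written as standard five-field cron expressions, together with a deliberately simplified matcher. The phrase-to-expression mapping and the matcher are illustrative assumptions; Murmly's own parser is not documented here, and this matcher ignores step values and cron's special day-of-month/day-of-week rule.

```python
from datetime import datetime

# Assumed translations: minute hour day-of-month month day-of-week (Sunday = 0).
SCHEDULES = {
    "every Monday at 9am": "0 9 * * 1",
    "daily at 8": "0 8 * * *",
    "every weekday at 8am": "0 8 * * 1-5",
}

def field_matches(field: str, value: int) -> bool:
    """Match one cron field supporting '*', single numbers, ranges and comma lists."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = map(int, part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False

def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` (to the minute) satisfies all five fields."""
    minute, hour, dom, month, dow = expr.split()
    return (field_matches(minute, when.minute)
            and field_matches(hour, when.hour)
            and field_matches(dom, when.day)
            and field_matches(month, when.month)
            and field_matches(dow, when.isoweekday() % 7))  # map Sunday from 7 to 0
```

A daemon could call `cron_matches` once per minute for each stored schedule and re-parse the saved command when it returns True.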

Event triggers

React to the world, not just the clock: when an email arrives from a VIP, before a meeting, on a Slack mention, when a file lands in a folder, or on an inbound webhook. Each trigger has debounce and per-day caps so it can't run away, and inbound webhooks are HMAC-verified with payload placeholders expanded into the command.
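
HMAC verification and placeholder expansion for an inbound webhook might look like the following sketch. SHA-256 as the digest, a hex-encoded signature and the `{key}` placeholder syntax are all assumptions; the page states only that webhooks are HMAC-verified and that payload placeholders are expanded.

```python
import hashlib
import hmac
import re

def verify_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 signature over the raw request body in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))

def expand_placeholders(template: str, payload: dict) -> str:
    """Replace {key} placeholders with payload values; unknown keys are left as-is."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        return str(payload[key]) if key in payload else match.group(0)
    return re.sub(r"\{(\w+)\}", substitute, template)
```

Verification runs against the raw bytes before any JSON parsing, so a tampered body is rejected before its contents can reach the command template.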

Long-running agent jobs

Bigger asks ("research three competitors and draft a comparison") run as background jobs — up to 30 turns, with a live progress card, cancellation, and restart recovery. A quality gate pauses for your review before anything risky or expensive, presenting the proposed actions for approval.

Cost caps & circuit breakers

Autonomy is fenced. Daily ($5) and monthly ($50) cost caps are checked before any autonomous run. Circuit breakers watch outcomes: five consecutive failures auto-disable a schedule or trigger, and repeated budget or runtime-cap hits raise a warning. You reset breakers explicitly. Every autonomous action is written to the tamper-evident audit log with a triggered-by chain.
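
A minimal sketch of the two fences, using the limits quoted above. The class and function names are hypothetical, and including the estimated cost of the next run in the cap check is an assumption.

```python
from dataclasses import dataclass

DAILY_CAP_USD = 5.00     # from the text above
MONTHLY_CAP_USD = 50.00  # from the text above
FAILURE_LIMIT = 5        # consecutive failures before auto-disable

@dataclass
class Automation:
    name: str
    enabled: bool = True
    consecutive_failures: int = 0

    def record(self, success: bool) -> None:
        """A success resets the streak; the fifth straight failure trips the breaker."""
        if success:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= FAILURE_LIMIT:
            self.enabled = False

    def reset(self) -> None:
        """Explicit, user-initiated reset of a tripped breaker."""
        self.consecutive_failures = 0
        self.enabled = True

def may_run(job: Automation, spent_today: float, spent_month: float,
            estimate: float) -> bool:
    """Pre-flight check: breaker closed and both cost caps respected."""
    return (job.enabled
            and spent_today + estimate <= DAILY_CAP_USD
            and spent_month + estimate <= MONTHLY_CAP_USD)
```

The breaker never re-enables itself, which matches the explicit-reset behavior described above.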

Gesture control

Toggle hand tracking with Ctrl + Shift + G and drive actions with one- and two-hand gestures — palms-apart to split windows, fists-rotate to switch desktops, a sweep to dismiss. Bindings are context-aware (a fist means mute in a meeting, volume elsewhere), support macro sequences, and you can train custom gestures. Everything is computed locally from landmarks; the webcam is off whenever tracking is off.

Hand Canvas

A transparent, always-on-top workspace summoned by gesture or Ctrl + Shift + H. It shows the right widgets for the moment — calendar, inbox, blockers, meeting prep, tasks — picked by a curator that reads the time of day and your context. Pinch to focus a widget, sweep to dismiss, drag to rearrange, save layouts as templates, and let IT push managed canvases via MDM.

Live data & widgets

Beyond the built-in calendar, inbox, blockers and meeting-prep widgets, you can wire up your own live data connections — a feed, an API, a number you care about — and pin them as widgets on the dashboard and Hand Canvas. Add alert rules so a widget flags you when a value crosses a threshold, and the secretary folds those signals into your briefings.

Languages

Dictation and commands in English, Spanish, French, German and Mandarin, with native-script display, per-language prompts and voices, and a dedicated Chinese date/time parser. The detected language is locked across a multi-step agent run so it never drifts mid-task.

English · Español · Français · Deutsch · 中文

Mobile & wearables

Companion apps for iOS and Android bring the full voice command surface on the go, with on-device Whisper fallback, driving mode, lock-screen and CarPlay/Android Auto entry points, and bandwidth-aware sync (Opus over cellular, PCM over Wi-Fi). An Apple Watch app handles quick approvals and briefings, and cross-device handoff lets you start on your phone and finish on the desktop.

Voice identity

Optional owner-voice verification gates sensitive actions to you. A speaker-embedding model plus anti-spoofing and a random liveness challenge raise the bar as risk rises, with a PIN fallback and lockout after repeated failures. Embeddings are stored locally and DPAPI-wrapped; the audit log records only a hash, never your voiceprint.

Privacy & local Whisper

Local-first is the default and the moat. Speech is transcribed on-device; audio never leaves your machine. Cloud calls — command parsing or optional cleanup — are opt-in, gated at the HTTP layer, and fail closed: if policy says no network, the cloud client is refused outright, not silently retried. Turn cloud off and dictation stays 100% on-device. Memory, profiles, layouts and templates are local-only.

Read the privacy architecture

Audit log

Every cloud call and every mutating action is written to a hash-chained, tamper-evident audit log — each row's hash folds in the previous one, so any edit is detectable. Email metadata is stored as PII-safe hashes (recipients and subjects are never written in clear text). A built-in verifier confirms the chain is intact and reports the first row that doesn't.
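
The hash-chain idea can be sketched in a few lines. SHA-256, the JSON canonicalization, the all-zero genesis value and the row layout are assumptions for illustration; the page does not specify Murmly's actual log format.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed starting value for the chain

def row_hash(prev_hash: str, entry: dict) -> str:
    """Hash the previous row's hash together with a canonical form of this entry."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append_row(log: list, entry: dict) -> None:
    """Append an entry whose hash folds in the previous row's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"entry": entry, "hash": row_hash(prev, entry)})

def first_broken_row(log: list) -> Optional[int]:
    """Return the index of the first row whose hash does not verify, or None if intact."""
    prev = GENESIS
    for index, row in enumerate(log):
        if row["hash"] != row_hash(prev, row["entry"]):
            return index
        prev = row["hash"]
    return None

def pii_safe(value: str) -> str:
    """Store a digest instead of a clear-text recipient or subject."""
    return hashlib.sha256(value.strip().lower().encode("utf-8")).hexdigest()
```

Because each hash depends on the one before it, editing any entry changes that row's expected hash, and the verifier reports that row as the first mismatch.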

Enterprise & policy

A hierarchical policy resolver (user → group → org → deployment) lets one binary serve prosumer, team, org and air-gapped enterprise from a single code path. Ship signed deployment manifests via MDM, enforce local-only modes, set per-email-account rules (force local parsing, block external forwarding, cap recipients), route model calls through your own keys or gateway, and provision via SCIM. A managed multi-tenant cloud runtime is available for teams who want it hosted.
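
A layered resolver of this kind can be sketched as an ordered merge. The precedence shown here (broader scopes override narrower ones, so the deployment layer wins) and the setting names are assumptions; the page gives the layer order but not the override rule.

```python
# Assumed precedence: later layers override earlier ones, so deployment wins.
LAYERS = ("user", "group", "org", "deployment")

def resolve_policy(policies: dict) -> dict:
    """Merge per-layer settings in order; a later layer overrides an earlier one."""
    effective: dict = {}
    for layer in LAYERS:
        effective.update(policies.get(layer, {}))
    return effective

# Hypothetical settings: the org disables cloud parsing that the user turned on.
example = resolve_policy({
    "user": {"cloud_parsing": True, "theme": "dark"},
    "org": {"cloud_parsing": False},
    "deployment": {"max_recipients": 10},
})
```

Missing layers are simply skipped, which is how one code path can serve both a single user with no org policy and a fully managed deployment.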

Compliance

Built for regulated buyers: SOC 2 Type II evidence automation, a HIPAA BAA-ready variant with PHI detection, FedRAMP Moderate and High baselines, ISO 27001 and StateRAMP coverage, plus GDPR and a DPA. Custom plugins run in platform-native sandboxes (seccomp + cgroups on Linux, Seatbelt on macOS, AppContainer + Job Objects on Windows).

SOC 2 Type II · HIPAA · FedRAMP · ISO 27001 · StateRAMP · GDPR

Integrations & MCP

50+ native integrations — no screen-scraping. Each loads independently over MCP and can be disabled on its own. Connect them with OAuth in Settings, or add your own MCP server with a YAML manifest — declare auth, endpoints, tools and undo, and it shows up as a voice verb.

Gmail · Outlook · Google Calendar · Slack · Linear · Notion · GitHub · GitLab · Jira · Asana · ClickUp · Monday · HubSpot · Salesforce · Pipedrive · Zoom · Webex · Confluence · WhatsApp · Teams · +30 more

See the full list on the integrations page.

Plugins & marketplace

Extend Murmly with third-party plugins — native dylibs, WASM modules, or YAML-declared MCP servers — each Ed25519-signed with a trust level (Anthropic, enterprise, user, marketplace) and granted only the capabilities its manifest declares. Share templates and web-automation scripts through the marketplace, with moderation and verified publishers.

Public API & SDKs

A public HTTPS REST API submits commands, streams session events, reads jobs and templates, and tails the audit log — scoped by OAuth token. TypeScript and Python SDKs are generated against the same wire contract, and a browser extension bridges page context into your runtime over native messaging.

Ready to stop typing?

14-day free trial — 30 dictation minutes & 25 commands/day, then upgrade to Pro. Windows today, macOS in beta.
