speech-to-cli

Background Log Watcher Pattern

Use a Claude Code background agent to monitor system logs in real time and announce anomalies aloud via TTS, invoked as an MCP tool.

What This Is

A background Claude Code agent tails a log source (e.g. journalctl -k -f), pattern-matches for concerning events, and calls the speak MCP tool to deliver audio alerts — all while the main conversation continues working normally.

No HTTP API is involved. The agent calls speak as an MCP tool — the same mechanism Claude Code uses for all its tools. The speech-to-cli MCP server runs as a subprocess of Claude Code, communicating over stdio JSON-RPC.

The main agent can read the watcher's output file at any time to see what was detected and act on it autonomously.

How the Pieces Fit Together

System Architecture

Foreground: Main Claude Code
Your conversation. Spawns the watcher, continues working independently.
  ↓ Agent tool (run_in_background: true)
Background: Watcher Agent
Separate subprocess with own context window.
  • tails log source
  • polls every ~30s
  • speaks on anomaly
  ↓ MCP tool call (stdio JSON-RPC)
MCP Server: speech-to-cli
Routes speak() calls through Azure TTS to PipeWire.
  • mcp_speech.py
  • handle_request()
  • speech_tts.tts()
  ↓ HTTP POST (48kHz audio)
Cloud: Azure TTS
Cognitive Services endpoint returns PCM audio stream.

Audio plays through PipeWire (pw-cat) → your speakers

Data Flow for a speak() Call

When the background agent decides to announce something, here's the exact chain:

1. Agent LLM generates tool_use

The model outputs a structured tool call in its response.

{"name": "mcp__speech-to-cli__speak",
 "input": {"text": "Critical: xHCI controller died",
           "voice": "en-US-DavisNeural"}}

2. Claude Code harness serializes to JSON-RPC

The harness translates the tool call to MCP protocol over stdio.

{"method": "tools/call",
 "params": {"name": "speak", "arguments": {...}}}

3. mcp_speech.py routes to tts()

handle_request() receives on stdin, dispatches to speech_tts.tts().
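
The source does not show the body of handle_request(), so the following is only a minimal sketch of what such a dispatcher could look like. The tool registry, the fake_tts stand-in, and the response text are all hypothetical; the error codes are the standard JSON-RPC 2.0 ones.

```python
import json

SPOKEN: list[tuple[str, str]] = []  # records calls so the sketch can run without audio


def fake_tts(text: str, voice: str = "en-US-DavisNeural") -> None:
    """Hypothetical stand-in for speech_tts.tts(); records instead of speaking."""
    SPOKEN.append((voice, text))


TOOLS = {"speak": fake_tts}  # hypothetical registry: tool name -> callable


def error(req_id, code: int, message: str) -> str:
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "error": {"code": code, "message": message}})


def handle_request(line: str) -> str:
    """Dispatch one JSON-RPC request line and return one response line."""
    req = json.loads(line)
    if req.get("method") != "tools/call":
        return error(req.get("id"), -32601, "Method not found")
    params = req.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        return error(req.get("id"), -32602, "Unknown tool")
    tool(**params.get("arguments", {}))
    # MCP tool results carry a list of content items.
    result = {"content": [{"type": "text", "text": "spoken"}]}
    return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "result": result})
```

A real server would wrap this in a loop that reads one line from stdin, calls handle_request(), and writes the response line to stdout.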

4. _prepare_tts() builds SSML

Constructs the XML payload with voice, rate, and pitch parameters.
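
As an illustration of this step, here is a sketch of an SSML builder. It is not the project's _prepare_tts(); the function name build_ssml and the default rate and pitch values are assumptions. The element layout (speak, voice, prosody) follows the SSML structure Azure accepts.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, voice: str = "en-US-DavisNeural",
               rate: str = "+0%", pitch: str = "+0%",
               lang: str = "en-US") -> str:
    """Return an SSML document for one utterance (hypothetical helper)."""
    # quoteattr() adds the surrounding quotes and escapes attribute values;
    # escape() protects <, > and & inside log text.
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}"
        "</prosody></voice></speak>"
    )
```

Escaping matters here because kernel log lines often contain characters such as & and <, which would otherwise produce invalid XML.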

5. HTTP POST to Azure TTS

Cognitive Services endpoint returns a 48kHz PCM audio stream.
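
A sketch of how such a request could be assembled with the standard library. The source only says the stream is 48kHz PCM, so the 16-bit mono output format, the region and key placeholders, and the function name are assumptions; the request is built but not sent.

```python
import urllib.request


def build_tts_request(ssml: str, region: str, key: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the Azure TTS REST endpoint."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/ssml+xml",
        # Assumed format: headerless 48 kHz, 16-bit, mono PCM.
        "X-Microsoft-OutputFormat": "raw-48khz-16bit-mono-pcm",
        "User-Agent": "log-watcher-sketch",
    }
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


# Sending would look like this (requires a real region and key):
# with urllib.request.urlopen(build_tts_request(ssml, "westus", KEY)) as resp:
#     pcm = resp.read()
```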

6. Audio piped to pw-cat

PipeWire player streams directly to your speakers. You hear it.
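
One way the piping could be done from Python is sketched below. The exact command line the project uses is not given in the source, so the sample format and channel count are assumptions. Depending on the PipeWire version, headerless PCM on stdin may need an additional raw-mode option; check pw-cat --help on your system.

```python
import subprocess


def pw_cat_argv(rate: int = 48000, channels: int = 1,
                sample_format: str = "s16") -> list[str]:
    """Build a pw-cat playback command that reads audio from stdin ("-")."""
    return [
        "pw-cat", "--playback",
        f"--rate={rate}",
        f"--channels={channels}",
        f"--format={sample_format}",
        "-",
    ]


def play_pcm(pcm: bytes) -> None:
    """Feed PCM bytes to pw-cat; raises CalledProcessError if playback fails."""
    subprocess.run(pw_cat_argv(), input=pcm, check=True)
```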

The agent never touches HTTP directly. The MCP protocol abstracts the entire Azure TTS pipeline — the agent just calls speak like any other tool.
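
The JSON-RPC excerpt in step 2 is abbreviated. A full JSON-RPC 2.0 request also carries a "jsonrpc" version and an "id", and the MCP stdio transport sends each message as a single line. The sketch below shows one way to produce such a line; the helper name make_tool_call is hypothetical, and the harness's real implementation is not shown in the source.

```python
import json
from itertools import count

_ids = count(1)  # request ids; any value unique per request would do


def make_tool_call(name: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request as a single line of JSON."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the result is one line.
    return json.dumps(request, separators=(",", ":"))


line = make_tool_call(
    "speak",
    {"text": "Critical: xHCI controller died", "voice": "en-US-DavisNeural"},
)
print(line)
```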

How to Launch

From the main Claude Code conversation, use the Agent tool:

Agent tool call:
  description: "Watch [system] logs and speak alerts"
  name: "my-watcher"
  run_in_background: true
  prompt: |
    You are a [SYSTEM] log monitor. Your job is to tail [LOG_SOURCE]
    for [SPECIFIC EVENTS] and use the speech MCP tool to announce
    anything concerning.

    Steps:
    1. Use the speak tool to announce: "[System] log watcher online."
    2. Run [LOG_COMMAND] in the background via Bash
    3. Every ~30 seconds, check the output for anomalies
    4. On anomaly: announce via speak tool with a brief summary
    5. On normal: stay quiet — only speak when something is wrong
    6. Keep monitoring indefinitely

    Use the mcp__speech-to-cli__speak tool for announcements.
    Keep them short and clear.

What the Agent Actually Does

The background agent is a separate Claude Code subprocess with its own context window. It has access to the same tools as the main conversation (Bash, MCP tools, etc.) but runs independently.

1. ToolSearch for "speak speech": MCP tools are deferred, so the agent fetches the schema first
2. speak(): announces itself with “USB log watcher online.”
3. Bash with run_in_background: true: starts journalctl -k -f writing to a temp file
4. Bash with sleep 30 && tail -10 <output>: polls the journalctl output
5. LLM reasoning: reads the tail output, decides if anything is anomalous
6. speak() on anomaly: announces what it found, with context
7. Loop back to step 4: runs indefinitely until killed

The polling pattern (sleep 30 && tail) is crude but effective. The Claude Code Bash tool has no streaming mode — foreground commands block until exit, background commands write to a file with no callback. So the agent must poll the output file periodically, even though the underlying journalctl -f is event-driven.
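
The same polling idea can be sketched outside the agent. The version below tracks a byte offset instead of using tail -10, so no line is missed or seen twice between polls, and it uses a regex as a deterministic stand-in for the LLM's judgment. The function names and the pattern are illustrative assumptions, not the agent's actual behavior.

```python
import re
from pathlib import Path

# Illustrative pattern only; the real watcher lets the LLM decide.
ANOMALY = re.compile(r"HC died|not responding|disconnect", re.IGNORECASE)


def read_new_lines(path: Path, offset: int) -> tuple[list[str], int]:
    """Return complete lines appended since `offset`, plus the new offset."""
    with path.open("rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Consume only up to the last newline, so a half-written line
    # is picked up whole on the next poll.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def poll_once(path: Path, offset: int) -> tuple[list[str], int]:
    """One polling step: new anomalous lines and the updated offset."""
    lines, offset = read_new_lines(path, offset)
    return [line for line in lines if ANOMALY.search(line)], offset
```

A driver loop would call poll_once() and then time.sleep(30), carrying the returned offset into the next iteration.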

USB/xHCI Monitor — Real Crash Detection

Used during the kiyo-xhci-fix project to monitor for USB controller crashes caused by a Razer Kiyo Pro firmware bug.

prompt: |
  You are a USB log monitor. Tail the kernel logs for USB-related
  events and use the speech MCP tool to announce any concerning
  USB events (errors, resets, disconnects, watchdog triggers,
  xhci failures).

  1. Announce: "USB log watcher online. Monitoring for anomalies."
  2. Run: journalctl -k -f --grep='usb|xhci|USB' --no-pager
  3. Check output every ~30 seconds
  4. Announce anomalies: "Warning: USB controller reset detected"
  5. Stay quiet on normal traffic
  6. Monitor indefinitely

Incident Timeline — March 21, 2026

The agent ran for ~30 minutes during USB stress testing and caught a full xHCI controller death in real time:

[13:17] TRB Warnings (Precursor)
  WARN Event TRB for slot 16 ep 2 with no TDs queued?
Repeated 7 times over 5 seconds. The xHCI controller is struggling with stalled endpoints on the Razer Kiyo Pro.

[13:30:57] Controller Death
  xHCI host not responding to stop endpoint command
  xHCI host controller not responding, assume dead
  HC died; cleaning up
13 minutes after the TRB warnings, the controller gives up entirely.

[13:30:57, same second] Cascade Disconnect: 15 Devices
Every USB device on both bus 1 and bus 2 disconnected simultaneously: keyboard, mouse, Logitech receiver, Razer webcam, Bluetooth adapter, audio interface, USB storage. Total loss of USB input.

The Agent's Actual TTS Announcement

“Critical alert! The xHCI USB host controller has died. The kernel reports: host controller not responding, assume dead, cleaning up. All USB devices on both bus 1 and bus 2 have disconnected simultaneously. This includes keyboard, mouse, Logitech receiver, Razer webcam, USB storage, Bluetooth, and audio devices. Total of 15 device disconnects. You have lost all USB input. The earlier TRB warnings at 13:17 were a precursor to this failure.”

It then continued polling for recovery events every 30 seconds.
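
In this incident the agent's LLM reasoning did the triage. For comparison, the two-stage pattern seen here (TRB warnings as a precursor, then controller death) could also be expressed as explicit rules. The sketch below is an assumption about how one might encode that, using only the kernel messages quoted in the timeline; the severity labels are hypothetical.

```python
import re
from collections import Counter
from typing import Optional

# Ordered from most to least severe; the first match wins.
RULES = [
    ("critical", re.compile(
        r"HC died|host controller not responding|"
        r"not responding to stop endpoint")),
    ("warning", re.compile(r"WARN Event TRB .* with no TDs queued")),
]


def classify(line: str) -> Optional[str]:
    """Return a severity label for a kernel log line, or None if benign."""
    for severity, pattern in RULES:
        if pattern.search(line):
            return severity
    return None


def summarize(lines: list[str]) -> Counter:
    """Count lines per severity, ignoring benign ones."""
    return Counter(s for s in map(classify, lines) if s is not None)
```

A rule table like this is cheaper than an LLM call per poll, but it cannot connect an earlier warning to a later failure the way the agent's announcement did.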

Reading Watcher Output

Three ways to check what the watcher detected, from most to least powerful:

1. SendMessage — talk to the running agent

If you gave the agent a name when spawning it (e.g. name: "usb-watcher"), the main conversation can send it a message and get a live response:

SendMessage:
  to: "usb-watcher"
  message: "what have you seen so far?"

The watcher receives the message, processes it with full context of everything it’s observed, and responds. You’re having a conversation with the watcher while it continues monitoring.

Requirement: The agent must have been spawned with the name parameter. This is why the launch template includes name: "my-watcher".

2. TaskOutput — read the transcript passively

The main agent can call TaskOutput with the background task’s ID to read its full transcript. Just ask Claude naturally; you don’t need to know the task ID or file paths, because Claude tracks these internally.

3. Terminal — manual JSONL inspection

The watcher’s transcript is a JSONL file (one JSON object per line):

# Find the output file (path includes your UID and session ID)
ls /tmp/claude-$(id -u)/*/tasks/*.output

# Extract just the text the agent spoke aloud
grep -o '"name":"mcp__speech-to-cli__speak"[^}]*"text":"[^"]*"' \
  /tmp/claude-$(id -u)/*/tasks/<agent-id>.output

# Or get the agent's own commentary (not tool calls)
grep '"role":"assistant"' /tmp/claude-$(id -u)/*/tasks/<agent-id>.output \
  | grep -o '"text":"[^"]*"' | tail -20
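
The grep patterns above break on escaped quotes inside the spoken text. A parser-based alternative is sketched below. The exact transcript schema is not documented in the source, so the code makes only one assumption: that somewhere in each record there is an object with "name" set to the speak tool and an "input" object containing "text", as in the tool_use example earlier. It searches each record recursively rather than relying on a fixed layout.

```python
import json
from typing import Any, Iterable, Iterator

SPEAK_TOOL = "mcp__speech-to-cli__speak"


def find_spoken(node: Any) -> Iterator[str]:
    """Yield the text of every speak tool call nested anywhere in `node`."""
    if isinstance(node, dict):
        tool_input = node.get("input")
        if node.get("name") == SPEAK_TOOL and isinstance(tool_input, dict):
            text = tool_input.get("text")
            if isinstance(text, str):
                yield text
        for value in node.values():
            yield from find_spoken(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_spoken(item)


def spoken_texts(jsonl_lines: Iterable[str]) -> list[str]:
    """Collect spoken texts from JSONL lines, skipping blank or invalid ones."""
    out: list[str] = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate a partially written last line
        out.extend(find_spoken(record))
    return out
```

To use it, open the .output file found with the ls command above and pass the file object to spoken_texts().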

Autonomous Response Loop

The powerful pattern: the main agent reads the watcher output and acts on it without human intervention.

Detect → Alert → Investigate → Fix
Background: Watcher Agent detects anomaly in kernel logs
  ⟶ TTS alert
  ⟶ output file
Human: Hears the spoken alert through speakers
  ↓ “check the watcher”
Foreground: Main Agent reads output → investigates → fixes issue → deploys

Caveat: Background agents don’t push notifications to the main conversation — the main agent must pull via SendMessage or TaskOutput. The user hearing the TTS alert is often what triggers them to say “check the watcher.”

Other Log Sources

Use Case            Log Command                        Watch For
USB / xHCI          journalctl -k -f                   HC died, not responding, disconnect
systemd service     journalctl -u myservice -f         error, failed, timeout
Nginx               tail -f /var/log/nginx/error.log   502, upstream timed out
Kubernetes          kubectl logs -f deploy/myapp       OOMKilled, CrashLoopBackOff
Docker              docker logs -f container           ERROR, FATAL, stack traces
Build system        tail -f build.log                  FAILED, error:, non-zero exit
Network / firewall  logread -f (OpenWrt)               DROP, REJECT, zone violations
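
If you want to generate watcher prompts or prefilters from this table, it can be kept as data. The sketch below encodes three of the rows; the LogSource class and the SOURCES keys are hypothetical names, and matching the watch terms as case-insensitive literals is a simplifying assumption.

```python
import re
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class LogSource:
    command: str                 # shell command that follows the log
    watch_for: tuple[str, ...]   # literal terms worth announcing

    def argv(self) -> list[str]:
        """Split the command into an argument list for subprocess."""
        return shlex.split(self.command)

    def pattern(self) -> re.Pattern:
        """Case-insensitive regex matching any watch term literally."""
        terms = "|".join(re.escape(term) for term in self.watch_for)
        return re.compile(terms, re.IGNORECASE)


SOURCES = {
    "usb": LogSource("journalctl -k -f",
                     ("HC died", "not responding", "disconnect")),
    "nginx": LogSource("tail -f /var/log/nginx/error.log",
                       ("502", "upstream timed out")),
    "kubernetes": LogSource("kubectl logs -f deploy/myapp",
                            ("OOMKilled", "CrashLoopBackOff")),
}
```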

Tips