conversation

Name: conversation
Rating: 75
Author: alamparelli

by alamparelli

Voice interaction for Claude Code - Talk to Claude and hear responses using macOS speech synthesis and Parakeet MLX

⭐ 5🍴 2📅 Jan 24, 2026

apple claude-code conversation mcp silicon speech-to-text stt text-to-speech

View on GitHub Run in Manus

SKILL.md

name: conversation description: "Bidirectional voice conversation with Push-to-Talk. Use when user says: 'conversation mode', 'let's talk', 'parlons', 'voice conversation', 'dialogue vocal', 'PTT mode', or wants to speak WITH Claude (not just listen). For one-way TTS (Claude speaks, user types), use /speak instead." user_invocable: true

Conversation Mode - Voice Loop with Push-to-Talk

You now have access to both text-to-speech (claude-say) AND speech-to-text (claude-listen) for a complete voice conversation.

Architecture

Uses simple Push-to-Talk (PTT) mode:

Press hotkey to start recording
Press again to stop and transcribe
No automatic voice detection
Full control over when to record

Available MCP Tools

claude-listen (STT - Push-to-Talk)

Synchronous mode (blocking) - RECOMMENDED:

Tool	Description
`start_ptt_mode(key?)`	Start PTT mode (default: Left Cmd + S)
`stop_ptt_mode()`	Stop PTT mode
`get_ptt_status()`	Get PTT state
`get_segment_transcription(wait?, timeout?)`	Wait for transcription (default timeout: 120s). Returns status: [Ready], [Recording...], [Transcribing...]

Background mode (non-blocking) - Alternative:

Tool	Description
`start_ptt_background(key?)`	Start PTT in background process
`check_transcription()`	Check for new transcription (non-blocking)
`stop_ptt_background()`	Stop background PTT

claude-say (TTS)

Tool	Description
`speak(text, voice?, speed?)`	Queue text, returns immediately (preferred for natural flow)
`speak_and_wait(text, voice?, speed?)`	Speak and wait for completion (use when expecting response)
`stop_speaking()`	Stop immediately

When to use which TTS tool

IMPORTANT - Natural Speech Pattern:

speak(): Use for normal responses. One single speak() call with your complete answer is the default.
speak_and_wait(): ONLY use when you have a VERY LONG response broken into multiple parts. Put speak_and_wait() at the END to ensure all speech completes before listening.
Default speed: Always use speed=1 (1.0) for natural pacing.

Best practice - use speak() for normal responses:

# For typical responses, use ONE speak() call:
speak("I understand completely. The function you're looking for handles authentication and it's located in the auth module. It validates tokens and manages user sessions.", speed=1)

Only use speak_and_wait() for very long multi-part explanations:

# For very long responses that must be split:
speak("First part of a very detailed explanation that covers the initial concept.", speed=1)
speak("Second part that continues with more details.", speed=1)
speak_and_wait("Final part that concludes the explanation.", speed=1)  # Only the last one waits

Why this matters: speak() returns immediately without blocking. speak_and_wait() blocks until speech completes, which is only needed when breaking long responses into parts to ensure proper sequencing.

How It Works

┌─────────────────────────────────────────────────┐
│              Push-to-Talk Mode                  │
│                                                 │
│  [Left Cmd + S] → Start recording               │
│       │                                         │
│       │     (records continuously)              │
│       │                                         │
│  [Left Cmd + S] → Stop → Save → Transcribe      │
│       │                                         │
│       ↓                                         │
│  Claude responds vocally                        │
└─────────────────────────────────────────────────┘

User presses Left Cmd + S to start recording
Audio is captured continuously
User presses Left Cmd + S again to stop
Audio is saved and transcribed with the configured STT engine
Claude processes and responds vocally

Starting Conversation Mode

# 1. Start PTT mode
start_ptt_mode()  # Uses default key: cmd_l+s

# 2. Confirm vocally (short message only)
speak_and_wait("Prêt.")

# 3. Wait for transcription
transcription = get_segment_transcription(wait=True, timeout=120)

# 4. Process and respond (use speak() for natural flow, speak_and_wait() at the end)
speak("Here's what I found.")
speak("The first point is this.")
speak_and_wait("What would you like to know next?")  # Blocks before listening

# 5. Loop back to step 3

Conversation Loop

# Main loop
while True:
    # Wait for transcription
    text = get_segment_transcription(wait=True, timeout=120)

    # Check for end command
    if "fin de session" in text.lower():
        break

    # Check for timeout
    if "Timeout" in text:
        speak_and_wait("Tu es toujours là?")
        continue

    # Process and respond - use speak() for flow, speak_and_wait() at end
    speak("I understand your question.")
    speak("Let me explain.")
    speak_and_wait("Does that make sense?")  # Last message blocks

# End session
stop_ptt_mode()
speak_and_wait("Désactivé.")

Ending Conversation Mode

When user says "fin de session" (or similar):

stop_ptt_mode()
speak_and_wait("Désactivé.")

Background Mode (Non-Blocking) - Alternative

Background mode uses polling instead of blocking. Use this if you need Claude to do other tasks while waiting for speech.

Starting Background Mode

# 1. Start background PTT
start_ptt_background()  # Returns immediately

# 2. Confirm vocally
speak_and_wait("Prêt.")

# 3. Poll for transcriptions (non-blocking)
result = check_transcription()
# Returns: transcription text, or status like "[Ready...]", "[Recording...]"

Background Conversation Loop

import time

while True:
    # Non-blocking check
    result = check_transcription()

    # Check if it's actual transcription (not status message)
    if not result.startswith("["):
        # Got real transcription!
        if "fin de session" in result.lower():
            break

        # Respond
        speak_and_wait(f"Tu as dit: {result}")

    # Small delay before next check
    time.sleep(0.5)

# End session
stop_ptt_background()
speak_and_wait("Désactivé.")

When to use Background Mode

When you need Claude to perform other tasks while waiting
When synchronous mode times out frequently
Note: Creates more visible tool calls in the interface

Important Rules

Use speak() for natural flow - Queue multiple sentences without blocking
Use speak_and_wait() at the end - Only when you need to wait for user response
No code vocally - Never read code, paths, or logs aloud
Match language - Respond in the same language as the user
Detailed responses by default - Give thorough, complete explanations naturally. Technical topics, concepts, and questions deserve full answers. Don't artificially shorten responses.
Execute directly - Don't announce actions, just do them and report results
Minimal activation messages - Use ONE word only for activation ("Ready", "Prêt", etc.) and deactivation ("Disabled", "Désactivé", etc.) in the user's language
Show visual content proactively - When explaining concepts, processes, or technical topics, don't hesitate to display diagrams, tables, code snippets, or structured lists on screen. Voice mode doesn't mean text-only - use the screen as a visual aid. If something would be clearer with a diagram or example, show it while explaining verbally.

Error Handling

If timeout (no speech): speak_and_wait("Tu es toujours là?")
If transcription unclear: speak_and_wait("Je n'ai pas compris, peux-tu répéter?")

Available Keys for PTT

Key	Name
`cmd_l+s`	Left Command + S (default)
`cmd_r+m`	Right Command + M
`cmd_r`	Right Command
`cmd_l`	Left Command
`alt_r`	Right Option
`alt_l`	Left Option
`ctrl_r`	Right Control
`f13`, `f14`, `f15`	Function keys

Score

Total Score

75/100

Based on repository quality metrics

✓SKILL.md

SKILL.mdファイルが含まれている

+20

✓LICENSE

ライセンスが設定されている

+10

✓説明文

100文字以上の説明がある

+10

○人気

GitHub Stars 100以上

0/15

✓最近の活動

1ヶ月以内に更新

+10

○フォーク

10回以上フォークされている

0/5

✓Issue管理

オープンIssueが50未満

✓言語

プログラミング言語が設定されている

✓タグ

1つ以上のタグが設定されている

Reviews

💬

Reviews coming soon

conversation

SKILL.md

Conversation Mode - Voice Loop with Push-to-Talk

Architecture

Available MCP Tools

claude-listen (STT - Push-to-Talk)

claude-say (TTS)

When to use which TTS tool

How It Works

Starting Conversation Mode

Conversation Loop

Ending Conversation Mode

Background Mode (Non-Blocking) - Alternative

Starting Background Mode

Background Conversation Loop

When to use Background Mode

Important Rules

Error Handling

Available Keys for PTT

Score

Reviews

create-pr

orpc-contract-first

component-refactoring

web-design-guidelines

frontend-code-review

frontend-testing

conversation

SKILL.md

Conversation Mode - Voice Loop with Push-to-Talk

Architecture

Available MCP Tools

claude-listen (STT - Push-to-Talk)

claude-say (TTS)

When to use which TTS tool

How It Works

Starting Conversation Mode

Conversation Loop

Ending Conversation Mode

Background Mode (Non-Blocking) - Alternative

Starting Background Mode

Background Conversation Loop

When to use Background Mode

Important Rules

Error Handling

Available Keys for PTT

Score

Reviews

Related

Related Skills

create-pr

orpc-contract-first

component-refactoring

web-design-guidelines

frontend-code-review

frontend-testing