voice-agents

Name: voice-agents
Rating: 95
Author: sickn33


The Ultimate Collection of 200+ Agentic Skills for Claude Code/Antigravity/Cursor. Battle-tested, high-performance skills for AI agents including official skills from Anthropic and Vercel.

⭐ 1,237 🍴 348 📅 Jan 23, 2026

agentic-skills ai-agents antigravity autonomous-coding claude-code mcp react-patterns security-auditing


SKILL.md

name: voice-agents
description: "Voice agents represent the frontier of AI interaction - humans speaking naturally with AI systems. The challenge isn't just speech recognition and synthesis, it's achieving natural conversation flow with sub-800ms latency while handling interruptions, background noise, and emotional nuance. This skill covers two architectures: speech-to-speech (OpenAI Realtime API, lowest latency, most natural) and pipeline (STT→LLM→TTS, more control, easier to debug). Key insight: latency is the constraint. Hu"
source: vibeship-spawner-skills (Apache 2.0)

Voice Agents

You are a voice AI architect who has shipped production voice agents handling millions of calls. You understand the physics of latency - every component adds milliseconds, and the sum determines whether conversations feel natural or awkward.

Your core insight: Two architectures exist. Speech-to-speech (S2S) models like OpenAI Realtime API preserve emotion and achieve lowest latency but are less controllable. Pipeline architectures (STT→LLM→TTS) give you control at each step but add latency. Mos

Capabilities

voice-agents
speech-to-speech
speech-to-text
text-to-speech
conversational-ai
voice-activity-detection
turn-taking
barge-in-detection
voice-interfaces

Patterns

Speech-to-Speech Architecture

Direct audio-to-audio processing for lowest latency

Pipeline Architecture

Separate STT → LLM → TTS for maximum control
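As a minimal sketch of the pipeline shape, the three stages can be chained as plain callables and timed individually, which is where the extra control and debuggability come from. The names `run_pipeline`, `PipelineResult`, and the `fake_*` stages are hypothetical stand-ins introduced here for illustration; a real agent would plug in actual STT, LLM, and TTS clients and would likely stream between stages rather than run them strictly in sequence.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class PipelineResult:
    transcript: str
    reply_text: str
    audio: bytes
    stage_ms: Dict[str, float] = field(default_factory=dict)


def run_pipeline(
    audio_in: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> PipelineResult:
    """Run STT -> LLM -> TTS in sequence and record each stage's wall time."""
    stage_ms: Dict[str, float] = {}

    def timed(name, fn, arg):
        start = time.perf_counter()
        out = fn(arg)
        stage_ms[name] = (time.perf_counter() - start) * 1000.0
        return out

    transcript = timed("stt", stt, audio_in)
    reply_text = timed("llm", llm, transcript)
    audio_out = timed("tts", tts, reply_text)
    return PipelineResult(transcript, reply_text, audio_out, stage_ms)


# Hypothetical stand-ins for real services (assumptions, not real APIs).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    result = run_pipeline(b"hello", fake_stt, fake_llm, fake_tts)
    print(result.reply_text, result.stage_ms)
```

Because each stage is a separate callable, any one of them can be swapped, logged, or mocked in tests without touching the others.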

Voice Activity Detection Pattern

Detect when user starts/stops speaking
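One simple way to detect speech start and stop is an energy threshold with a short hangover, sketched below. This is an illustrative assumption, not the method the skill prescribes: the threshold, frame counts, and function names are hypothetical, and an energy-only detector is exactly the kind of silence-based signal that should be combined with other cues for turn-taking.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame (samples in [-1.0, 1.0])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech_events(
    frames: Iterable[Sequence[float]],
    threshold: float = 0.1,      # assumed energy threshold
    start_frames: int = 2,       # loud frames required to declare a start
    hangover_frames: int = 3,    # quiet frames required to declare a stop
) -> List[Tuple[str, int]]:
    """Return ("start", index) / ("stop", index) events over a frame stream."""
    events: List[Tuple[str, int]] = []
    speaking = False
    loud_run = 0
    quiet_run = 0
    for i, frame in enumerate(frames):
        if rms(frame) >= threshold:
            loud_run += 1
            quiet_run = 0
        else:
            quiet_run += 1
            loud_run = 0
        if not speaking and loud_run >= start_frames:
            speaking = True
            # Report the first frame of the loud run as the start.
            events.append(("start", i - start_frames + 1))
        elif speaking and quiet_run >= hangover_frames:
            speaking = False
            # Report the first frame of the quiet run as the stop.
            events.append(("stop", i - hangover_frames + 1))
    return events
```

The hangover keeps a single quiet frame in the middle of an utterance from ending the turn, at the cost of reporting the stop a few frames late.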

Anti-Patterns

❌ Ignoring Latency Budget

❌ Silence-Only Turn Detection

❌ Long Responses
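To avoid the first anti-pattern, measured per-component latencies can be compared against an explicit budget. The sketch below assumes the sub-800ms target from the skill description; the component names, the per-component split, and the function name `check_latency_budget` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

TARGET_TOTAL_MS = 800.0  # from the "sub-800ms latency" goal in the description


def check_latency_budget(
    measured_ms: Dict[str, float],
    budget_ms: Dict[str, float],
    target_total_ms: float = TARGET_TOTAL_MS,
) -> Tuple[float, bool, List[str]]:
    """Return (total latency, whether total is under target, components over budget).

    A component with no entry in budget_ms is treated as having a budget of 0,
    so unbudgeted components are always flagged.
    """
    over = sorted(
        name for name, ms in measured_ms.items() if ms > budget_ms.get(name, 0.0)
    )
    total = sum(measured_ms.values())
    return total, total < target_total_ms, over


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from any real system.
    budget = {"stt": 200.0, "llm": 350.0, "tts": 200.0, "network": 50.0}
    measured = {"stt": 180.0, "llm": 420.0, "tts": 150.0, "network": 40.0}
    print(check_latency_budget(measured, budget))
```

A total can be under target while one component is over its own budget, which is why the function reports both.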

⚠️ Sharp Edges

Issue	Severity	Solution
Issue	critical	# Measure and budget latency for each component:
Issue	high	# Target jitter metrics:
Issue	high	# Use semantic VAD:
Issue	high	# Implement barge-in detection:
Issue	medium	# Constrain response length in prompts:
Issue	medium	# Prompt for spoken format:
Issue	medium	# Implement noise handling:
Issue	medium	# Mitigate STT errors:
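For the barge-in row above, one possible shape is a small detector that fires only when the user speaks for several consecutive frames while the agent is talking. This is a sketch under stated assumptions: the class name `BargeInDetector`, the frame-count rule, and the default of three frames are hypothetical, and the per-frame speech flag is assumed to come from some separate voice activity detector.

```python
class BargeInDetector:
    """Signal an interruption after sustained user speech during agent playback."""

    def __init__(self, min_speech_frames: int = 3) -> None:
        # Assumed debounce: consecutive speech frames required before interrupting.
        self.min_speech_frames = min_speech_frames
        self._run = 0

    def update(self, agent_speaking: bool, user_speech_detected: bool) -> bool:
        """Feed one frame; return True when the agent should stop speaking."""
        if not agent_speaking or not user_speech_detected:
            self._run = 0
            return False
        self._run += 1
        if self._run >= self.min_speech_frames:
            self._run = 0  # reset so one interruption fires once
            return True
        return False


if __name__ == "__main__":
    detector = BargeInDetector(min_speech_frames=3)
    for flag in [True, True, False, True, True, True]:
        if detector.update(agent_speaking=True, user_speech_detected=flag):
            print("barge-in: cancel playback")
```

Requiring a short run of speech frames trades a little reaction time for fewer false interruptions from coughs or background noise.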

Works well with: agent-tool-builder, multi-agent-orchestration, llm-architect, backend

Score

Total Score

95/100

Based on repository quality metrics

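✓ SKILL.md

Includes a SKILL.md file

+20

✓ LICENSE

A license is set

+10

✓ Description

Description is at least 100 characters long

+10

✓ Popularity

1,000 or more GitHub stars

+15

✓ Recent activity

Updated within the last month

+10

✓ Forks

Forked 10 or more times

✓ Issue management

Fewer than 50 open issues

✓ Language

A programming language is set

✓ Tags

At least one tag is set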


Related Skills

create-pr

orpc-contract-first

component-refactoring

web-design-guidelines

frontend-code-review

frontend-testing