---
name: hyde-retrieval
description: HyDE (Hypothetical Document Embeddings) for improved semantic retrieval. Use when queries don't match document vocabulary, retrieval quality is poor, or implementing advanced RAG patterns.
tags: [rag, retrieval, hyde, semantic-search]
context: fork
agent: data-pipeline-engineer
version: 1.0.0
author: OrchestKit
user-invocable: false
---

# HyDE (Hypothetical Document Embeddings)

Generate hypothetical answer documents to bridge vocabulary gaps in semantic search.

## The Problem

Direct query embedding often fails due to vocabulary mismatch:

Query: "scaling async data pipelines"
Docs use: "event-driven messaging", "Apache Kafka", "message brokers"
→ Low similarity scores despite high relevance

## The Solution

Instead of embedding the query, generate a hypothetical answer document:

Query: "scaling async data pipelines"
→ LLM generates: "To scale asynchronous data pipelines, use event-driven
   messaging with Apache Kafka. Message brokers provide backpressure..."
→ Embed the hypothetical document
→ Now matches docs using similar terminology

## Implementation

```python
from collections.abc import Awaitable, Callable

from openai import AsyncOpenAI
from pydantic import BaseModel


class HyDEResult(BaseModel):
    """Result of HyDE generation."""
    original_query: str
    hypothetical_doc: str
    embedding: list[float]


async def generate_hyde(
    query: str,
    llm: AsyncOpenAI,
    embed_fn: Callable[[str], Awaitable[list[float]]],
    max_tokens: int = 150,
) -> HyDEResult:
    """Generate a hypothetical document and embed it."""

    # Generate a hypothetical answer to the query
    response = await llm.chat.completions.create(
        model="gpt-4o-mini",  # Fast, cheap model
        messages=[
            {"role": "system", "content":
                "Write a short paragraph that would answer this query. "
                "Use technical terminology that documentation would use."},
            {"role": "user", "content": query}
        ],
        max_tokens=max_tokens,
        temperature=0.3,  # Low temp for consistency
    )

    # Fall back to the raw query if the model returns no content
    hypothetical_doc = response.choices[0].message.content or query

    # Embed the hypothetical document (not the query!)
    embedding = await embed_fn(hypothetical_doc)

    return HyDEResult(
        original_query=query,
        hypothetical_doc=hypothetical_doc,
        embedding=embedding,
    )
```
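
To wire this up, `embed_fn` can be any async function that returns an embedding vector. A minimal sketch using the OpenAI embeddings API, to be run inside an async context (the model name `text-embedding-3-small` is an assumption; substitute your own):

```python
client = AsyncOpenAI()  # assumes OPENAI_API_KEY is set in the environment

async def embed_fn(text: str) -> list[float]:
    # Any embedding provider works; this sketch uses OpenAI's API
    response = await client.embeddings.create(
        model="text-embedding-3-small",  # assumption: swap in your model
        input=text,
    )
    return response.data[0].embedding

result = await generate_hyde("scaling async data pipelines", client, embed_fn)
print(result.hypothetical_doc)
```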

## With Caching

```python
import hashlib


class HyDEService:
    """Caches HyDE results so repeated queries skip the LLM call."""

    def __init__(self, llm, embed_fn):
        self.llm = llm
        self.embed_fn = embed_fn
        self._cache: dict[str, HyDEResult] = {}

    def _cache_key(self, query: str) -> str:
        # Normalize before hashing so trivially different queries share a key
        return hashlib.md5(query.lower().strip().encode()).hexdigest()

    async def generate(self, query: str) -> HyDEResult:
        key = self._cache_key(query)

        if key in self._cache:
            return self._cache[key]

        result = await generate_hyde(query, self.llm, self.embed_fn)
        self._cache[key] = result
        return result
```
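
Usage is a drop-in replacement for calling `generate_hyde` directly; MD5 serves here only as a non-cryptographic cache key. A quick sketch, again inside an async context:

```python
service = HyDEService(llm=client, embed_fn=embed_fn)

first = await service.generate("scaling async data pipelines")    # LLM call
second = await service.generate("Scaling async data pipelines ")  # cache hit
assert first is second  # normalization maps both queries to one key
```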

## Per-Concept HyDE (Advanced)

For multi-concept queries, generate HyDE for each concept:

```python
import asyncio


async def batch_hyde(
    concepts: list[str],
    hyde_service: HyDEService,
) -> list[HyDEResult]:
    """Generate HyDE embeddings for multiple concepts in parallel."""
    tasks = [hyde_service.generate(concept) for concept in concepts]
    return await asyncio.gather(*tasks)
```
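
For example, a compound question can be split into concepts first (the split below is hand-written for illustration; a query-decomposition step would produce it automatically):

```python
concepts = [
    "scaling async data pipelines",
    "exactly-once delivery guarantees",
]
results = await batch_hyde(concepts, service)
for r in results:
    print(r.original_query, "->", len(r.embedding), "dims")
```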

## When to Use HyDE

| Scenario | Use HyDE? |
| --- | --- |
| Abstract/conceptual queries | Yes |
| Exact term searches | No (use keyword search) |
| Code snippet searches | No |
| Natural language questions | Yes |
| Vocabulary mismatch suspected | Yes |
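
A routing heuristic can make this table executable. The rules below are illustrative assumptions, not part of the original skill; tune them against your own query logs:

```python
import re

def should_use_hyde(query: str) -> bool:
    # Illustrative heuristics mirroring the table above
    if re.search(r"[{}()\[\];=]|->|::", query):
        return False  # looks like a code snippet: prefer keyword/lexical search
    if query.startswith('"') and query.endswith('"'):
        return False  # quoted exact-term search
    if len(query.split()) >= 4:
        return True   # natural-language / conceptual question
    return False      # short keyword queries rarely benefit
```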

## Fallback Strategy

```python
import asyncio


async def hyde_with_fallback(
    query: str,
    hyde_service: HyDEService,
    embed_fn: Callable[[str], Awaitable[list[float]]],
    timeout: float = 3.0,
) -> list[float]:
    """HyDE with fallback to direct embedding on timeout."""
    try:
        # asyncio.timeout requires Python 3.11+
        async with asyncio.timeout(timeout):
            result = await hyde_service.generate(query)
            return result.embedding
    except TimeoutError:
        # Fallback to direct query embedding
        return await embed_fn(query)
```
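
The resulting vector is then used exactly like a normal query embedding. A minimal sketch of the final search step, reusing `service` and `embed_fn` from above and assuming an in-memory list of pre-embedded documents (any vector store would replace this):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

async def search(query: str, docs: list[tuple[str, list[float]]], k: int = 5):
    # docs: (text, embedding) pairs produced ahead of time with the same embed_fn
    query_vec = await hyde_with_fallback(query, service, embed_fn)
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return ranked[:k]
```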

## Performance Tips

- Use a fast model (gpt-4o-mini, claude-3-haiku) for generation
- Cache aggressively (queries often repeat)
- Set tight timeouts (2-3s) with fallback to direct embedding
- Keep hypothetical docs concise (100-200 tokens)
- Combine with query decomposition for best results

## Related Skills

- rag-retrieval - Core RAG patterns that HyDE enhances for better retrieval
- embeddings - Embedding models used to embed hypothetical documents
- query-decomposition - Complementary technique for multi-concept queries
- semantic-caching - Cache HyDE results to avoid repeated LLM calls

## Key Decisions

| Decision | Choice | Rationale |
| --- | --- | --- |
| Generation model | gpt-4o-mini / claude-3-haiku | Fast and cheap for hypothetical doc generation |
| Temperature | 0.3 | Low temperature for consistent, factual hypothetical docs |
| Max tokens | 100-200 | Concise docs match embedding sweet spot |
| Timeout with fallback | 2-3 seconds | Graceful degradation to direct query embedding |
