
# rag-search
by grc-iit

An automated workflow that conducts AI-powered research, extracts and acquires academic papers, converts them into structured markdown, and creates a searchable vector database for RAG applications.

## SKILL.md

```yaml
name: rag-search
description: Search RAG database for relevant content. Use for semantic queries over processed documents, code, or papers.
```
# RAG Search

This skill helps you search processed document databases using semantic similarity and retrieval-time optimizations.

## Quick Search

```bash
# Basic vector search
uv run processor search ./lancedb "how does the caching work"

# Hybrid search (vector + keyword)
uv run processor search ./lancedb "ConfigParser yaml loading" --hybrid

# Search code
uv run processor search ./lancedb "authentication middleware" --table code_chunks
```
## Available Tables

| Table | Content |
|---|---|
| text_chunks | Documents, papers, markdown (default) |
| code_chunks | Source code |
| image_chunks | Figures from papers |
| chunks | Unified table (if created with --table-mode unified) |
## MCP Server

Start the RAG MCP server for programmatic access:

```bash
uv run rag-mcp
```

Generate a config template:

```bash
uv run rag-mcp --config_generate
```

Configure in Claude Desktop (`claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "rag": {
      "command": "uv",
      "args": ["run", "rag-mcp"],
      "cwd": "/path/to/processor"
    }
  }
}
```
## Available MCP Tools

- `search` - Vector/hybrid search with optimizations
- `search_images` - Search image chunks
- `list_tables` - List available tables
- `generate_config` - Create config template
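For programmatic use outside Claude Desktop, the server can also be driven with the MCP Python SDK. The sketch below is illustrative only: the tool argument names (`query`, `hybrid`, and so on) are assumed from the flags documented in this skill, not confirmed against the server's schema.

```python
# Illustrative sketch: call the rag-mcp "search" tool over stdio using the
# MCP Python SDK. Tool argument names are assumed from the options above.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main():
    # Launch the server the same way the CLI does: `uv run rag-mcp`
    server = StdioServerParameters(command="uv", args=["run", "rag-mcp"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "search",
                {"query": "how does the caching work", "hybrid": True},
            )
            for block in result.content:
                print(getattr(block, "text", block))


asyncio.run(main())
```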
## Retrieval Optimizations

Enable these for better results at the cost of added latency. (HyDE, short for Hypothetical Document Embeddings, drafts a hypothetical answer with an LLM and embeds that in place of the raw query, which is why it adds the most latency.)
| Optimization | Flag | Latency | Best For |
|---|---|---|---|
| Hybrid Search | hybrid=True | +10-30ms | Keyword-heavy queries |
| HyDE | use_hyde=True | +200-500ms | Knowledge questions |
| Reranking | rerank=True | +50-200ms | High precision needs |
| Parent Expansion | expand_parents=True | +5-20ms | Broader context |
## Recommended Combinations

- Fast search (default): no optimizations - pure vector similarity
- Better recall: `search(query="...", hybrid=True)`
- Knowledge questions: `search(query="what is...", use_hyde=True, rerank=True)`
- Code search: `search(query="...", table="code_chunks", hybrid=True)`
- Maximum precision: `search(query="...", hybrid=True, use_hyde=True, rerank=True)`
## Configuration

Edit `rag_config.yaml` to set defaults:

```yaml
# Embedding profiles (must match processor config)
text_profile: "low"
code_profile: "low"
ollama_host: "http://localhost:11434"

# Default search behavior
default_limit: 5
default_hybrid: false

# HyDE settings (uses Claude SDK by default, falls back to Ollama)
hyde:
  enabled: false
  backend: "claude_sdk"            # claude_sdk (default) or ollama
  claude_model: "haiku"            # haiku, sonnet, opus
  ollama_model: "llama3.2:latest"  # fallback

# Reranking settings
reranker:
  enabled: false
  model: "BAAI/bge-reranker-v2-m3"
  top_k: 20
  top_n: 5
```
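As a rough illustration of how these defaults interact with per-call options, the sketch below merges `rag_config.yaml` values with keyword overrides. The helper name and merge logic are assumptions for illustration, not the skill's actual implementation.

```python
# Illustrative sketch only: merge rag_config.yaml defaults with per-call
# overrides. Key names mirror the config shown above.
import yaml

with open("rag_config.yaml") as f:
    cfg = yaml.safe_load(f)


def search_options(**overrides):
    """Start from the config-file defaults, then apply per-call overrides."""
    opts = {
        "limit": cfg.get("default_limit", 5),
        "hybrid": cfg.get("default_hybrid", False),
        "use_hyde": cfg.get("hyde", {}).get("enabled", False),
        "rerank": cfg.get("reranker", {}).get("enabled", False),
    }
    opts.update(overrides)
    return opts


print(search_options(hybrid=True))
# e.g. {'limit': 5, 'hybrid': True, 'use_hyde': False, 'rerank': False}
```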
## Understanding Results

Each result includes:

- `content`: Matched chunk text
- `source_file`: Original file path
- `score`: Similarity (0-1, higher is better)
- `metadata`: Additional fields (section, language, etc.)
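A small, illustrative way to consume these fields, assuming the `search()` call from Recommended Combinations returns a list of dictionaries shaped as described above:

```python
# Illustrative only: print results, assuming each one is a dict carrying the
# fields listed above (content, source_file, score, metadata).
results = search(query="how does the caching work", hybrid=True)

for r in sorted(results, key=lambda r: r["score"], reverse=True):
    print(f"{r['score']:.3f}  {r['source_file']}")
    print(r["content"][:200])
    print(r["metadata"])
```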
## Troubleshooting

### No results found

- Check that the database exists: `uv run processor stats ./lancedb`
- Verify the table you are querying has data (e.g. `--table text_chunks`)
- Try broader query terms

### Poor quality results

- Enable hybrid search: `--hybrid`
- Check that embedding profiles match the processor config
- Consider reranking: `rerank=True`

### Slow searches

- Disable HyDE if not needed
- Reduce `rerank_top_k`
- Check Ollama server performance
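When searches feel slow, it can help to time the call with each optimization toggled individually to see which one dominates. A rough sketch, again assuming the `search()` call from Recommended Combinations:

```python
# Rough, illustrative latency check: time search() with one optimization at a
# time to see where the extra milliseconds come from.
import time

QUERY = "how does the caching work"


def timed(label, **kwargs):
    start = time.perf_counter()
    search(query=QUERY, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{label:>10}: {elapsed_ms:.0f} ms")


timed("baseline")
timed("hybrid", hybrid=True)
timed("hyde", use_hyde=True)
timed("rerank", rerank=True)
```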
