processor

by grc-iit

An automated workflow that conducts AI-powered research, extracts and acquires academic papers, converts them into structured markdown, and creates a searchable vector database for RAG applications.

0 stars · 0 forks · Jan 24, 2026

Document Processing

This skill helps you process documents, codebases, and papers into a searchable RAG (Retrieval-Augmented Generation) database using LanceDB.

Quick Start

# 1. Check that services are running
uv run processor check

# 2. Process files into database
uv run processor process ./input -o ./lancedb

# 3. Verify results
uv run processor stats ./lancedb
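
The three steps above can be chained into one function that stops at the first failure. This is a minimal sketch: the name `run_pipeline` and its default paths are illustrative, and it assumes each `processor` subcommand exits with a non-zero status on failure, which the skill description does not state.

```shell
# Sketch: run the Quick Start steps in order, stopping at the first failure.
# Assumption: each subcommand returns a non-zero exit status when it fails.
# The function body runs in a subshell, so its variables and "exit" stay local.
run_pipeline() (
    input_dir=${1:-./input}
    db_dir=${2:-./lancedb}

    # 1. Check that services are running
    uv run processor check || { echo "service check failed" >&2; exit 1; }

    # 2. Process files into the database
    uv run processor process "$input_dir" -o "$db_dir" \
        || { echo "processing failed" >&2; exit 2; }

    # 3. Verify results
    uv run processor stats "$db_dir"
)
```

For example, `run_pipeline ./papers ./papers_db` would check services, process `./papers`, and then print statistics for `./papers_db`.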

Common Use Cases

Process a codebase

uv run processor process ./my-project -o ./code_db --content-type code

Process papers/documents

uv run processor process ./papers -o ./papers_db

Incremental updates (skip unchanged files)

uv run processor process ./input -o ./lancedb --incremental

High-quality embeddings (slower, better retrieval)

uv run processor process ./input -o ./lancedb --text-profile high --code-profile high

Embedding Profiles

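| Type | Profile | Model                | Dimensions | Use Case           |
|------|---------|----------------------|------------|--------------------|
| text | low     | Qwen3-Embedding-0.6B | 1024       | Fast, good quality |
| text | medium  | Qwen3-Embedding-4B   | 2560       | Balanced           |
| text | high    | Qwen3-Embedding-8B   | 4096       | Maximum quality    |
| code | low     | jina-code-0.5b       | 896        | Fast code search   |
| code | high    | jina-code-1.5b       | 1536       | Best code search   |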
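
The dimension column matters for storage planning, since larger vectors take proportionally more space. The sketch below looks up the dimensions listed for each profile and gives a rough lower bound on raw vector size. The function names are hypothetical, and the 4-bytes-per-value figure assumes uncompressed float32 vectors; actual LanceDB size on disk will differ.

```shell
# Sketch: look up vector dimensions for a type/profile pair from the profile table.
embedding_dims() {
    case "$1:$2" in
        text:low)    echo 1024 ;;
        text:medium) echo 2560 ;;
        text:high)   echo 4096 ;;
        code:low)    echo 896 ;;
        code:high)   echo 1536 ;;
        *) echo "unknown type/profile: $1/$2" >&2; return 1 ;;
    esac
}

# Assumption: vectors are stored as 4-byte floats with no compression or index overhead.
# Usage: estimate_vector_bytes <type> <profile> <number-of-chunks>
estimate_vector_bytes() (
    dims=$(embedding_dims "$1" "$2") || exit 1
    chunks=$3
    echo $(( $dims * $chunks * 4 ))
)
```

Under that assumption, `estimate_vector_bytes text high 100000` prints 1638400000 (about 1.6 GB), while the same chunk count at the `low` text profile comes to 409600000 bytes.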

Key Options

| Option               | Values                      | Description             |
|----------------------|-----------------------------|-------------------------|
| --embedder           | ollama, transformers        | Embedding backend       |
| --text-profile       | low, medium, high           | Text embedding quality  |
| --code-profile       | low, high                   | Code embedding quality  |
| --table-mode         | separate, unified, both     | Table organization      |
| --incremental/--full | -                           | Skip unchanged files    |
| --content-type       | auto, code, paper, markdown | Force content detection |
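
Several of these options can be passed in one invocation. The sketch below checks option values against the table before calling the CLI, so that a typo such as `--code-profile medium` is caught early. The helper names `is_valid_option` and `process_with` are hypothetical and not part of the processor CLI; the sketch also assumes every option other than `--incremental` and `--full` takes exactly one value.

```shell
# Sketch: accept only the option/value pairs listed in the Key Options table.
is_valid_option() {
    case "$1=$2" in
        --embedder=ollama|--embedder=transformers) return 0 ;;
        --text-profile=low|--text-profile=medium|--text-profile=high) return 0 ;;
        --code-profile=low|--code-profile=high) return 0 ;;
        --table-mode=separate|--table-mode=unified|--table-mode=both) return 0 ;;
        --content-type=auto|--content-type=code|--content-type=paper|--content-type=markdown) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: process_with <input> <db> [--option value | --incremental | --full]...
process_with() (
    [ "$#" -ge 2 ] || { echo "usage: process_with <input> <db> [options]" >&2; exit 2; }
    input_dir=$1
    db_dir=$2
    shift 2

    opt=
    for arg in "$@"; do
        if [ -z "$opt" ]; then
            # Flags without a value are passed through unchanged
            case "$arg" in
                --incremental|--full) continue ;;
            esac
            opt=$arg
        else
            is_valid_option "$opt" "$arg" \
                || { echo "invalid option: $opt $arg" >&2; exit 2; }
            opt=
        fi
    done
    [ -z "$opt" ] || { echo "missing value for $opt" >&2; exit 2; }

    uv run processor process "$input_dir" -o "$db_dir" "$@"
)
```

For example, `process_with ./input ./lancedb --embedder ollama --table-mode unified --incremental` validates both values and then runs the corresponding `uv run processor process` command.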

MCP Server

Start the processor MCP server for programmatic access:

uv run processor-mcp

Configure in Claude Desktop (claude_desktop_config.json):

{
  "mcpServers": {
    "processor": {
      "command": "uv",
      "args": ["run", "processor-mcp"],
      "cwd": "/path/to/processor"
    }
  }
}
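
The `cwd` value has to point at the local processor checkout. As a convenience, the sketch below prints the same entry with a path filled in. The function name `print_mcp_config` is hypothetical, and the sketch assumes the path contains no characters that need JSON escaping (double quotes, backslashes, control characters).

```shell
# Sketch: print the mcpServers entry with "cwd" set to the given checkout path.
# Assumption: the path needs no JSON escaping.
print_mcp_config() {
    cat <<EOF
{
  "mcpServers": {
    "processor": {
      "command": "uv",
      "args": ["run", "processor-mcp"],
      "cwd": "$1"
    }
  }
}
EOF
}
```

Running `print_mcp_config "$(pwd)"` from inside the checkout prints an entry ready to copy. If claude_desktop_config.json already lists other servers, merge the "processor" entry into the existing "mcpServers" object rather than overwriting the file.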

Available MCP Tools

  • process_documents - Process files into LanceDB
  • check_services - Check backend availability
  • setup_models - Download embedding models
  • get_db_stats - Database statistics
  • export_db - Export database

Troubleshooting

"Model not found" error

uv run processor setup  # Download required models

Ollama not running

ollama serve  # Start Ollama server

Check available models

uv run processor check
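
When the Ollama backend is in use, it can help to confirm that the server answers before running the processor's own check. The sketch below does this with `curl`. It assumes Ollama listens on its default address, http://localhost:11434, unless another base URL is passed in; the function names are hypothetical.

```shell
# Sketch: succeed only if an Ollama server answers on its HTTP API.
# Assumption: default address http://localhost:11434 unless a base URL is given.
ollama_up() {
    # /api/tags lists locally available models; -f makes curl fail on HTTP errors
    curl -sf --max-time 3 "${1:-http://localhost:11434}/api/tags" >/dev/null
}

# Run the processor's service check only once Ollama is reachable.
ensure_services() {
    if ! ollama_up "$@"; then
        echo "Ollama is not reachable; start it with: ollama serve" >&2
        return 1
    fi
    uv run processor check
}
```

This pre-check applies only to the `ollama` embedder; with `--embedder transformers` the processor's own `check` command is the relevant test.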

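Score

Total Score: 65/100 (based on repository quality metrics)

| Criterion        | Condition                             | Points |
|------------------|---------------------------------------|--------|
| SKILL.md         | Includes a SKILL.md file              | +20    |
| LICENSE          | A license is set                      | 0/10   |
| Description      | Description of 100 characters or more | +10    |
| Popularity       | 100 or more GitHub stars              | 0/15   |
| Recent activity  | Updated within the last month         | +10    |
| Forks            | Forked 10 or more times               | 0/5    |
| Issue management | Fewer than 50 open issues             | +5     |
| Language         | A programming language is set         | +5     |
| Tags             | At least one tag is set               | +5     |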
