pdf-processing-pro

Name: pdf-processing-pro
Rating: 60
Author: Crumbgrabber

In the interest of data sovereignty and avoiding vendor lock-in, this is a template repository with all of our favorite prompts, skills, agents, etc., scrupulously avoiding anything that locks you in to one particular vendor. The one exception is patterns, which come from fabric; these "patterns" are similar to both a skill and a sub-agent.

⭐ 1 · 🍴 1 · 📅 Dec 28, 2025

agents claude-code codex llms patterns prompts prompts-template skills

SKILL.md

name: pdf-processing-pro
description: Production-ready PDF processing with forms, tables, OCR, validation, and batch operations. Use when working with complex PDF workflows in production environments, processing large volumes of PDFs, or requiring robust error handling and validation.

PDF Processing Pro

Production-ready PDF processing guidance with comprehensive error handling and support for complex workflows (forms, tables, OCR, batch operations).

Core patterns (tool-agnostic)

Text extraction: Use robust libraries (e.g., pdfplumber) and validate output per page.
Form workflows: Detect fields, validate data against schemas, then fill; revalidate outputs.
Table extraction: Combine multiple extractors and normalize columns; handle merged cells explicitly.
OCR for scanned PDFs: Preprocess pages (deskew/denoise) before OCR; store page-level confidence and flag low-confidence regions.
Batch operations: Process PDFs page-by-page to control memory; log per-file successes/failures with timestamps and error details.
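
The per-page validation and low-confidence flagging described above could look roughly like the sketch below. It is library-independent on purpose: `PageResult`, `validate_pages`, and both thresholds are hypothetical names and values chosen for illustration, not part of any PDF or OCR library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PageResult:
    """Outcome for one page. Field names are illustrative assumptions."""
    page_number: int
    text: str
    confidence: Optional[float] = None  # OCR confidence in [0, 1]; None for native text
    issues: List[str] = field(default_factory=list)


def validate_pages(pages, min_chars=20, min_confidence=0.80):
    """Flag pages that look empty or carry low OCR confidence.

    `pages` is an iterable of (text, confidence) tuples in page order.
    Both thresholds are assumed defaults, not recommended values.
    """
    results = []
    for number, (text, confidence) in enumerate(pages, start=1):
        result = PageResult(number, text, confidence)
        if len(text.strip()) < min_chars:
            result.issues.append("too_little_text")
        if confidence is not None and confidence < min_confidence:
            result.issues.append("low_confidence")
        results.append(result)
    return results
```

With pdfplumber, the input could be built inside a `with pdfplumber.open(path) as pdf:` block as `((page.extract_text() or "", None) for page in pdf.pages)`; an OCR engine would supply a real confidence value in place of `None`.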

Reliability practices

Wrap PDF operations in try/except; include filename, operation, and stack trace in logs.
Validate inputs (paths, expected fields, schemas) before processing.
Use structured logging with timestamps, operation, result, duration.
Never log sensitive PDF contents; sanitize any extracted data before sending it to external services.
For large jobs, checkpoint outputs (per-file artifacts) so you can resume after failures.
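
One way to combine these practices is a small batch runner, sketched below using only the standard library. The names `run_batch`, `process`, and `checkpoint_dir` are assumptions made for this example; `process` stands in for whatever PDF operation you actually run and is expected to return a JSON-serializable dict that contains no sensitive document contents.

```python
import json
import logging
import time
import traceback
from pathlib import Path

logger = logging.getLogger("pdf_batch")


def run_batch(paths, process, checkpoint_dir):
    """Run `process(path) -> dict` per file, skipping files finished earlier.

    `process` is a hypothetical caller-supplied callable. Checkpoints are
    keyed by file name, so this sketch assumes file names are unique.
    """
    checkpoint_dir = Path(checkpoint_dir)
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    summary = {"ok": [], "failed": [], "skipped": []}

    for path in map(Path, paths):
        marker = checkpoint_dir / f"{path.name}.json"
        if marker.exists():  # completed in a previous run, resume past it
            summary["skipped"].append(path.name)
            continue

        started = time.perf_counter()
        record = {"file": path.name, "operation": "process"}
        try:
            output = process(path)
        except Exception as exc:
            # Log what failed and where, but never the document contents.
            record.update(result="error", error=repr(exc),
                          stack=traceback.format_exc())
            summary["failed"].append(path.name)
        else:
            # The per-file artifact is written only after success.
            marker.write_text(json.dumps(output), encoding="utf-8")
            record["result"] = "ok"
            summary["ok"].append(path.name)

        record["duration_s"] = round(time.perf_counter() - started, 3)
        log = logger.error if record["result"] == "error" else logger.info
        log(json.dumps(record))  # the timestamp comes from the log formatter

    return summary
```

Because failed files leave no checkpoint, rerunning the same batch retries only the failures and skips everything that already succeeded.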

Troubleshooting tips

File not found/invalid: Confirm path/permissions; verify file is not encrypted/corrupted.
OCR quality issues: Increase DPI during rasterization, deskew, and denoise before OCR.
Table extraction errors: Try alternate parsers/settings; manually define column boundaries when automated detection fails.
Memory issues: Stream pages instead of loading whole documents; free resources after each file.
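
For the first tip, a cheap pre-flight check can rule out the most common file problems before a PDF library is involved. The sketch below is a heuristic under stated assumptions: `preflight` and its problem labels are hypothetical, the header and end-of-file checks only look for the `%PDF-` and `%%EOF` markers, and scanning for `/Encrypt` can misfire, so treat its result as a hint rather than a verdict.

```python
import os
from pathlib import Path


def preflight(path):
    """Return a list of problems found by inspecting the raw file bytes."""
    path = Path(path)
    if not path.is_file():
        return ["not_found"]
    if not os.access(path, os.R_OK):
        return ["not_readable"]

    problems = []
    # Reads the whole file for simplicity; for very large files, read only
    # the head and tail and scan for /Encrypt in chunks instead.
    data = path.read_bytes()
    if b"%PDF-" not in data[:1024]:
        problems.append("missing_pdf_header")
    if b"%%EOF" not in data[-1024:]:
        problems.append("missing_eof_marker")  # often a truncated file
    if b"/Encrypt" in data:
        problems.append("possibly_encrypted")  # heuristic, not a parser-level check
    return problems
```

An empty list means none of these checks fired; it does not guarantee that the file will parse.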

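Score

Total score: 60/100, based on repository quality metrics.

✓ SKILL.md: A SKILL.md file is included (+20)
○ LICENSE: A license is set (0/10)
✓ Description: The description is at least 100 characters long (+10)
○ Popularity: 100 or more GitHub stars (0/15)
○ Recent activity: Updated within the last 3 months (0/10)
○ Forks: Forked 10 or more times (0/5)
✓ Issue management: Fewer than 50 open issues
○ Language: A programming language is set (0/5)
✓ Tags: At least one tag is set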

Related Skills

changelog-automation

web-component-design

dbt-transformation-patterns

market-sizing-analysis

on-call-handoff-patterns

architecture-decision-records