← Back to list

ocr
by trpc-group
trpc-agent-go is a powerful Go framework for building intelligent agent systems using large language models (LLMs) and tools.
⭐ 832🍴 80📅 Jan 23, 2026
SKILL.md
name: ocr description: Extract text from images using Tesseract OCR
OCR Image Text Extraction Skill
Extract text from images using Tesseract OCR engine.
Capabilities
- Extract text from image files (PNG, JPG, JPEG, GIF, BMP, TIFF)
- Support for 100+ languages
- Optional image preprocessing for better accuracy
- Output in plain text or JSON format with confidence scores
Usage
Basic OCR
python3 scripts/ocr.py <image_file> <output_file>
With Options
# Specify language (default: eng)
python3 scripts/ocr.py image.png text.txt --lang eng
# Chinese text
python3 scripts/ocr.py image.png text.txt --lang chi_sim
# Multiple languages
python3 scripts/ocr.py image.png text.txt --lang eng+chi_sim
# With image preprocessing (improves accuracy)
python3 scripts/ocr.py image.png text.txt --preprocess
# JSON output with confidence scores
python3 scripts/ocr.py image.png output.json --format json
Download and OCR from URL
# OCR from remote image
python3 scripts/ocr_url.py <image_url> <output_file>
# With options
python3 scripts/ocr_url.py https://example.com/image.jpg text.txt --lang eng --preprocess
Parameters
image_file/image_url(required): Path to local image or image URLoutput_file(required): Path to output text/JSON file--lang: Language code (e.g., eng, chi_sim, jpn, fra, deu). Default: eng--preprocess: Apply image preprocessing (grayscale, thresholding) for better accuracy--format: Output format (text/json, default: text)
Common Languages
| Language | Code |
|---|---|
| English | eng |
| Chinese (Simplified) | chi_sim |
| Chinese (Traditional) | chi_tra |
| Japanese | jpn |
| Korean | kor |
| French | fra |
| German | deu |
| Spanish | spa |
| Russian | rus |
| Arabic | ara |
Supported Image Formats
PNG, JPG, JPEG, GIF, BMP, TIFF, WEBP
Dependencies
- Python 3.8+
- pytesseract
- Pillow (PIL)
- tesseract-ocr (system package)
Installation
# Python packages
pip install pytesseract Pillow
# Tesseract OCR engine
sudo apt-get install tesseract-ocr # Ubuntu/Debian
sudo yum install tesseract # CentOS/RHEL
brew install tesseract # macOS
Score
Total Score
90/100
Based on repository quality metrics
✓SKILL.md
SKILL.mdファイルが含まれている
+20
✓LICENSE
ライセンスが設定されている
+10
✓説明文
100文字以上の説明がある
+10
✓人気
GitHub Stars 500以上
+10
✓最近の活動
1ヶ月以内に更新
+10
✓フォーク
10回以上フォークされている
+5
✓Issue管理
オープンIssueが50未満
+5
✓言語
プログラミング言語が設定されている
+5
✓タグ
1つ以上のタグが設定されている
+5
Reviews
💬
Reviews coming soon
