image-generator

by aiskillstore

Security-audited skills for Claude, Codex & Claude Code. One-click install, quality verified.

102 stars · 3 forks · Jan 23, 2026

SKILL.md


name: image-generator
description: |
  Generate professional visuals using Gemini via browser automation with 6-gate quality control.
  Use when creating chapter illustrations, diagrams, or teaching visuals.
  NOT for stock photos or decorative images.
dependencies:
  - browser-use

Image Generator

Generate professional teaching visuals using Gemini 3 with multi-turn reasoning partnership.

Quick Start

# 1. Start browser (via browser-use skill)
bash .claude/skills/browser-use/scripts/start-server.sh

# 2. Navigate to Gemini
# Use browser_navigate to https://gemini.google.com/

# 3. Generate image from creative brief
# Paste creative brief → Wait 30-35s → Verify 6 gates → Download

Core Principles

  1. Reasoning over prediction - Creative briefs (Story/Intent/Metaphor) activate reasoning; pixel specs don't
  2. Multi-turn partnership - Teach Gemini your standards through principle-based feedback
  3. 6-gate quality - Explicit pass/fail before download
  4. Autonomous batch - No permission-asking between visuals

Input: Creative Brief Format

Receive from visual-asset-workflow:

## The Story
[Narrative about what's visualized]

## Emotional Intent
[What it should FEEL like]

## Visual Metaphor
[Universal concept for instant comprehension]

## Subject / Composition / Action / Location / Style
[Gemini 3 prompt structure]

## Color Semantics
Blue (#2563eb) = Authority | Green (#10b981) = Execution

## Typography Hierarchy
Largest: Key insight | Medium: Supporting | Smallest: Context

Do NOT convert to pixel specs - use as-is to activate reasoning.

Workflow (Per Visual)

| Step | Action | Tool |
|------|--------|------|
| 1 | Navigate to gemini.google.com | browser_navigate |
| 2 | Select "🍌 Create Image" | browser_click |
| 3 | Paste creative brief | browser_type |
| 4 | Wait 30-35 seconds | browser_wait_for |
| 5 | Verify 6 gates (below) | Visual inspection |
| 6 | If fail: Iterate with feedback (max 3) | browser_type |
| 7 | If pass: Download full size | browser_click |
| 8 | Copy to apps/learn-app/static/img/part-{N}/chapter-{NN}/ | Bash |
| 9 | Embed in lesson immediately | Edit |
| 10 | NEW CHAT for next visual | browser_navigate |
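
The destination in step 8 can be built programmatically. The following is a minimal sketch, not part of the skill itself: the helper name `image_destination` and the filename are hypothetical, and it assumes `{NN}` means a zero-padded two-digit chapter number while `{N}` is the plain part number.

```python
from pathlib import Path


def image_destination(part: int, chapter: int, filename: str,
                      root: str = "apps/learn-app/static/img") -> Path:
    """Build the target path for a downloaded image.

    Assumption: {NN} is a zero-padded two-digit chapter number and {N} is
    the unpadded part number; the skill does not state this explicitly.
    """
    return Path(root) / f"part-{part}" / f"chapter-{chapter:02d}" / filename


# Hypothetical filename, for illustration only.
print(image_destination(2, 5, "agent-layers.png").as_posix())
```

Keeping the path logic in one place makes it harder for images to land outside the part/chapter layout.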

Quality Gates (ALL Must Pass)

| Gate | Criterion | Fail Action |
|------|-----------|-------------|
| 1. Spelling | 99% accuracy (Y-Combinator, Kubernetes) | Iterate |
| 2. Layout | Proportions match prompt (2×2 not 3×1) | Iterate |
| 3. Color | Brand colors match (#2563eb not #002050) | Iterate |
| 4. Typography | Largest = key concept (not decoration) | Iterate |
| 5. Teaching | <5 sec concept grasp at target proficiency | Iterate |
| 6. Uniqueness | Not duplicate of existing chapter image | New chat |

Decision: ALL pass → Download | ANY fail → Iterate (max 3 tries)
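
The decision rule can be expressed as a small function. This is a sketch under stated assumptions rather than the skill's actual implementation: the names `GateResult` and `decide` are hypothetical, and returning "defer" once the third try fails is inferred from the "Deferred" line in the batch report, not stated in the gate table.

```python
from dataclasses import dataclass

GATES = ("Spelling", "Layout", "Color", "Typography", "Teaching", "Uniqueness")
MAX_TRIES = 3


@dataclass
class GateResult:
    gate: str
    passed: bool
    note: str = ""


def decide(results: list[GateResult], attempt: int) -> str:
    """Return 'download', 'iterate', 'new_chat', or 'defer'."""
    checked = {r.gate for r in results}
    missing = [g for g in GATES if g not in checked]
    if missing:
        # Every gate needs an explicit pass/fail before any decision.
        raise ValueError(f"gates not checked: {missing}")
    failed = [r.gate for r in results if not r.passed]
    if not failed:
        return "download"
    if attempt >= MAX_TRIES:
        return "defer"  # assumption: give up after the third failed try
    if "Uniqueness" in failed:
        return "new_chat"  # fail action for gate 6 in the table
    return "iterate"


results = [GateResult(g, g != "Color") for g in GATES]
print(decide(results, attempt=1))
```

Raising on an unchecked gate keeps the "explicit pass/fail before download" rule from being skipped silently.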

Iteration: Principle-Based Feedback

When a gate fails, provide teaching feedback that names the gate, describes what is wrong, and states the principle behind the fix:

Gate 4 FAILED: Typography hierarchy incorrect

The largest text is "$100K" (supporting detail) but should be "$3T"
(key insight students must grasp).

Increase '$3T' to dominant size. Reduce '$100K' to supporting size.
Information importance drives sizing.
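
The feedback above follows a fixed order: verdict, evidence, fix, principle. A minimal sketch of a formatter that enforces that order is shown here; the function name `gate_feedback` and its parameters are hypothetical, not part of the skill.

```python
def gate_feedback(gate: str, problem: str, evidence: str,
                  fix: str, principle: str) -> str:
    """Assemble feedback as: verdict, evidence, requested fix, principle."""
    return (
        f"{gate} FAILED: {problem}\n\n"
        f"{evidence}\n\n"
        f"{fix}\n"
        f"{principle}"
    )


message = gate_feedback(
    gate="Gate 4",
    problem="Typography hierarchy incorrect",
    evidence='The largest text is "$100K" (supporting detail) but should be "$3T".',
    fix="Increase '$3T' to dominant size. Reduce '$100K' to supporting size.",
    principle="Information importance drives sizing.",
)
print(message)
```

Ending on the principle, rather than on a pixel-level instruction, matches the skill's preference for reasoning over specs.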

Batch Mode

When invoked with "generate all visuals":

For EACH visual in list:
  A. NEW CHAT (context isolation)
  B. Generate (paste brief)
  C. Verify 6 gates
  D. Iterate if needed (max 3)
  E. Download when pass
  F. Embed in lesson
  G. Log "✅ N/M"
  H. NEXT (no stopping)

Never ask: "Continue?" "Pause here?" "Review?"

Report at END only:

BATCH COMPLETE
✅ Generated: 16/18
⚠️ Deferred: 2 (quality issues)
Location: apps/learn-app/static/img/part-{N}/
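
The batch loop and end-of-run report can be sketched as follows. This is an illustration under assumptions: `run_batch` and the `generate_one` callback are hypothetical names, the callback stands in for steps A through F (new chat, generate, verify, iterate, download, embed), and logging deferred items with ⚠️ is an assumption since the skill only specifies the ✅ log line.

```python
from typing import Callable


def run_batch(briefs: list[str], generate_one: Callable[[str], bool]) -> str:
    """Process every brief without pausing; report once at the end.

    generate_one is a hypothetical callback that returns True when the
    image passed all gates and was downloaded and embedded.
    """
    total = len(briefs)
    generated = 0
    for index, brief in enumerate(briefs, start=1):
        ok = generate_one(brief)
        if ok:
            generated += 1
        # Log progress only; never prompt the user between items.
        print(f"{'✅' if ok else '⚠️'} {index}/{total}")
    deferred = total - generated
    return "\n".join([
        "BATCH COMPLETE",
        f"✅ Generated: {generated}/{total}",
        f"⚠️ Deferred: {deferred} (quality issues)",
    ])


# Stub callback for illustration: the second brief never passes.
print(run_batch(["brief-1", "brief-2", "brief-3"], lambda b: b != "brief-2"))
```

A failed item is counted and skipped, so one deferred visual does not stop the rest of the batch.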

Proficiency Limits

| Level | Max Elements | Grasp Time |
|-------|--------------|------------|
| A2 | 5-7 | <5 sec |
| B1 | 7-10 | <10 sec |
| C2 | No limit | N/A |
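
These limits can be checked before a brief is sent. The sketch below is an assumption-laden illustration: the names `LIMITS` and `within_limit` are hypothetical, and it treats the upper end of each "Max Elements" range as the hard cap.

```python
from typing import Optional

# (max elements, grasp-time bound in seconds); None means no limit.
# Assumption: the upper end of each range in the table is the cap.
LIMITS: dict[str, tuple[Optional[int], Optional[int]]] = {
    "A2": (7, 5),
    "B1": (10, 10),
    "C2": (None, None),
}


def within_limit(level: str, element_count: int) -> bool:
    """Return True if a visual's element count fits the proficiency level."""
    max_elements, _grasp_seconds = LIMITS[level]
    return max_elements is None or element_count <= max_elements


print(within_limit("A2", 9))   # False: too dense for A2
print(within_limit("B1", 9))   # True
```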

Token Conservation (Batch Mode)

For batches of more than 8 visuals, condense each brief:

Original (250 tokens):

"Top Layer shows Coordinator at center top with label 'Orchestrator'
featuring conductor icon, with role 'Strategic oversight'..."

Condensed (80 tokens):

"Top Layer - Coordinator: Center top, 'Orchestrator' (conductor),
Role: 'Strategic oversight', Gold (#fbbf24), Large hexagon."

Keep: Story, Intent, Metaphor, Colors, Reasoning
Condense: Long examples → Short labels

Anti-Patterns

| Don't | Why |
|-------|-----|
| Accept first output without 6 gates | Quality standard violation |
| Ask permission between batch items | Breaks autonomous agency |
| Convert briefs to pixel specs | Defeats reasoning activation |
| Skip embedding step | Creates orphan images |
| Reuse same chat for next visual | Context contamination |

Session Interruption

If the session ends mid-batch, create a checkpoint:

# Checkpoint: Part {N}
Status: INTERRUPTED at 8/18

## Completed:
- ✅ Image 1: filename (embedded lesson-01.md)
- ✅ Image 2: filename (embedded lesson-02.md)

## Remaining:
- ⏳ Image 8: filename

On continuation: Read checkpoint → Resume → Update incrementally
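
A checkpoint in this format can be rendered from the batch state. The following is a minimal sketch: `render_checkpoint` is a hypothetical helper, the filenames are placeholders, and it assumes the number in the status line is the index of the first remaining image, which is how the example above reads.

```python
def render_checkpoint(part: int, total: int,
                      completed: list[tuple[str, str]],
                      remaining: list[str]) -> str:
    """Render checkpoint text from (filename, lesson) pairs and pending filenames.

    Assumption: "INTERRUPTED at K/M" uses K = index of the first remaining image.
    """
    next_index = len(completed) + 1
    lines = [
        f"# Checkpoint: Part {part}",
        f"Status: INTERRUPTED at {next_index}/{total}",
        "",
        "## Completed:",
    ]
    for i, (filename, lesson) in enumerate(completed, start=1):
        lines.append(f"- ✅ Image {i}: {filename} (embedded {lesson})")
    lines += ["", "## Remaining:"]
    for i, filename in enumerate(remaining, start=next_index):
        lines.append(f"- ⏳ Image {i}: {filename}")
    return "\n".join(lines) + "\n"


# Placeholder filenames, for illustration only.
text = render_checkpoint(
    part=3, total=3,
    completed=[("a.png", "lesson-01.md"), ("b.png", "lesson-02.md")],
    remaining=["c.png"],
)
print(text)
```

Rewriting the file after each completed image keeps the checkpoint current, so an interruption loses at most the visual in progress.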

Success Indicators

  • ✅ All 6 gates verified before download
  • ✅ Batch completion without permission-asking
  • ✅ Principle-based iteration feedback
  • ✅ Images organized by part/chapter
  • ✅ Immediate embedding (no orphans)
  • ✅ >85% production-ready rate

Score

Total Score

60/100

Based on repository quality metrics

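| Criterion | Requirement | Points |
|-----------|-------------|--------|
| SKILL.md | Includes a SKILL.md file | +20 |
| LICENSE | A license is set | 0/10 |
| Description | Description of 100 characters or more | 0/10 |
| Popularity | 100 or more GitHub stars | +5 |
| Recent activity | Updated within the last month | +10 |
| Forks | Forked 10 or more times | 0/5 |
| Issue management | Fewer than 50 open issues | +5 |
| Language | A programming language is set | +5 |
| Tags | At least one tag is set | +5 |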
