gemini-image
tyrchen / geektime-bootcamp-ai
⭐ 100 🍴 62 📅 January 19, 2026
SKILL.md
---
name: gemini-image
description: Reference guide for using the google-genai Python library to generate images with the gemini-3-pro-image-preview model. Use this skill when building new projects that need Gemini image generation, to learn the correct API patterns, configuration options, and best practices.
---
# Gemini Image Generation Guide
Reference for generating images with Google's `gemini-3-pro-image-preview` model.
## Language References
Load the appropriate reference based on the project's language:
| Language | Reference File |
|----------|------------------------|
| Python | `references/python.md` |
**Instructions:** When implementing Gemini image generation, read the corresponding language reference file for complete code patterns and examples.
---
## Model Information
| Property | Value |
|----------------------|------------------------------|
| Model ID | `gemini-3-pro-image-preview` |
| Cost | ~$0.134 per image (2K) |
| Max Reference Images | 5+ (high fidelity) |
| Resolutions | 1K, 2K, 4K |
## Supported Aspect Ratios
| Ratio | Use Case |
|--------|----------------------------|
| `1:1` | Square, social media posts |
| `2:3` | Portrait photos |
| `3:2` | Landscape photos |
| `3:4` | Portrait, mobile screens |
| `4:3` | Standard display |
| `4:5` | Instagram portrait |
| `5:4` | Large format |
| `9:16` | Vertical video, stories |
| `16:9` | Widescreen, presentations |
| `21:9` | Ultra-wide, cinematic |
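When the target dimensions are known in pixels, the ratio has to be mapped onto one of the supported values. The sketch below shows one way to do that; `closest_aspect_ratio` is a hypothetical helper, and comparing ratios on a log scale is a design choice of this example, not something the API prescribes.

```python
import math

# Ratios copied from the table above.
SUPPORTED_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3",
                    "4:5", "5:4", "9:16", "16:9", "21:9"]


def closest_aspect_ratio(width: int, height: int) -> str:
    """Return the supported ratio nearest to width:height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)

    def distance(ratio: str) -> float:
        w, h = ratio.split(":")
        # Log scale treats "twice as wide" and "twice as tall" symmetrically.
        return abs(math.log(int(w) / int(h)) - target)

    return min(SUPPORTED_RATIOS, key=distance)


print(closest_aspect_ratio(1920, 1080))
print(closest_aspect_ratio(1080, 1350))
```

The returned string can be passed as the aspect ratio setting of a generation request.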
## Image Sizes
| Size | Resolution | Use Case |
|------|------------|-------------------------------|
| `1K` | ~1024px | Thumbnails, previews |
| `2K` | ~2048px | Standard output (recommended) |
| `4K` | ~4096px | High-quality prints |
**Important:** Write the size with an uppercase "K": `"1K"`, `"2K"`, or `"4K"`. Lowercase forms ("1k", "2k", "4k") are rejected.
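Because the casing rule is easy to trip over when sizes come from user input or config files, it can help to normalize the value before sending a request. This is a minimal sketch; `normalize_image_size` is a hypothetical helper and not part of `google-genai`.

```python
VALID_SIZES = ("1K", "2K", "4K")


def normalize_image_size(value: str) -> str:
    """Uppercase and validate an image size such as '2k' or ' 4K '."""
    candidate = value.strip().upper()
    if candidate not in VALID_SIZES:
        raise ValueError(
            f"Unsupported image size {value!r}; expected one of {VALID_SIZES}"
        )
    return candidate


print(normalize_image_size("2k"))
```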
---
## Environment Setup
```bash
export GOOGLE_API_KEY='your-api-key-here'
```
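On the Python side, a missing key is easier to diagnose if it is checked before the first request. The sketch below assumes the `google-genai` package is installed for `make_client`; `load_api_key` and `make_client` are illustrative names, not library functions.

```python
import os


def load_api_key(env=None) -> str:
    """Read GOOGLE_API_KEY and fail early with a clear message if it is unset."""
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set; export it before creating the client"
        )
    return key


def make_client():
    """Create a client with the key passed explicitly (lazy import)."""
    from google import genai  # imported here so the key check works without the SDK

    return genai.Client(api_key=load_api_key())
```

Calling `make_client()` once at startup surfaces a configuration problem immediately rather than in the middle of a batch.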
---
## Core Capabilities
### 1. Text-to-Image Generation
Generate images from text descriptions with configurable aspect ratio and resolution.
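A minimal sketch of such a call follows. It assumes a recent `google-genai` release in which `types.ImageConfig` accepts both `aspect_ratio` and `image_size`, and that `GOOGLE_API_KEY` is set in the environment. The helpers `extract_image_bytes` and `generate_image` are illustrative names, and the output path is up to the caller.

```python
from pathlib import Path

MODEL_ID = "gemini-3-pro-image-preview"


def extract_image_bytes(response) -> list:
    """Collect raw bytes from every inline image part of the first candidate."""
    images = []
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return images
    content = getattr(candidates[0], "content", None)
    for part in getattr(content, "parts", None) or []:
        inline = getattr(part, "inline_data", None)
        if inline is not None and inline.data:
            images.append(inline.data)
    return images


def generate_image(prompt: str, out_path: str,
                   aspect_ratio: str = "16:9", image_size: str = "2K") -> Path:
    """Request one image and write the first returned image to out_path."""
    from google import genai          # lazy import: needs the google-genai package
    from google.genai import types

    client = genai.Client()           # picks up the API key from the environment
    response = client.models.generate_content(
        model=MODEL_ID,
        contents=prompt,
        config=types.GenerateContentConfig(
            response_modalities=["TEXT", "IMAGE"],
            image_config=types.ImageConfig(
                aspect_ratio=aspect_ratio,
                image_size=image_size,
            ),
        ),
    )
    images = extract_image_bytes(response)
    if not images:
        raise RuntimeError("No image generated")
    path = Path(out_path)
    path.write_bytes(images[0])
    return path
```

Keeping the response parsing in its own function makes it possible to test that part without network access.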
### 2. Style Transfer with Reference Images
Pass reference images to keep the style consistent across generations. The model accepts 5 or more reference images at high fidelity.
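One way to attach reference images is to send them as inline parts alongside the prompt. The sketch below assumes `types.Part.from_bytes` from `google-genai` and local image files; `reference_mime_type` and `build_contents` are hypothetical helper names.

```python
import mimetypes
from pathlib import Path


def reference_mime_type(path) -> str:
    """Guess the MIME type of a reference file and reject non-image files."""
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"{path} does not look like an image file")
    return mime


def build_contents(prompt: str, reference_paths) -> list:
    """Return a contents list: the prompt followed by one part per reference image."""
    from google.genai import types  # lazy import: needs the google-genai package

    contents = [prompt]
    for path in reference_paths:
        contents.append(
            types.Part.from_bytes(
                data=Path(path).read_bytes(),
                mime_type=reference_mime_type(path),
            )
        )
    return contents
```

The resulting list can be passed as `contents` to `client.models.generate_content`.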
### 3. Image Editing
Modify existing images based on text instructions (add/remove elements, style changes).
### 4. Batch Generation
Generate multiple style candidates or variations.
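A batch of style candidates can be sketched as a loop over prompt variations. Here `generate` is a stand-in for whatever function performs the actual API call and returns image bytes; `style_prompts` and `generate_batch` are illustrative names, and recording a failure as `None` instead of aborting is a choice made for this example.

```python
from typing import Callable, Dict, Iterable, List, Optional


def style_prompts(base_prompt: str, styles: Iterable[str]) -> List[str]:
    """Build one prompt per style by appending a style hint to the base prompt."""
    return [f"{base_prompt} Style: {style}." for style in styles]


def generate_batch(base_prompt: str, styles: Iterable[str],
                   generate: Callable[[str], bytes]) -> Dict[str, Optional[bytes]]:
    """Run `generate` once per style; a failed style maps to None."""
    styles = list(styles)
    results: Dict[str, Optional[bytes]] = {}
    for style, prompt in zip(styles, style_prompts(base_prompt, styles)):
        try:
            results[style] = generate(prompt)
        except Exception as exc:  # keep going so one failure does not lose the batch
            print(f"{style}: {exc}")
            results[style] = None
    return results
```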
---
## Prompt Engineering Tips
### Be Descriptive
```
Bad: "cat, sunset"
Good: "A fluffy orange tabby cat sitting on a wooden fence,
watching a vibrant sunset over rolling hills.
Warm golden and pink light illuminates the scene.
Photorealistic style with soft focus background."
```
### Specify Visual Elements
- **Lighting:** "soft morning light", "dramatic side lighting", "golden hour"
- **Style:** "oil painting", "watercolor", "3D render", "photorealistic"
- **Mood:** "serene", "dramatic", "whimsical", "mysterious"
- **Composition:** "close-up portrait", "wide landscape", "bird's eye view"
- **Camera:** "35mm lens", "shallow depth of field", "wide angle"
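The elements above can be assembled programmatically so that every prompt follows the same structure. This is only a sketch: `build_prompt` is a hypothetical helper, and the labelled-sentence format is an assumption of this example, not a format the model requires.

```python
def build_prompt(subject, *, lighting=None, style=None, mood=None,
                 composition=None, camera=None) -> str:
    """Join a subject with optional visual elements into one descriptive prompt."""
    details = [
        ("Lighting", lighting),
        ("Style", style),
        ("Mood", mood),
        ("Composition", composition),
        ("Camera", camera),
    ]
    sentences = [subject.strip().rstrip(".") + "."]
    for label, value in details:
        if value:  # skip elements the caller did not specify
            sentences.append(f"{label}: {value}.")
    return " ".join(sentences)


print(build_prompt(
    "A fluffy orange tabby cat sitting on a wooden fence",
    lighting="golden hour",
    style="photorealistic",
    camera="shallow depth of field",
))
```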
### For Style Transfer
When using reference images, be explicit about what to transfer:
- "Match the color palette and brushstroke style of the reference"
- "Keep the artistic mood and lighting from the reference image"
---
## Common Issues
| Issue | Solution |
|----------------------|-------------------------------------------------------------|
| "No image generated" | Check prompt for content policy violations; simplify prompt |
| "Invalid image_size" | Use uppercase: `"1K"`, `"2K"`, `"4K"` |
| "API key not found" | Set `GOOGLE_API_KEY` environment variable |
| Rate limits | Add delays between requests; use exponential backoff |
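The exponential backoff suggested for rate limits can be sketched as a small retry wrapper. The attempt count, base delay, cap, and jitter below are illustrative defaults rather than values recommended by the API, and the wrapper retries on any exception; in real code it is likely better to retry only on rate-limit errors.

```python
import random
import time


def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Delays that double each time: base, 2*base, 4*base, ... limited by cap."""
    return [min(cap, base * 2 ** i) for i in range(retries)]


def call_with_backoff(fn, attempts: int = 5, base: float = 1.0,
                      cap: float = 30.0, sleep=time.sleep):
    """Call fn() up to `attempts` times, sleeping longer after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays(attempts - 1, base, cap)
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = delays[attempt]
            # Small random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

A generation call can then be wrapped as `call_with_backoff(lambda: generate(prompt))`, where `generate` is whatever function issues the request.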
---
## Pricing Comparison
| Model | Cost per Image |
|----------------------------|----------------|
| gemini-3-pro-image-preview | ~$0.134 |
| gemini-2.5-flash-image | ~$0.039 |
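For budgeting, the approximate per-image figures above can be multiplied out. The sketch below uses only the numbers from the table; `estimate_cost` is a hypothetical helper, and real billing may differ by resolution and over time.

```python
# Approximate per-image prices in USD, copied from the table above.
COST_PER_IMAGE_USD = {
    "gemini-3-pro-image-preview": 0.134,
    "gemini-2.5-flash-image": 0.039,
}


def estimate_cost(model: str, image_count: int) -> float:
    """Rough cost in USD for image_count images, rounded to cents."""
    if image_count < 0:
        raise ValueError("image_count must not be negative")
    return round(COST_PER_IMAGE_USD[model] * image_count, 2)


for model in COST_PER_IMAGE_USD:
    print(model, estimate_cost(model, 100))
```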