# prompt-engineering-suite

by yonatangross

The Complete AI Development Toolkit for Claude Code — 159 skills, 34 agents, 20 commands, 144 hooks. Production-ready patterns for FastAPI, React 19, LangGraph, security, and testing.

29 stars · 4 forks · Jan 23, 2026

## SKILL.md

---
name: prompt-engineering-suite
description: Comprehensive prompt engineering with Chain-of-Thought, few-shot learning, prompt versioning, and optimization. Use when designing prompts, improving accuracy, managing prompt lifecycle.
version: 1.0.0
tags: [prompts, cot, few-shot, versioning, optimization, langfuse, dspy, 2026]
context: fork
agent: prompt-engineer
author: OrchestKit
user-invocable: false
---

# Prompt Engineering Suite

Design, version, and optimize prompts for production LLM applications.

## Overview

Use this skill when:

  • Designing prompts for new LLM features
  • Improving accuracy with Chain-of-Thought reasoning
  • Few-shot learning with example selection
  • Managing prompts in production (versioning, A/B testing)
  • Automatic prompt optimization with DSPy

## Quick Reference

### Chain-of-Thought Pattern

```python
from langchain_core.prompts import ChatPromptTemplate

COT_SYSTEM = """You are a helpful assistant that solves problems step-by-step.

When solving problems:
1. Break down the problem into clear steps
2. Show your reasoning for each step
3. Verify your answer before responding
4. If uncertain, acknowledge limitations

Format your response as:
STEP 1: [description]
Reasoning: [your thought process]

STEP 2: [description]
Reasoning: [your thought process]

...

FINAL ANSWER: [your conclusion]"""

cot_prompt = ChatPromptTemplate.from_messages([
    ("system", COT_SYSTEM),
    ("human", "Problem: {problem}\n\nThink through this step-by-step."),
])
```

### Few-Shot with Dynamic Examples

```python
from langchain_core.prompts import (
    ChatPromptTemplate,
    FewShotChatMessagePromptTemplate,
)

examples = [
    {"input": "What is 2+2?", "output": "4"},
    {"input": "What is the capital of France?", "output": "Paris"},
]

few_shot = FewShotChatMessagePromptTemplate(
    examples=examples,
    example_prompt=ChatPromptTemplate.from_messages([
        ("human", "{input}"),
        ("ai", "{output}"),
    ]),
)

final_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Answer concisely."),
    few_shot,
    ("human", "{input}"),
])
```
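
A library-free sketch of what the few-shot template expands to may help: each example becomes a human/ai message pair, followed by the live query. The `build_few_shot_messages` helper below is illustrative only, not part of LangChain:

```python
def build_few_shot_messages(
    system: str, examples: list[dict], query: str
) -> list[tuple[str, str]]:
    """Expand few-shot examples into alternating human/ai message pairs."""
    messages = [("system", system)]
    for ex in examples:
        messages.append(("human", ex["input"]))
        messages.append(("ai", ex["output"]))
    messages.append(("human", query))
    return messages

msgs = build_few_shot_messages(
    "You are a helpful assistant. Answer concisely.",
    [{"input": "What is 2+2?", "output": "4"}],
    "What is the capital of France?",
)
# 1 system + 2 example messages + 1 query = 4 messages
```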

### Prompt Versioning with Langfuse SDK v3

```python
from langfuse import Langfuse
# Note: Langfuse SDK v3 is OTEL-native (acquired by ClickHouse Jan 2026)

langfuse = Langfuse()

# Get versioned prompt with label
prompt = langfuse.get_prompt(
    name="customer-support-v2",
    label="production",  # production, staging, canary
    cache_ttl_seconds=300,
)

# Compile with variables
compiled = prompt.compile(
    customer_name="John",
    issue="billing question",
)
```

### DSPy 3.1.0 Automatic Optimization

```python
import dspy

class OptimizedQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.Predict("question -> answer")

    def forward(self, question):
        return self.generate(question=question)

def answer_match(example, prediction, trace=None):
    """Placeholder exact-match metric; replace with a task-specific check."""
    return example.answer.strip().lower() == prediction.answer.strip().lower()

# Optimize with MIPROv2 (recommended) or BootstrapFewShot
optimizer = dspy.MIPROv2(metric=answer_match)  # Data+demo-aware Bayesian optimization
optimized = optimizer.compile(OptimizedQA(), trainset=examples)

# Alternative: GEPA (July 2025) - Reflective Prompt Evolution
# Uses model introspection to analyze failures and propose better prompts
```

## Pattern Selection Guide

| Pattern | When to Use | Example Use Case |
|---|---|---|
| Zero-shot | Simple, well-defined tasks | Classification, extraction |
| Few-shot | Complex tasks needing examples | Format conversion, style matching |
| CoT | Reasoning, math, logic | Problem solving, analysis |
| Zero-shot CoT | Quick reasoning boost | Add "Let's think step by step" |
| ReAct | Tool use, multi-step | Agent tasks, API calls |
| Structured | JSON/schema output | Data extraction, API responses |
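
The zero-shot CoT row amounts to appending a reasoning trigger to an otherwise plain prompt. A minimal sketch (the `add_cot_trigger` helper is illustrative, not from any library):

```python
def add_cot_trigger(prompt: str, trigger: str = "Let's think step by step.") -> str:
    """Append a zero-shot CoT trigger unless one is already present."""
    if trigger.lower() in prompt.lower():
        return prompt
    return f"{prompt.rstrip()}\n\n{trigger}"

result = add_cot_trigger("Solve: 15% of 240")
# The prompt now ends with the reasoning trigger; calling it again is a no-op.
```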

## Key Decisions

| Decision | Recommendation |
|---|---|
| Few-shot examples | 3-5 diverse, representative examples |
| Example ordering | Most similar examples last (recency bias) |
| CoT trigger | "Let's think step by step" or explicit format |
| Prompt versioning | Langfuse with labels (production/staging) |
| A/B testing | 50+ samples, track via trace metadata |
| Auto-optimization | DSPy BootstrapFewShot for few-shot tuning |
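
The "most similar examples last" recommendation can be sketched without an embedding model by using token overlap as a stand-in similarity score. A production system would use embeddings; `token_overlap` and `order_examples` below are hypothetical helpers:

```python
def token_overlap(a: str, b: str) -> float:
    """Crude similarity: Jaccard overlap of lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def order_examples(examples: list[dict], query: str) -> list[dict]:
    """Sort ascending by similarity so the most similar example comes last."""
    return sorted(examples, key=lambda ex: token_overlap(ex["input"], query))

examples = [
    {"input": "What is 2+2?", "output": "4"},
    {"input": "What is the capital of France?", "output": "Paris"},
]
ordered = order_examples(examples, "What is the capital of Spain?")
# The capital question lands last: it shares the most tokens with the query,
# so recency bias works in its favor.
```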

## Anti-Patterns (FORBIDDEN)

```python
# NEVER hardcode prompts without versioning
PROMPT = "You are a helpful assistant..."  # No version control!

# NEVER use a single example for few-shot
examples = [{"input": "x", "output": "y"}]  # Too few!

# NEVER skip CoT for complex reasoning
response = llm.complete("Solve: 15% of 240")  # No reasoning!

# ALWAYS version prompts
prompt = langfuse.get_prompt("assistant", label="production")

# ALWAYS use 3-5 diverse examples
examples = [ex1, ex2, ex3, ex4, ex5]

# ALWAYS use CoT for math/logic
response = llm.complete("Solve: 15% of 240. Think step by step.")
```

## Detailed Documentation

| Resource | Description |
|---|---|
| references/chain-of-thought.md | CoT patterns, zero-shot CoT, self-consistency |
| references/few-shot-patterns.md | Example selection, ordering, formatting |
| references/prompt-versioning.md | Langfuse integration, A/B testing |
| references/prompt-optimization.md | DSPy, automatic tuning, evaluation |
| scripts/cot-template.py | Full Chain-of-Thought implementation |
| scripts/few-shot-template.py | Few-shot with dynamic example selection |
| scripts/jinja2-prompts.py | Jinja2 templates (2026): async, caching, LLM filters, Anthropic format |

Related skills:

  • langfuse-observability - Prompt management and A/B testing tracking
  • llm-evaluation - Evaluating prompt effectiveness
  • function-calling - Structured output patterns
  • llm-testing - Testing prompt variations

## Capability Details

### chain-of-thought

Keywords: CoT, step by step, reasoning, think, chain of thought

Solves:

  • Improve accuracy on complex reasoning tasks
  • Debug LLM reasoning process
  • Implement self-consistency with multiple CoT paths

### few-shot-learning

Keywords: few-shot, examples, in-context learning, demonstrations

Solves:

  • Format LLM output with examples
  • Handle complex tasks without fine-tuning
  • Select optimal examples for task

### prompt-versioning

Keywords: version, prompt management, A/B test, production prompt

Solves:

  • Manage prompts in production
  • A/B test prompt variations
  • Roll back to previous versions
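
A deterministic traffic split is one way to A/B test prompt variants: hashing a stable user ID means the same user always sees the same variant, while the variant name can be recorded as trace metadata for later analysis. A hedged sketch; `assign_variant` is a hypothetical helper, not a Langfuse API:

```python
import hashlib

def assign_variant(user_id: str, variants: list[str],
                   experiment: str = "prompt-ab") -> str:
    """Stable bucket assignment: same user_id always maps to the same variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

variants = ["customer-support-v1", "customer-support-v2"]
chosen = assign_variant("user-42", variants)
# Re-running for the same user returns the same variant every time.
```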

### prompt-optimization

Keywords: DSPy, optimize, tune, automatic prompt, OPRO

Solves:

  • Automatically optimize prompts
  • Find best few-shot examples
  • Improve accuracy without manual tuning

### zero-shot-cot

Keywords: zero-shot CoT, think step by step, reasoning trigger

Solves:

  • Quick reasoning boost without examples
  • Add "Let's think step by step" trigger
  • Improve accuracy on math/logic

### self-consistency

Keywords: self-consistency, multiple paths, voting, ensemble

Solves:

  • Generate multiple reasoning paths
  • Vote on most common answer
  • Improve reliability on hard problems
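
The self-consistency steps above reduce to sample-then-vote: draw several CoT completions at nonzero temperature and keep the most common final answer. A minimal sketch where the `sample_answer` callable stands in for an LLM call:

```python
from collections import Counter

def self_consistent_answer(sample_answer, n_paths: int = 5) -> str:
    """Sample n reasoning paths and return the majority-vote final answer."""
    answers = [sample_answer() for _ in range(n_paths)]
    return Counter(answers).most_common(1)[0][0]

# Stub standing in for an LLM sampled at temperature > 0
fake_samples = iter(["36", "36", "35", "36", "36"])
result = self_consistent_answer(lambda: next(fake_samples))
# Majority vote over the five paths selects "36"
```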
