helm-k8s-deployment

by Obsidian-Owl

The Open Platform for building Data Platforms. Ship faster. Stay compliant. Scale to Data Mesh.

0 stars · 11 forks · Jan 24, 2026

SKILL.md


name: helm-k8s-deployment
description: ALWAYS USE when working with Helm charts, Kubernetes deployments, kubectl commands, pod debugging, or container logs. Provides context-efficient strategies for chart development, K8s troubleshooting, and log analysis. MUST be loaded before any Helm or K8s work.
allowed-tools: Read, Grep, Glob, Bash, WebSearch

Helm & Kubernetes Deployment (Research-Driven)

Philosophy

This skill does NOT dump logs into context. Instead, it guides you to:

  1. Research the current Helm/K8s state efficiently
  2. Extract only relevant log lines (not full logs)
  3. Diagnose issues with targeted commands
  4. Preserve context by summarising rather than copying

CRITICAL: Context-Efficient Log Analysis

NEVER dump full logs into context. Instead:

# ✅ GOOD: Filter the last 20 lines down to error lines only
kubectl logs <pod> --tail=20 2>&1 | grep -iE "error|fail|exception"

# ✅ GOOD: Get events (more useful than logs for debugging)
kubectl get events --sort-by='.lastTimestamp' | tail -20

# ✅ GOOD: Check pod status first (often enough)
kubectl get pods -o wide

# ❌ BAD: Full log dump (burns context)
kubectl logs <pod>
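
The filter-then-cap pattern above can be wrapped in a small shell function so it is applied the same way every time. This is a minimal sketch: the name `errlines`, the default cap of 20 lines, and the pattern list are assumptions for illustration, not part of the skill.

```shell
# Hypothetical helper: keep only error-like lines from stdin, capped at N lines.
# Usage: kubectl logs <pod> --tail=200 2>&1 | errlines 20
errlines() {
  max="${1:-20}"
  # -i: case-insensitive, -E: extended regex.
  # tail keeps the most recent matches when there are more than $max.
  grep -iE 'error|fail|exception' | tail -n "$max"
}
```

Because the function reads stdin, it works equally well on `kubectl logs` output or on a saved log file.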

Pre-Implementation Research Protocol

Step 1: Verify Cluster State

ALWAYS run this first (small output, high signal):

# Quick cluster health check
kubectl cluster-info 2>&1 | head -5

# Check namespace pods status
kubectl get pods -n <namespace> -o wide

# Recent events (usually reveals issues)
kubectl get events --sort-by='.lastTimestamp' -n <namespace> | tail -15
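
When a namespace holds many pods, even the pod listing can be condensed further. The sketch below counts pods per status; the function name `pod_status_summary` is hypothetical, and it assumes the default `kubectl get pods` column order (NAME, READY, STATUS, ...) with headers suppressed via `--no-headers`.

```shell
# Hypothetical helper: summarise `kubectl get pods --no-headers` output on stdin
# as "<count> <status>" lines, sorted by status name.
# Assumes STATUS is the third whitespace-separated column.
# Usage: kubectl get pods -n <namespace> --no-headers | pod_status_summary
pod_status_summary() {
  awk '{ count[$3]++ } END { for (s in count) print count[s], s }' | sort -k2
}
```

If every pod reports `Running`, the one-line summary is often all that needs to enter context.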

Step 2: Helm Chart Validation (Before Deploy)

# Lint chart
helm lint charts/<chart-name>

# Dry-run template rendering
helm template charts/<chart-name> --debug 2>&1 | head -100

# Validate manifests
helm template charts/<chart-name> | kubectl apply --dry-run=client -f -
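
The three checks can also be chained so that validation stops at the first failure and prints only the tail of that step's output. This is a sketch under stated assumptions: `validate_chart` is a hypothetical name, and the 20-line cap is arbitrary.

```shell
# Hypothetical wrapper: lint, render, then client-side dry-run a chart.
# Stops at the first failing step and prints at most 20 lines of its output.
# Usage: validate_chart charts/<chart-name>
validate_chart() {
  chart="$1"
  errs="$(mktemp)"
  # Step 1: lint; capture both streams so findings are not lost.
  if ! out="$(helm lint "$chart" 2>&1)"; then
    echo "lint failed:"; printf '%s\n' "$out" | tail -n 20
    rm -f "$errs"; return 1
  fi
  # Step 2: render; keep stderr separate so it never mixes into the manifests.
  if ! rendered="$(helm template "$chart" 2>"$errs")"; then
    echo "template failed:"; tail -n 20 "$errs"
    rm -f "$errs"; return 1
  fi
  # Step 3: client-side dry run of the rendered manifests.
  if ! out="$(printf '%s\n' "$rendered" | kubectl apply --dry-run=client -f - 2>&1)"; then
    echo "dry-run failed:"; printf '%s\n' "$out" | tail -n 20
    rm -f "$errs"; return 1
  fi
  rm -f "$errs"
  echo "ok: $chart"
}
```

On success the function emits a single line, which keeps a passing validation from consuming context.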

Step 3: Targeted Debugging (Context-Efficient)

For pod issues, use this escalation:

  1. Status check (no logs needed):

    kubectl describe pod <pod> | grep -A 20 "Events:"
    
  2. Recent logs only:

    kubectl logs <pod> --tail=30 --since=5m
    
  3. Error extraction:

    kubectl logs <pod> --tail=500 2>&1 | grep -iE "error|exception|fatal" | tail -20
    
  4. Container-specific (for multi-container pods):

    kubectl logs <pod> -c <container> --tail=20
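
Error extraction often returns the same message many times over. One way to summarise rather than copy is to collapse duplicates into counts, as in the sketch below. The name `top_errors` and the default cap of 10 are assumptions, and the approach only helps when repeated lines are byte-identical (for example, when they carry no per-line timestamp).

```shell
# Hypothetical helper: collapse repeated lines on stdin into "<count> <line>",
# most frequent first, capped at N distinct lines.
# Usage: kubectl logs <pod> --tail=500 2>&1 | grep -iE "error|fatal" | top_errors 5
top_errors() {
  max="${1:-10}"
  # uniq -c pads the count with leading spaces; sed strips that padding.
  sort | uniq -c | sort -rn | head -n "$max" | sed 's/^ *//'
}
```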
    

Floe-Runtime Chart Structure

charts/
├── floe-runtime/       # Umbrella chart
├── floe-dagster/       # Dagster webserver/daemon
├── floe-cube/          # Cube semantic layer
└── floe-infrastructure/ # PostgreSQL, MinIO, Polaris

Common Debugging Patterns

Symptom                First Command                                                                       Note
---------------------  ----------------------------------------------------------------------------------  ----------------
Pod CrashLoopBackOff   kubectl describe pod <x> | grep -A10 Events                                         Don't dump logs
Pod Pending            kubectl describe pod <x> | grep -A5 Conditions                                      Check resources
ImagePullBackOff       kubectl describe pod <x> | grep -A3 Warning                                         Check image name
Service not reachable  kubectl get endpoints <svc>                                                         Check selectors
Helm install fails     helm install <release> charts/<chart-name> --debug --dry-run 2>&1 | tail -50        Don't dump all
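
The pod rows of this table can be expressed as a lookup that prints the first command to run for a given status. This is an illustrative sketch: `first_command` is a hypothetical name, and the fallback to recent events for unlisted statuses is an assumption borrowed from the events command shown earlier in this skill.

```shell
# Hypothetical lookup: print (not run) the first command for a pod status.
# Usage: first_command <status> <pod>
first_command() {
  status="$1"; name="$2"
  case "$status" in
    CrashLoopBackOff) echo "kubectl describe pod $name | grep -A10 Events" ;;
    Pending)          echo "kubectl describe pod $name | grep -A5 Conditions" ;;
    ImagePullBackOff) echo "kubectl describe pod $name | grep -A3 Warning" ;;
    # Assumed fallback for statuses the table does not cover.
    *)                echo "kubectl get events --sort-by='.lastTimestamp' | tail -20" ;;
  esac
}
```

Printing the command instead of executing it leaves the decision to run it with the caller.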

Context Injection (For Subagent Delegation)

When spawning the docker-log-analyser agent:

Analyse logs for [pod-name] focusing on:
- Startup failures
- Connection errors to [service]
- Specific error: [paste only the error line, not full log]

Return ONLY:
1. Root cause (1-2 sentences)
2. Suggested fix
3. Commands to verify fix
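
If the delegation prompt is built repeatedly, the template above can be filled in by a small function. This is a sketch only: `delegation_prompt` and its three positional arguments are hypothetical, and how the resulting text is handed to the agent is outside its scope.

```shell
# Hypothetical helper: fill the delegation template with a pod name,
# a service name, and a single error line (never a full log).
# Usage: delegation_prompt <pod-name> <service> '<error line>'
delegation_prompt() {
  pod="$1"; service="$2"; error_line="$3"
  cat <<EOF
Analyse logs for $pod focusing on:
- Startup failures
- Connection errors to $service
- Specific error: $error_line

Return ONLY:
1. Root cause (1-2 sentences)
2. Suggested fix
3. Commands to verify fix
EOF
}
```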

Quick Reference: Common Research Queries

WebSearch patterns (use when a chart, setting, or error message is unfamiliar):

  • "Helm [chart-name] values.yaml reference 2025"
  • "Kubernetes [error-message] troubleshooting"
  • "Dagster Helm chart configuration 2025"

Integration with Floe Skills

When working on...   Also consider...
-------------------  ----------------------------------
Dagster deployment   dagster-skill (for asset config)
Cube deployment      cube-skill (for API endpoints)
Polaris in K8s       polaris-skill (for catalog config)

Summary: Context Preservation Rules

  1. Never dump full logs — extract error lines only
  2. Use kubectl describe before kubectl logs
  3. Use --tail=N on all log commands
  4. Delegate to docker-log-analyser agent for deep analysis
  5. Summarise findings rather than pasting output
