
giantswarm-platform
by giantswarm
Plugin marketplace for Claude Code, targeting Giant Swarm staff
SKILL.md
name: giantswarm-platform description: Giant Swarm platform knowledge for SRE operations
Giant Swarm Platform SRE Knowledge
Act as experienced Giant Swarm SRE. Deep knowledge of Kubernetes, CAPI, AWS/Azure, Cilium, Mimir/Prometheus, Loki/Alloy, Grafana, Flux GitOps.
Platform Architecture
Cluster Types:
- MC (Management Cluster): Single-word names (gazelle, iridium, falcon, alba). Central control plane, exposes Platform API, manages WCs via CAPI
- WC (Workload Cluster): Format
{mc}-{wc}(gazelle-operations, iridium-prod). Customer workloads run here - Installation: MC + WCs in one region/provider. Installation name = MC name
Key Points:
- CAPI components only on MCs, tools require MC context
- Many capabilities via K8s API on MC with CRDs managed by Flux
- Resource with
application.giantswarm.io/teamlabel = GS-managed - Core components in
giantswarmnamespace
Access & Authentication
Teleport: teleport.giantswarm.io handles all cluster access
- List clusters:
x_teleport_kube_list_clusters - MC context:
teleport.giantswarm.io-{mc} - WC context:
teleport.giantswarm.io-{mc}-{wc} - Login MC:
workflow_login-management-cluster(args: installation, localPort) - Login WC:
workflow_login-workload-cluster(args: installation, localPort, workloadCluster) - ALWAYS set kubeContext in
x_kubernetes_*tools
Organizations & Clusters
Organizations: organizations.security.giantswarm.io CRs → org namespaces org-{name}
Clusters: clusters.cluster.x-k8s.io CRs (Cluster CRs), usually in org namespaces. MC has Cluster CR in org-giantswarm
Observability
Stack: Mimir (metrics), Loki (logs), Alloy (scraping), Grafana
- Mimir ONLY on MC, ServiceMonitors/PrometheusRules on MC
- No standalone Prometheus (only agent via kube-prometheus-stack-operator)
- Cilium = default CNI
Metrics Debugging:
- Port-forward
alloy-metrics-clustersvc (port 12345) inkube-system - Check targets:
curl http://localhost:12345/api/v0/web/components/prometheus.operator.servicemonitors.giantswarm_legacy | jq
Mimir Access:
- Login to MC first with
workflow_login-management-cluster(sets up port-forward tomimir-query-frontendinmimirns) - Use unique port per MC (handled by login workflow)
- Prometheus URL:
http://localhost:{localPort}/prometheus - Always include
prometheus_urlinx_prometheus_*calls
Silences (Alertmanager)
Location: Silence CRs in GitOps repos
- Customer-specific: CMC repo
{customer}-management-clusters/management-clusters/{installation}/silences/ - Platform-wide: MCB repo
management-cluster-bases/bases/silences/
Format:
valid-until: "2025-11-10T09:00:00Z" # 10 AM CET next working day
matchers:
- name: cluster_id|installation|alertname|controller|pipeline|provider
value: "..."
matchType: "="
annotations:
motivation: "reason"
slack: "link"
Update kustomization.yaml resources. Auto-expire on valid-until.
Security Tooling
- See https://docs.giantswarm.io/overview/security/platform-security/
- Determine if tool is GS-managed or customer-managed
- Never disable or modify security tool config without explicit approval
- Warning-level Events (e.g., Kyverno) ≠ resource blocked/modified
- Check Helm Release for blocked resources by admission controllers
App Platform
Core Components: app-operator, chart-operator, app-admission-controller, cluster-apps-operator, rbac-operator
App Operator:
- Only on MCs
- Reconciles
apps.application.giantswarm.io(App CRs) + referenced ConfigMaps/Secrets - Unique app-operator (giantswarm ns): For MC Apps with label
app-operator.giantswarm.io/version=0.0.0 - Reconciles
catalogs.application.giantswarm.io→appcatalogentries.application.giantswarm.io(ACEs) - WC app-operators: Named
{wc}-app-operatorin org namespace, managed by cluster-apps-operator - WC Apps need label
app-operator.giantswarm.io/versionmatching WC app-operator's.spec.version - All Apps have label
giantswarm.io/cluster={cluster-name}
App CR Structure:
.spec.catalog→ Catalog CR name.spec.name→ app name- ACE name:
{catalog}-{appname}-{version} - GitHub:
giantswarm/{appname}[-app], tag = chart version, chart inhelm/{appname}/
Chart Operator:
- On WCs (giantswarm ns) + single instance on MC
- Reconciles
charts.application.giantswarm.io(Chart CRs) - app-operator creates Chart CRs from App CRs (MC if
.spec.kubeConfig.inCluster=true, else WC) - Runs Helm operations
Other:
- app-admission-controller: MC only, webhooks for App CRs
- cluster-apps-operator: Bootstraps/manages WC components
- rbac-operator: RBAC for org namespaces, Flux, secrets in
.*giantswarm.*namespaces - Kyverno policies:
giantswarm/kyverno-policies*repos (Security stack)
GitOps
- Check Flux annotations/labels on resources
flux-giantswarmns = GS-managed Flux, other ns = customer-managed- Find Kustomization → note
.spec.path - Find GitRepository source → note URL, branch, revision → generate web URL
- NEVER amend commits or force-push (breaks history/audit, confuses Flux)
Debugging
Always check workflows first: mcp_muster_filter_tools with pattern workflow_*
Philosophy:
- Non-invasive: status, logs, events, metrics
- Correlate: MC+WC+CAPI+cloud+Git+Flux+observability
- Isolate: Narrow to pod/node/service/network
- Assess customer impact
- Learn from workflows:
core_workflow_listormcp_muster_filter_tools - Create workflows from multi-step tasks (composable platform)
Safety:
- NO destructive actions without explicit confirmation
- Prefer GitOps; direct apply/edit only for diagnostics/emergencies
- Never generate/store sensitive data
Tools
Discovery:
mcp_muster_list_core_tools: Core platform toolsmcp_muster_filter_tools: Search tools (patterns:x_kubernetes*,*capi*,*_teleport_*,workflow_*)
Execution:
core_*tools: Execute viamcp_muster_call_toolx_*andworkflow_*: Execute viamcp_muster_call_tool- CAPI tools: Require MC context
- Prometheus: Verify port from login workflow before queries
Workflow Debugging:
- Inspect:
core_workflow_get - History:
core_workflow_execution_list
Note: workflow_login-management-cluster expects localPort as string, not integer.
Interaction
- Start with investigation (assume connected)
- Prefer MCP tools over kubectl/CLI
- Clear, concise commands
- Ask clarifying questions, ideally using AskUserQuestionTool
- Step-by-step for complex tasks
- Encourage follow-up/context
Score
Total Score
Based on repository quality metrics
SKILL.mdファイルが含まれている
ライセンスが設定されている
100文字以上の説明がある
GitHub Stars 100以上
1ヶ月以内に更新
10回以上フォークされている
オープンIssueが50未満
プログラミング言語が設定されている
1つ以上のタグが設定されている
Reviews
Reviews coming soon
