← スキル一覧に戻る

observability-setup
by patricio0312rev
Comprehensive library of +100 production-ready development skills covering every aspect of modern software engineering. From project setup to production deployment, from security hardening to performance optimization.
⭐ 6🍴 0📅 2026年1月19日
SKILL.md
name: observability-setup description: Implements comprehensive observability with OpenTelemetry tracing, Prometheus metrics, and structured logging. Includes instrumentation plans, sample dashboards, and alert candidates. Use for "observability", "monitoring", "tracing", or "metrics".
Observability Setup
Implement the three pillars: Traces, Metrics, and Logs.
OpenTelemetry Tracing
// tracing.ts
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";
import { Resource } from "@opentelemetry/resources";
import { SemanticResourceAttributes } from "@opentelemetry/semantic-conventions";
import { registerInstrumentations } from "@opentelemetry/instrumentation";
import { HttpInstrumentation } from "@opentelemetry/instrumentation-http";
import { ExpressInstrumentation } from "@opentelemetry/instrumentation-express";
import { PrismaInstrumentation } from "@prisma/instrumentation";
const provider = new NodeTracerProvider({
resource: new Resource({
[SemanticResourceAttributes.SERVICE_NAME]: "my-api",
[SemanticResourceAttributes.SERVICE_VERSION]: "1.0.0",
}),
});
registerInstrumentations({
instrumentations: [
new HttpInstrumentation(),
new ExpressInstrumentation(),
new PrismaInstrumentation(),
],
});
provider.register();
// Custom spans
import { trace } from "@opentelemetry/api";
const tracer = trace.getTracer("my-app");
async function processOrder(orderId: string) {
const span = tracer.startSpan("processOrder");
span.setAttribute("order.id", orderId);
try {
await validateOrder(orderId);
await chargePayment(orderId);
await fulfillOrder(orderId);
span.setStatus({ code: 0 }); // OK
} catch (error) {
span.setStatus({ code: 2, message: error.message }); // ERROR
throw error;
} finally {
span.end();
}
}
Prometheus Metrics
// metrics.ts
import { Registry, Counter, Histogram, Gauge } from "prom-client";
const register = new Registry();
// HTTP request counter
export const httpRequestCounter = new Counter({
name: "http_requests_total",
help: "Total HTTP requests",
labelNames: ["method", "route", "status_code"],
registers: [register],
});
// HTTP request duration
export const httpRequestDuration = new Histogram({
name: "http_request_duration_seconds",
help: "HTTP request duration in seconds",
labelNames: ["method", "route", "status_code"],
buckets: [0.1, 0.5, 1, 2, 5, 10],
registers: [register],
});
// Active connections
export const activeConnections = new Gauge({
name: "active_connections",
help: "Number of active connections",
registers: [register],
});
// Business metrics
export const ordersProcessed = new Counter({
name: "orders_processed_total",
help: "Total orders processed",
labelNames: ["status"],
registers: [register],
});
// Middleware
app.use((req, res, next) => {
const start = Date.now();
res.on("finish", () => {
const duration = (Date.now() - start) / 1000;
const route = req.route?.path || "unknown";
httpRequestCounter.inc({
method: req.method,
route,
status_code: res.statusCode,
});
httpRequestDuration.observe(
{ method: req.method, route, status_code: res.statusCode },
duration
);
});
next();
});
// Metrics endpoint
app.get("/metrics", async (req, res) => {
res.set("Content-Type", register.contentType);
res.end(await register.metrics());
});
Structured Logging
// logger.ts
import pino from "pino";
export const logger = pino({
level: process.env.LOG_LEVEL || "info",
formatters: {
level: (label) => ({ level: label }),
},
base: {
service: "my-api",
environment: process.env.NODE_ENV,
},
});
// Usage
logger.info({ userId: "123", action: "login" }, "User logged in");
logger.error({ err: error, orderId: "456" }, "Order processing failed");
Sample Dashboard (Grafana)
{
"dashboard": {
"title": "API Overview",
"panels": [
{
"title": "Request Rate",
"targets": [{
"expr": "rate(http_requests_total[5m])"
}]
},
{
"title": "Error Rate",
"targets": [{
"expr": "rate(http_requests_total{status_code=~"5.."}[5m])"
}]
},
{
"title": "p95 Latency",
"targets": [{
"expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))"
}]
},
{
"title": "Active Connections",
"targets": [{
"expr": "active_connections"
}]
}
]
}
}
Alert Candidates
# alerts.yml
groups:
- name: api_alerts
interval: 30s
rules:
- alert: HighErrorRate
expr: rate(http_requests_total{status_code=~"5.."}[5m]) > 0.05
for: 5m
labels:
severity: critical
annotations:
summary: "High error rate detected"
- alert: HighLatency
expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 2
for: 10m
labels:
severity: warning
annotations:
summary: "p95 latency above 2s"
- alert: LowAvailability
expr: rate(http_requests_total{status_code="200"}[5m]) / rate(http_requests_total[5m]) < 0.95
for: 5m
labels:
severity: critical
annotations:
summary: "Availability below 95%"
Output Checklist
- OpenTelemetry tracing configured
- Prometheus metrics instrumented
- Structured logging implemented
- Sample dashboards created
- Alert rules defined
- Metrics endpoint exposed
- Instrumentation tested ENDFILE
スコア
総合スコア
70/100
リポジトリの品質指標に基づく評価
✓SKILL.md
SKILL.mdファイルが含まれている
+20
✓LICENSE
ライセンスが設定されている
+10
✓説明文
100文字以上の説明がある
+10
○人気
GitHub Stars 100以上
0/15
✓最近の活動
3ヶ月以内に更新
+5
○フォーク
10回以上フォークされている
0/5
✓Issue管理
オープンIssueが50未満
+5
○言語
プログラミング言語が設定されている
0/5
✓タグ
1つ以上のタグが設定されている
+5
レビュー
💬
レビュー機能は近日公開予定です


