production-observability

by carmentacollective

A heart-centered AI for creating at the speed of thought

2🍴 1📅 Jan 23, 2026

SKILL.md


name: production-observability
# prettier-ignore
description: Use when adding logging, error monitoring, metrics, Sentry, debugging production issues, or improving observability
version: 1.0.0

Structured Logging

Log with context that helps debug issues. Use structured JSON logging with consistent fields across the application.

// Include relevant context in every log
logger.info(
  {
    userId,
    action: 'subscription_created',
    subscriptionType,
    paymentMethod,
    amount,
  },
  'User subscribed successfully'
);

logger.error(
  {
    error,
    userId,
    operation: 'process_payment',
    paymentProvider: 'stripe',
    attemptCount,
  },
  'Payment processing failed'
);

logger.warn(
  {
    userId,
    resourceType: 'api_request',
    endpoint: '/api/data',
    responseTime: 2500,
    threshold: 1000,
  },
  'Slow API response detected'
);

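// Create a child logger once per request (requestId and userEmail are assumed to be in scope)
const requestLogger = logger.child({ requestId, userEmail });

requestLogger.info('Processing request');
// All subsequent logs include requestId + userEmail automatically
requestLogger.debug({ queryParams }, 'Executing database query');
requestLogger.info({ duration }, 'Request completed');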
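To make the child-logger pattern concrete, here is a minimal dependency-free sketch of a structured JSON logger with a `child` method. `createLogger` and its `sink` parameter are hypothetical names introduced for illustration; a production service would more likely use an established logging library, and this sketch only loosely imitates that style of API.

```typescript
// Minimal structured logger sketch (hypothetical; not a real library's API).
type Fields = Record<string, unknown>;
type Level = 'debug' | 'info' | 'warn' | 'error';
type LogFn = (fieldsOrMsg: Fields | string, msg?: string) => void;

interface Logger {
  debug: LogFn;
  info: LogFn;
  warn: LogFn;
  error: LogFn;
  child(bindings: Fields): Logger;
}

function createLogger(sink: (line: string) => void, bindings: Fields = {}): Logger {
  // Each entry is one JSON object: timestamp, level, bound context, call-site fields, message.
  const write = (level: Level): LogFn => (fieldsOrMsg, msg) => {
    const fields = typeof fieldsOrMsg === 'string' ? {} : fieldsOrMsg;
    const message = typeof fieldsOrMsg === 'string' ? fieldsOrMsg : (msg ?? '');
    sink(JSON.stringify({ time: new Date().toISOString(), level, ...bindings, ...fields, msg: message }));
  };
  return {
    debug: write('debug'),
    info: write('info'),
    warn: write('warn'),
    error: write('error'),
    // A child inherits the parent's bindings and adds its own.
    child: (extra: Fields) => createLogger(sink, { ...bindings, ...extra }),
  };
}

// Usage: lines are collected in memory here; a real service would write to stdout.
const lines: string[] = [];
const logger = createLogger((line) => { lines.push(line); });
const requestLogger = logger.child({ requestId: 'req-123', userEmail: 'user@example.com' });
requestLogger.info({ duration: 42 }, 'Request completed');
```

Because the context is bound once in `child`, call sites cannot forget to include it, which keeps field names consistent across the request.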

Error Monitoring

Capture exceptions with rich context that enables debugging. Use Sentry or similar tools to track errors in production.

try {
  await criticalOperation();
} catch (error) {
  logger.error({ error, userId, operation }, 'Critical operation failed');

  Sentry.captureException(error, {
    tags: {
      operation: 'payment_processing',
      provider: 'stripe',
    },
    extra: {
      userId,
      subscriptionId,
      attemptCount,
      paymentMethod,
    },
    level: 'error',
  });

  throw error; // Let error bubble to boundary
}

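Record breadcrumbs (for example with Sentry.addBreadcrumb) as a request progresses. Later, when an error occurs, the breadcrumbs show the sequence of events that led to it.

Set the user context once (for example with Sentry.setUser), typically after authentication. All subsequent errors include that user context automatically.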
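The breadcrumb and user-context ideas can be illustrated without any SDK. The sketch below keeps a bounded trail of recent events and attaches it, together with the current user, to an error report. `BreadcrumbTrail` and `buildReport` are hypothetical names; an error-monitoring SDK such as Sentry maintains this state internally, so this shows the concept only, not how any SDK implements it.

```typescript
interface Breadcrumb {
  category: string;
  message: string;
  timestamp: string;
}

class BreadcrumbTrail {
  private crumbs: Breadcrumb[] = [];
  private maxSize: number;

  constructor(maxSize: number = 20) {
    this.maxSize = maxSize;
  }

  add(category: string, message: string): void {
    this.crumbs.push({ category, message, timestamp: new Date().toISOString() });
    // Drop the oldest entry so the trail never grows without bound.
    if (this.crumbs.length > this.maxSize) {
      this.crumbs.shift();
    }
  }

  snapshot(): Breadcrumb[] {
    return [...this.crumbs];
  }
}

interface ErrorReport {
  message: string;
  user: { id: string } | null;
  breadcrumbs: Breadcrumb[];
}

// Combines the error with the context gathered before it happened.
function buildReport(error: Error, trail: BreadcrumbTrail, user: { id: string } | null): ErrorReport {
  return { message: error.message, user, breadcrumbs: trail.snapshot() };
}

// Usage: a trail capped at two entries keeps only the most recent events.
const trail = new BreadcrumbTrail(2);
trail.add('auth', 'User logged in');
trail.add('cart', 'Item added');
trail.add('payment', 'Checkout started');
const report = buildReport(new Error('Payment declined'), trail, { id: 'user-1' });
```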

Performance Monitoring

Track operation timing to catch performance degradation early.

// Trace important operations
const result = await Sentry.startSpan(
  {
    op: 'http.client',
    name: `${method} ${url}`,
  },
  async (span) => {
    span.setAttribute('http.method', method);
    span.setAttribute('http.url', url);
    span.setAttribute('user.id', userId);

    const response = await fetch(url, options);

    span.setAttribute('http.status_code', response.status);
    span.setStatus({ code: 1, message: 'Success' });

    return response;
  }
);

if (queryDuration > 1000) {
  logger.warn(
    {
      query: sanitizedQuery,
      duration: queryDuration,
      threshold: 1000,
    },
    'Slow query detected'
  );
}
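One way to obtain a duration like `queryDuration` is to wrap the operation in a timing helper. The following is a sketch under assumptions: `timed`, `onSlow`, and the injectable `now` clock are hypothetical names, and `Date.now` is used as the default clock for portability where a real service might prefer a monotonic timer.

```typescript
interface SlowReport {
  name: string;
  duration: number;
  threshold: number;
}

// Runs `operation`, measures elapsed time, and calls `onSlow` if it exceeds the threshold.
async function timed<T>(
  name: string,
  operation: () => Promise<T>,
  onSlow: (report: SlowReport) => void,
  threshold: number = 1000,
  now: () => number = Date.now,
): Promise<T> {
  const start = now();
  try {
    return await operation();
  } finally {
    // `finally` ensures operations that throw are timed as well.
    const duration = now() - start;
    if (duration > threshold) {
      onSlow({ name, duration, threshold });
    }
  }
}

// Usage with a fake clock so the example is deterministic:
// the simulated query advances time by 1500 ms against a 1000 ms threshold.
let fakeTime = 0;
const slowReports: SlowReport[] = [];
const pending = timed(
  'load_user',
  async () => {
    fakeTime += 1500;
    return 'ok';
  },
  (report) => { slowReports.push(report); },
  1000,
  () => fakeTime,
);
```

Passing the clock in as a parameter is a design choice that makes the slow path testable without real delays.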

Health Checks

Expose endpoints that verify system health and dependencies.

// Body of the health-check handler. `checks` is assumed to be the result of
// Promise.allSettled([...]) over the database, Redis, and external API probes, in that order.
const results = {
  status: checks.every(c => c.status === 'fulfilled') ? 'healthy' : 'degraded',
  timestamp: new Date().toISOString(),
  checks: {
    database: checks[0].status === 'fulfilled' ? 'ok' : 'failed',
    redis: checks[1].status === 'fulfilled' ? 'ok' : 'failed',
    externalAPI: checks[2].status === 'fulfilled' ? 'ok' : 'failed',
  },
};

logger.info({ health: results }, 'Health check completed');

return Response.json(results, {
  status: results.status === 'healthy' ? 200 : 503,
});
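For a complete picture of how the probe results might be gathered, here is a self-contained sketch. `runHealthChecks` and the `Probe` type are hypothetical, and the probes are stand-ins; it returns a plain `{ httpStatus, body }` object rather than a framework response so that it runs anywhere.

```typescript
type Probe = () => Promise<void>;

interface HealthReport {
  status: 'healthy' | 'degraded';
  timestamp: string;
  checks: Record<string, 'ok' | 'failed'>;
}

// Runs every probe concurrently. A rejected probe marks that dependency as failed
// without preventing the others from being reported.
async function runHealthChecks(
  probes: Record<string, Probe>,
): Promise<{ httpStatus: number; body: HealthReport }> {
  const entries = Object.entries(probes);
  const settled = await Promise.allSettled(entries.map(([, probe]) => probe()));

  const checks: Record<string, 'ok' | 'failed'> = {};
  settled.forEach((result, index) => {
    checks[entries[index][0]] = result.status === 'fulfilled' ? 'ok' : 'failed';
  });

  const healthy = settled.every((result) => result.status === 'fulfilled');
  return {
    httpStatus: healthy ? 200 : 503,
    body: {
      status: healthy ? 'healthy' : 'degraded',
      timestamp: new Date().toISOString(),
      checks,
    },
  };
}

// Usage with stand-in probes; real probes would ping the database, Redis, and so on.
const healthPromise = runHealthChecks({
  database: async () => {},
  redis: async () => {
    throw new Error('connection refused');
  },
});
```

`Promise.allSettled` is used instead of `Promise.all` so that one failing dependency still yields a full report of the others.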

Metrics and Monitoring

Track key business and system metrics that indicate application health.

// Track business metrics
Sentry.metrics.increment('user.signup', 1, {
  tags: { source: 'google_oauth', plan: 'free' },
});

Sentry.metrics.distribution('api.response_time', responseTime, {
  tags: { endpoint: '/api/chat', method: 'POST' },
  unit: 'millisecond',
});

Sentry.metrics.gauge('active_connections', connectionCount, {
  tags: { service: 'websocket' },
});

Error Boundaries

Let errors bubble to boundaries where they can be handled appropriately. Don't silently catch and hide errors.

// Inside the catch block of the top-level request handler (the boundary):
// report the error once, then map known error types to responses.
Sentry.captureException(error, {
  tags: { endpoint: '/api/process' },
});

if (error instanceof ValidationError) {
  return Response.json(
    { error: error.message },
    { status: 400 }
  );
}

if (error instanceof NotFoundError) {
  return Response.json(
    { error: error.message },
    { status: 404 }
  );
}

return Response.json(
  { error: 'Internal server error' },
  { status: 500 }
);
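The boundary pattern can be sketched end to end as follows. `ValidationError` and `NotFoundError` are referenced in the source without definitions, so the classes here are assumed stand-ins, and `toErrorResponse` and `withErrorBoundary` are hypothetical helper names. The `report` callback stands in for a call such as `Sentry.captureException`.

```typescript
// Stand-in error classes; the source uses these names without defining them.
class ValidationError extends Error {}
class NotFoundError extends Error {}

interface ErrorResponse {
  status: number;
  body: { error: string };
}

// Maps an error caught at the boundary to a response. Known client errors expose
// their message; anything else becomes a generic 500 so internals are not leaked.
function toErrorResponse(error: unknown): ErrorResponse {
  if (error instanceof ValidationError) {
    return { status: 400, body: { error: error.message } };
  }
  if (error instanceof NotFoundError) {
    return { status: 404, body: { error: error.message } };
  }
  return { status: 500, body: { error: 'Internal server error' } };
}

// Boundary wrapper: inner code simply throws; the boundary reports and maps the error.
async function withErrorBoundary<T>(
  handler: () => Promise<T>,
  report: (error: unknown) => void,
): Promise<T | ErrorResponse> {
  try {
    return await handler();
  } catch (error) {
    report(error);
    return toErrorResponse(error);
  }
}

// Usage: the handler throws, the boundary reports once and returns a 404.
const reported: unknown[] = [];
const outcome = withErrorBoundary(
  async () => {
    throw new NotFoundError('Order not found');
  },
  (error) => { reported.push(error); },
);
```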

Correlation IDs

Track requests across service boundaries using correlation IDs.

const requestLogger = logger.child({ correlationId });

// Pass correlation ID to downstream services
const response = await fetch(upstreamService, {
  headers: {
    'X-Correlation-ID': correlationId,
  },
});

// All logs include correlationId, making distributed tracing possible
requestLogger.info({ service: 'upstream' }, 'Called upstream service');
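The snippet above assumes `correlationId` already exists. A minimal sketch of where it might come from is shown below: reuse the caller's ID if the incoming request carries one, otherwise mint a new one. `resolveCorrelationId`, `withCorrelationId`, and `generateId` are hypothetical helpers, and the ID format is an arbitrary stand-in for a UUID.

```typescript
const CORRELATION_HEADER = 'x-correlation-id';

// Reuses the caller's correlation ID when present so the whole request chain
// shares one ID; otherwise mints a new one via the supplied generator.
function resolveCorrelationId(
  incoming: Record<string, string | undefined>,
  generate: () => string,
): string {
  // Header names are case-insensitive, so normalize before comparing.
  for (const [name, value] of Object.entries(incoming)) {
    if (name.toLowerCase() === CORRELATION_HEADER && value) {
      return value;
    }
  }
  return generate();
}

// Builds outgoing headers that carry the ID to the next service.
function withCorrelationId(
  headers: Record<string, string>,
  correlationId: string,
): Record<string, string> {
  return { ...headers, 'X-Correlation-ID': correlationId };
}

// Stand-in generator; a real service would likely use a UUID.
const generateId = (): string =>
  `req-${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 10)}`;

const inherited = resolveCorrelationId({ 'X-Correlation-ID': 'abc-123' }, generateId);
const minted = resolveCorrelationId({}, generateId);
const outgoing = withCorrelationId({ Accept: 'application/json' }, inherited);
```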

Monitor these signals to catch production issues early:

Error rates - Track 4xx and 5xx errors. Sudden spikes indicate problems.

Response times - P50, P95, P99 latencies. Degradation affects user experience.

Resource usage - CPU, memory, disk, network. Exhaustion causes failures.

External dependencies - API availability, database connections, third-party services.

Business metrics - User signups, purchases, key user actions. Drops indicate broken flows.
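To show why tail percentiles are tracked alongside the median, here is a small sketch that computes them from raw samples. It uses the nearest-rank definition of a percentile, which is one of several common definitions; `percentile` and `serverErrorRate` are hypothetical helper names, and the sample data is invented for illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error('percentile of an empty sample set is undefined');
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Fraction of responses with a 5xx status code.
function serverErrorRate(statusCodes: number[]): number {
  if (statusCodes.length === 0) {
    return 0;
  }
  const failures = statusCodes.filter((code) => code >= 500).length;
  return failures / statusCodes.length;
}

// Illustrative data: nine fast requests and one very slow one.
const latenciesMs = [120, 95, 110, 2300, 105, 130, 98, 101, 115, 99];
const p50 = percentile(latenciesMs, 50); // 105
const p95 = percentile(latenciesMs, 95); // 2300
const errorRate = serverErrorRate([200, 200, 500, 404, 200, 503, 200, 200, 200, 200]); // 0.2
```

The median (105 ms) looks healthy, while P95 exposes the 2300 ms outlier that one in ten users experienced.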

Before deploying code to production, verify:

Structured logging covers critical paths - Can you debug issues from logs alone?

Errors are captured to Sentry - Will you know when things break?

Performance is tracked - Can you identify slow operations?

Health checks are implemented - Can monitoring detect degraded state?

Correlation IDs flow through requests - Can you trace requests across services?

Alerts are configured - Will someone be notified when thresholds are exceeded?
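As an illustration of the last item, an alert rule can be expressed as a threshold plus a requirement that it be breached for several consecutive samples. This is a sketch of one possible rule shape, not any monitoring product's format; `AlertRule`, `forSamples`, and `shouldAlert` are hypothetical names.

```typescript
interface AlertRule {
  metric: string;
  threshold: number;
  // Consecutive breaching samples required before firing; must be at least 1.
  forSamples: number;
}

// Fires only when the most recent `forSamples` values all exceed the threshold,
// so a single noisy sample does not notify anyone.
function shouldAlert(rule: AlertRule, samples: number[]): boolean {
  if (rule.forSamples < 1) {
    throw new Error('forSamples must be at least 1');
  }
  if (samples.length < rule.forSamples) {
    return false;
  }
  return samples.slice(-rule.forSamples).every((value) => value > rule.threshold);
}

// Usage: alert when the 5xx rate stays above 5% for three samples in a row.
const errorRateRule: AlertRule = { metric: 'http.5xx_rate', threshold: 0.05, forSamples: 3 };
const sustained = shouldAlert(errorRateRule, [0.01, 0.08, 0.09, 0.07]);
const blip = shouldAlert(errorRateRule, [0.08, 0.01, 0.09]);
```

Here `sustained` is `true` because the last three samples all breach the threshold, and `blip` is `false` because the middle sample recovered.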

When investigating production problems:

Start with error tracking - Check Sentry for exceptions and error patterns.

Review structured logs - Filter by user, request, or correlation ID to trace execution.

Check metrics - Look for anomalies in response times, error rates, resource usage.

Verify external dependencies - Confirm third-party services are operational.

Reproduce locally - Use production logs to recreate the scenario.

Add instrumentation - If debugging is difficult, add more logging and redeploy.
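The "review structured logs" step can be sketched as a filter over newline-delimited JSON. This assumes logs are written one JSON object per line with a `correlationId` field, as in the examples earlier; `parseLogLines` and `traceRequest` are hypothetical helpers, and in practice a log platform's query language would usually do this work.

```typescript
type LogEntry = Record<string, unknown>;

// Parses newline-delimited JSON logs, skipping blank or malformed lines.
function parseLogLines(raw: string): LogEntry[] {
  const entries: LogEntry[] = [];
  for (const line of raw.split('\n')) {
    if (line.trim() === '') {
      continue;
    }
    try {
      const parsed: unknown = JSON.parse(line);
      if (typeof parsed === 'object' && parsed !== null && !Array.isArray(parsed)) {
        entries.push(parsed as LogEntry);
      }
    } catch {
      // Not JSON (for example a raw stack trace line); skip it.
    }
  }
  return entries;
}

// Returns the entries for one request, in the order they were written.
function traceRequest(entries: LogEntry[], correlationId: string): LogEntry[] {
  return entries.filter((entry) => entry.correlationId === correlationId);
}

// Usage with invented sample logs.
const rawLogs = [
  '{"level":"info","correlationId":"abc-123","msg":"Processing request"}',
  '{"level":"info","correlationId":"zzz-999","msg":"Processing request"}',
  'not json',
  '{"level":"error","correlationId":"abc-123","msg":"Payment processing failed"}',
].join('\n');
const trace = traceRequest(parseLogLines(rawLogs), 'abc-123');
```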

Treat observability as a core feature, not an afterthought:

Log before releasing - Instrumentation is harder to add after deployment.

Monitor what matters - Focus on user-impacting metrics, not vanity numbers.

Make logs searchable - Use consistent field names and structured data.

Review errors regularly - Sentry notifications should trigger investigation.

Celebrate transparency - Visible problems get fixed. Hidden problems accumulate.

