arcanea-performance-tuning

by frankxai

Open source agents, skills, and lore for AI-powered creative work. Transform your AI assistant into a creative companion.

1🍴 0📅 Jan 24, 2026

SKILL.md


name: arcanea-performance-tuning
description: Master the art of making systems fast. Profiling, optimization, caching, and the wisdom to know when performance matters and when it doesn't. Measure twice, optimize once.
version: 2.0.0
author: Arcanea
tags: [performance, optimization, profiling, speed, tuning, development]
triggers:

  • performance
  • optimization
  • slow
  • speed up
  • profiling
  • bottleneck

The Performance Tuning Codex

"Premature optimization is the root of all evil. But mature optimization is the root of all delight."


The Performance Philosophy

The Golden Rules

RULE 1: MEASURE FIRST
Don't guess where the bottleneck is.
Profile. Measure. Prove.

RULE 2: OPTIMIZE THE RIGHT THING
Roughly 80% of runtime is spent in 20% of the code.
Find that 20%.

RULE 3: SET TARGETS
"Faster" is not a goal.
"Under 200ms" is a goal.

RULE 4: REGRESSION PREVENTION
Performance is easy to lose.
Benchmark continuously.

The Optimization Hierarchy

╔═══════════════════════════════════════════════════════════════════╗
║                    OPTIMIZATION HIERARCHY                          ║
║              (Optimize in this order)                              ║
╠═══════════════════════════════════════════════════════════════════╣
║                                                                    ║
║   1. ALGORITHM         │ O(n²) → O(n log n) = massive wins       ║
║   2. DATA STRUCTURE    │ Right structure for access pattern       ║
║   3. I/O               │ Network, disk, database calls            ║
║   4. MEMORY            │ Allocation, garbage collection           ║
║   5. CPU               │ Hot loops, cache efficiency              ║
║                                                                    ║
║   (Don't optimize #5 if #1-4 are the problem)                     ║
║                                                                    ║
╚═══════════════════════════════════════════════════════════════════╝
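
A minimal sketch of levels 1 and 2 of the hierarchy, using duplicate detection as a stand-in problem. The function names are illustrative, not part of any library. The first version compares every pair, the second sorts first, and the third swaps the data structure for a set.

```python
def has_duplicates_quadratic(items):
    """O(n^2): compares every pair of items."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_sorted(items):
    """O(n log n): after sorting, equal items sit next to each other."""
    ordered = sorted(items)
    return any(a == b for a, b in zip(ordered, ordered[1:]))


def has_duplicates_set(items):
    """O(n) on average: a set gives constant-time membership checks."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

All three return the same answer; only the growth rate differs. The sorted version needs comparable items and the set version needs hashable ones, so the access pattern still decides which fits.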

Profiling

Types of Profiling

CPU PROFILING:
• What functions take the most time?
• Where are the hot paths?
• What's the call graph?

MEMORY PROFILING:
• Where is memory allocated?
• What's causing garbage collection?
• Are there memory leaks?

I/O PROFILING:
• What queries are slow?
• What network calls are made?
• What files are accessed?

TRACE PROFILING:
• What's the full request lifecycle?
• Where do requests spend time?
• What's the concurrency pattern?
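
As a rough illustration of the CPU and memory questions above, here is a sketch built on Python's standard `cProfile`, `pstats`, and `tracemalloc` modules. `build_squares` is a hypothetical workload, and the two helper names are assumptions made for this example.

```python
import cProfile
import io
import pstats
import tracemalloc


def build_squares(n):
    """Hypothetical workload: allocates a list of n squares."""
    return [i * i for i in range(n)]


def cpu_profile(func, *args):
    """Run func under cProfile; return (result, text report of top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 by cumulative time
    return result, stream.getvalue()


def memory_profile(func, *args):
    """Run func under tracemalloc; return (result, peak traced bytes)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

Calling `cpu_profile(build_squares, 100_000)` returns a report that can be printed or logged. I/O and trace profiling usually need database or tracing tooling and are not covered by this sketch.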

The Profiling Process

┌─────────────────────────────────────────────────────────────────┐
│                    THE PROFILING CYCLE                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   1. ESTABLISH BASELINE                                          │
│      Measure current performance                                 │
│      Record metrics: latency, throughput, resource usage         │
│                                                                  │
│   2. SET TARGET                                                   │
│      Define acceptable performance                               │
│      "P95 latency < 200ms"                                       │
│                                                                  │
│   3. PROFILE                                                      │
│      Identify bottlenecks                                        │
│      Focus on top 3 issues                                       │
│                                                                  │
│   4. HYPOTHESIZE                                                  │
│      Why is this slow?                                           │
│      What would make it faster?                                  │
│                                                                  │
│   5. OPTIMIZE                                                     │
│      Make ONE change                                             │
│      Keep it isolated                                            │
│                                                                  │
│   6. MEASURE                                                      │
│      Did it help?                                                │
│      Did it hurt anything else?                                  │
│                                                                  │
│   7. REPEAT                                                       │
│      Until target reached                                        │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
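
Steps 1, 2, and 6 of the cycle can be sketched as a small measurement helper. This is an illustration under assumptions: the function names are invented here, and the 200 ms figure simply echoes the target used above.

```python
import statistics
import time


def measure_latencies_ms(func, runs=200):
    """Call func repeatedly and return each call's wall-clock time in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def p95(samples):
    """95th percentile; quantiles(n=100) yields 99 cut points, index 94 is P95."""
    return statistics.quantiles(samples, n=100)[94]


def meets_target(samples, target_ms=200.0):
    """True when the measured P95 latency is under the target."""
    return p95(samples) < target_ms
```

Record `p95(...)` before a change as the baseline, make ONE change, then measure again with the same `runs` so the two numbers are comparable.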

Common Performance Patterns

The N+1 Query Problem

BAD: N+1 queries
┌──────────────────────────────────────────────────────────────┐
│ # Get all users (1 query)                                    │
│ users = db.query("SELECT * FROM users")                      │
│                                                              │
│ # For each user, get their orders (N queries)                │
│ for user in users:                                           │
│     orders = db.query(                                       │
│         "SELECT * FROM orders WHERE user_id = ?", user.id)   │
└──────────────────────────────────────────────────────────────┘

GOOD: Eager loading
┌──────────────────────────────────────────────────────────────┐
│ -- Single query with JOIN                                    │
│ SELECT users.*, orders.*                                     │
│ FROM users                                                   │
│ LEFT JOIN orders ON orders.user_id = users.id;               │
│                                                              │
│ -- Or batch loading                                          │
│ SELECT * FROM orders WHERE user_id IN (1, 2, 3, 4, 5);       │
└──────────────────────────────────────────────────────────────┘
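
To make the query counts concrete, here is a runnable sketch against an in-memory SQLite database. The schema, sample rows, and function names are assumptions made for this example; only the N+1 versus batch-loading pattern comes from the text above.

```python
import sqlite3


def make_db():
    """Create a tiny in-memory database with 3 users and 3 orders."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT);
        INSERT INTO users VALUES (1, 'ada'), (2, 'bob'), (3, 'cy');
        INSERT INTO orders VALUES (1, 1, 'pen'), (2, 1, 'ink'), (3, 2, 'pad');
    """)
    return conn


def orders_n_plus_one(conn):
    """1 query for the users plus 1 query per user."""
    users = conn.execute("SELECT id FROM users").fetchall()
    queries = 1
    result = {}
    for (user_id,) in users:
        rows = conn.execute(
            "SELECT item FROM orders WHERE user_id = ? ORDER BY id", (user_id,)
        ).fetchall()
        queries += 1
        result[user_id] = [item for (item,) in rows]
    return result, queries


def orders_batched(conn):
    """2 queries in total, however many users there are."""
    ids = [uid for (uid,) in conn.execute("SELECT id FROM users").fetchall()]
    result = {uid: [] for uid in ids}
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        "SELECT user_id, item FROM orders "
        f"WHERE user_id IN ({placeholders}) ORDER BY id",
        ids,
    ).fetchall()
    for user_id, item in rows:
        result[user_id].append(item)
    return result, 2
```

With 3 users the first version issues 4 queries and the second issues 2. The gap grows linearly with the number of users.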

Caching Strategies

╔═══════════════════════════════════════════════════════════════════╗
║                    CACHING STRATEGIES                              ║
╠═══════════════════════════════════════════════════════════════════╣
║                                                                    ║
║   CACHE-ASIDE (Lazy Loading)                                       ║
║   ┌─────────┐                                                      ║
║   │ Request │──┬──▶ Cache Hit ──▶ Return                          ║
║   └─────────┘  │                                                   ║
║                └──▶ Cache Miss ──▶ DB ──▶ Cache ──▶ Return        ║
║                                                                    ║
║   WRITE-THROUGH                                                    ║
║   ┌─────────┐                                                      ║
║   │  Write  │──▶ Cache ──▶ DB ──▶ Confirm                         ║
║   └─────────┘                                                      ║
║                                                                    ║
║   WRITE-BEHIND (Async)                                             ║
║   ┌─────────┐                                                      ║
║   │  Write  │──▶ Cache ──▶ Confirm                                ║
║   └─────────┘      │                                               ║
║                    └──▶ [Later] ──▶ DB                             ║
║                                                                    ║
╚═══════════════════════════════════════════════════════════════════╝

CACHE INVALIDATION:
• TTL (Time To Live) - Simple but may serve stale data
• Event-based - Invalidate on writes
• Tag-based - Group related items
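
A minimal cache-aside sketch with TTL expiry and explicit invalidation, kept in process memory. The class and method names are assumptions for illustration; a real deployment would more likely sit in front of Redis or Memcached.

```python
import time


class TTLCache:
    """Cache-aside: check the cache first, fall back to the loader on a miss."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._entries = {}           # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # cache hit
        value = loader(key)          # cache miss or expired: go to the source
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Event-based invalidation: call this when the underlying data changes."""
        self._entries.pop(key, None)
```

The TTL bounds how stale a value can get. Calling `invalidate` on writes removes the staleness window for that key at the cost of extra coupling between writer and cache.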

Connection Pooling

WITHOUT POOLING:
┌──────────┐     ┌──────────┐
│ Request  │──▶──│ Connect  │──▶ 50-100ms overhead
└──────────┘     └──────────┘

WITH POOLING:
┌──────────┐     ┌──────────────┐     ┌──────────┐
│ Request  │──▶──│ Pool Manager │──▶──│ Reuse    │──▶ ~0ms
└──────────┘     └──────────────┘     └──────────┘

POOL CONFIGURATION:
• Min connections: Keep warm for base load
• Max connections: Limit to prevent exhaustion
• Idle timeout: Release unused connections
• Connection lifetime: Prevent stale connections
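
The reuse idea can be sketched with a fixed-size pool built on `queue.Queue`. This is a simplification: it only enforces the max-connections limit and opens every connection up front, leaving out idle timeout and connection lifetime. `ConnectionPool` and its methods are illustrative names, and SQLite stands in for a networked database.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are created once and handed out for reuse."""

    def __init__(self, factory, max_size):
        self._idle = queue.Queue(maxsize=max_size)
        for _ in range(max_size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks while all connections are in use; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)     # always return the connection to the pool

    def close(self):
        """Close every idle connection."""
        while not self._idle.empty():
            self._idle.get_nowait().close()


def sqlite_factory():
    """Stand-in for an expensive connect call."""
    return sqlite3.connect(":memory:", check_same_thread=False)
```

Usage: `pool = ConnectionPool(sqlite_factory, max_size=5)`, then `with pool.connection() as conn: conn.execute("SELECT 1")`.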

Lazy Loading

EAGER (Load everything):
┌────────────────────────────────────────────────────────┐
│ class User:                                            │
│     def __init__(self, id):                            │
│         self.profile = load_profile(id)   # Always     │
│         self.orders = load_orders(id)     # Always     │
│         self.preferences = load_prefs(id) # Always     │
└────────────────────────────────────────────────────────┘

LAZY (Load on demand):
┌────────────────────────────────────────────────────────┐
│ class User:                                            │
│     def __init__(self, id):                            │
│         self._id = id                                  │
│         self._orders = None                            │
│                                                        │
│     @property                                          │
│     def orders(self):                                  │
│         if self._orders is None:                       │
│             self._orders = load_orders(self._id)       │
│         return self._orders                            │
└────────────────────────────────────────────────────────┘
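
In Python 3.8+ the same lazy pattern can be written with `functools.cached_property`, which also caches a result that happens to be `None` (the manual `is None` check would reload it every time). `load_orders` here is a hypothetical loader that counts its own calls so the effect is visible.

```python
from functools import cached_property


def load_orders(user_id):
    """Hypothetical loader; stands in for a database call."""
    load_orders.calls += 1
    return [f"order-{user_id}-1", f"order-{user_id}-2"]


load_orders.calls = 0  # call counter, for demonstration only


class User:
    def __init__(self, user_id):
        self._id = user_id           # nothing is loaded at construction time

    @cached_property
    def orders(self):
        # Runs on first access only; the result is stored on the instance.
        return load_orders(self._id)
```

Creating a `User` costs no loader calls, the first `user.orders` access costs one, and later accesses cost none.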

Database Optimization

Index Optimization

WHEN TO INDEX:
✓ Columns in WHERE clauses
✓ Columns in JOIN conditions
✓ Columns in ORDER BY
✓ Columns with high selectivity

WHEN NOT TO INDEX:
✗ Small tables (full scan is faster)
✗ Columns with low selectivity (gender, boolean)
✗ Tables with heavy writes (index maintenance cost)
✗ Columns rarely queried

COMPOSITE INDEX ORDER:
• Equality conditions first
• Range conditions last
• Most selective first

INDEX (status, created_at)  -- status = 'active' AND created_at > ?
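
To see the composite index take effect, the sketch below asks SQLite for its query plan before and after creating the index. SQLite is used only because it ships with Python; its plan wording differs from PostgreSQL's. The table, index name, and helper names are assumptions for this example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's EXPLAIN QUERY PLAN detail text for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)   # detail is the last column


def compare_plans():
    """Return (plan_before_index, plan_after_index) for the same query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, created_at TEXT)"
    )
    sql = "SELECT id FROM orders WHERE status = ? AND created_at > ?"
    params = ("active", "2024-01-01")
    before = query_plan(conn, sql, params)
    # Equality column first, range column last.
    conn.execute(
        "CREATE INDEX idx_orders_status_created ON orders (status, created_at)"
    )
    after = query_plan(conn, sql, params)
    conn.close()
    return before, after
```

The first plan should report a full table scan and the second a search using `idx_orders_status_created`.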

Query Optimization

EXPLAIN ANALYZE:
Always explain before optimizing.

┌────────────────────────────────────────────────────────────────┐
│ EXPLAIN ANALYZE                                                 │
│ SELECT * FROM orders                                            │
│ WHERE user_id = 123 AND status = 'pending'                      │
│ ORDER BY created_at DESC                                        │
│ LIMIT 10;                                                       │
│                                                                 │
│ Look for:                                                       │
│ • Seq Scan (bad on large tables)                               │
│ • Index Scan (good)                                            │
│ • Sort (expensive if not indexed)                              │
│ • Rows vs estimated rows (accuracy of stats)                   │
└────────────────────────────────────────────────────────────────┘

COMMON FIXES:
• Add missing indexes
• Rewrite subqueries as JOINs
• Use LIMIT for pagination
• Avoid SELECT * in production
• Partition large tables

Frontend Performance

Critical Rendering Path

┌─────────┐     ┌─────────┐     ┌─────────┐     ┌─────────┐
│  HTML   │──▶──│  CSS    │──▶──│   JS    │──▶──│ Render  │
│  Parse  │     │  Parse  │     │ Execute │     │  Paint  │
└─────────┘     └─────────┘     └─────────┘     └─────────┘
     │               │               │
     ▼               ▼               ▼
    DOM            CSSOM          Execute
   Build           Build         & Modify

OPTIMIZATION:
1. Minimize critical resources
2. Minimize critical bytes
3. Minimize critical path length

Core Web Vitals

LCP (Largest Contentful Paint):
Target: < 2.5s
• Optimize images
• Preload critical resources
• Use CDN

INP (Interaction to Next Paint; replaced FID as a Core Web Vital in March 2024):
Target: < 200ms
• Break up long tasks
• Defer non-critical JS
• Use web workers

CLS (Cumulative Layout Shift):
Target: < 0.1
• Set image dimensions
• Reserve space for ads
• Avoid inserting content above fold

Bundle Optimization

CODE SPLITTING:
// Instead of one large bundle
import { everything } from 'huge-library';

// Load on demand (assuming React's lazy)
import { lazy } from 'react';
const HeavyComponent = lazy(() => import('./HeavyComponent'));

TREE SHAKING:
// Bad: imports everything
import _ from 'lodash';

// Good: imports only what's used
import { debounce } from 'lodash-es';

COMPRESSION:
• Gzip: 70-90% reduction on text assets (HTML, CSS, JS)
• Brotli: typically 15-20% smaller than Gzip
• Enable on server and CDN
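
A quick way to check what compression buys on a given payload, sketched with Python's standard `gzip` module (Brotli is not in the standard library, so it is left out). `gzip_savings` and `SAMPLE_HTML` are names invented for this example, and the sample is deliberately repetitive, so real pages will compress less.

```python
import gzip


def gzip_savings(data: bytes, level: int = 6) -> float:
    """Return the fraction of bytes saved by gzip, between 0.0 and 1.0."""
    if not data:
        return 0.0
    compressed = gzip.compress(data, compresslevel=level)
    return 1.0 - len(compressed) / len(data)


# Hypothetical payload: repetitive markup, which compresses very well.
SAMPLE_HTML = b'<li class="item"><a href="/products">Product</a></li>\n' * 200
```

Run it against real build output before trusting any rule-of-thumb percentage.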

Concurrency & Parallelism

Async Patterns

SEQUENTIAL (Slow):
┌────────────────────────────────────────────────────────────────┐
│ const result1 = await fetchUser()      // 100ms                │
│ const result2 = await fetchOrders()    // 150ms                │
│ const result3 = await fetchProducts()  // 120ms                │
│ // Total: 370ms                                                │
└────────────────────────────────────────────────────────────────┘

PARALLEL (Fast):
┌────────────────────────────────────────────────────────────────┐
│ const [user, orders, products] = await Promise.all([           │
│     fetchUser(),                                               │
│     fetchOrders(),                                             │
│     fetchProducts()                                            │
│ ])                                                             │
│ // Total: 150ms (slowest call)                                 │
└────────────────────────────────────────────────────────────────┘
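
The same comparison in Python, sketched with `asyncio`. The `fetch` coroutine is a hypothetical stand-in that sleeps instead of making a network call, with delays mirroring the figures above.

```python
import asyncio
import time


async def fetch(name, delay_s):
    """Hypothetical fetcher: sleeps to simulate network latency."""
    await asyncio.sleep(delay_s)
    return name


async def sequential():
    """Each await finishes before the next starts: delays add up."""
    user = await fetch("user", 0.10)
    orders = await fetch("orders", 0.15)
    products = await fetch("products", 0.12)
    return [user, orders, products]


async def parallel():
    """gather runs the coroutines concurrently: total is about the slowest one."""
    return await asyncio.gather(
        fetch("user", 0.10),
        fetch("orders", 0.15),
        fetch("products", 0.12),
    )


def timed(coro_fn):
    """Run a coroutine function to completion; return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start
```

This only helps when the calls are independent. If `fetchOrders` needs the user first, that dependency has to stay sequential.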

Rate Limiting & Backpressure

RATE LIMITING:
┌────────────────────────────────────────────────────────────────┐
│ Token Bucket Algorithm:                                         │
│                                                                 │
│ • Bucket has capacity (e.g., 100 tokens)                       │
│ • Tokens added at fixed rate (e.g., 10/second)                 │
│ • Each request consumes a token                                │
│ • No tokens = request rejected                                 │
└────────────────────────────────────────────────────────────────┘
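
A compact sketch of the token bucket described above. The class name, the `allow` method, and the injectable `clock` parameter are assumptions made for illustration. It is not thread-safe as written.

```python
import time


class TokenBucket:
    """Token bucket: holds up to `capacity` tokens, refilled at a fixed rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock              # injectable so tests can control time
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self, cost=1.0):
        """Consume `cost` tokens if available; return False to reject."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding capacity.
        self._tokens = min(
            self.capacity, self._tokens + elapsed * self.refill_per_second
        )
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

The capacity sets how large a burst is tolerated, and the refill rate sets the sustained request rate.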

BACKPRESSURE:
┌────────────────────────────────────────────────────────────────┐
│ When producer is faster than consumer:                          │
│                                                                 │
│ Options:                                                        │
│ • Drop: Discard excess (lossy)                                 │
│ • Buffer: Queue until processed (memory risk)                  │
│ • Sample: Process every Nth item                               │
│ • Slow down: Signal producer to wait                           │
└────────────────────────────────────────────────────────────────┘
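
Two of the options above, "drop" and "slow down", can be sketched with a bounded `queue.Queue`. The function names are illustrative. The pipeline uses a blocking `put`, so a full buffer pauses the producer instead of growing memory.

```python
import queue
import threading


def offer_or_drop(buffer, item):
    """Drop strategy: discard the item when the buffer is full."""
    try:
        buffer.put_nowait(item)
        return True
    except queue.Full:
        return False


def run_pipeline(items, buffer_size):
    """Slow-down strategy: the producer blocks whenever the buffer is full."""
    buffer = queue.Queue(maxsize=buffer_size)
    processed = []
    sentinel = object()                  # marks the end of the stream

    def producer():
        for item in items:
            buffer.put(item)             # blocks while the buffer is full
        buffer.put(sentinel)

    def consumer():
        while True:
            item = buffer.get()
            if item is sentinel:
                break
            processed.append(item)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return processed
```

The buffer size caps memory use: at most `buffer_size` items wait at any moment, whatever the producer's speed.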

Monitoring & Metrics

Key Metrics

THE FOUR GOLDEN SIGNALS:
┌─────────────────────────────────────────────────────────────┐
│ 1. LATENCY    │ Time to serve a request                    │
│ 2. TRAFFIC    │ Requests per second                        │
│ 3. ERRORS     │ Rate of failed requests                    │
│ 4. SATURATION │ How "full" the service is                  │
└─────────────────────────────────────────────────────────────┘

PERCENTILES:
• P50 (median): Typical experience
• P95: 95% of requests are at least this fast
• P99: Tail latency (important!)
• Max: Absolute worst case

Note: Average is misleading.
      A few slow requests hide in the average.
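
A small worked example of why the average hides the tail. The latency data is made up, and `percentile` uses the nearest-rank definition (other definitions interpolate and give slightly different numbers).

```python
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100   # ceil(pct * n / 100) in integers
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in ms: 98 fast requests and 2 very slow ones.
LATENCIES_MS = [10] * 98 + [2000] * 2


def summarize(samples):
    """Return the mean next to the percentiles listed above."""
    return {
        "mean": statistics.fmean(samples),
        "p50": percentile(samples, 50),
        "p95": percentile(samples, 95),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }
```

For `LATENCIES_MS` the mean is 49.8 ms, a latency no request actually experienced. P50 and P95 are both 10 ms, and P99 is 2000 ms, which is where the two slow requests become visible.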

Benchmarking

MICRO-BENCHMARKS:
• Test specific functions
• Isolate from I/O
• Run many iterations
• Beware of JIT warmup

LOAD TESTING:
• Simulate realistic traffic
• Measure at various loads
• Find the breaking point
• Test failure scenarios

TOOLS:
• k6, Artillery, Locust (load testing)
• wrk, hey (HTTP benchmarking)
• hyperfine (CLI benchmarking)
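
For micro-benchmarks in Python, the standard `timeit` module handles the "many iterations" part. The `bench` helper and the list-versus-set comparison below are assumptions chosen for illustration. Taking the minimum of several repeats reduces noise from other processes.

```python
import timeit

# Hypothetical setup: the same 1000 integers as a list and as a set.
SETUP = "data = list(range(1000)); lookup = set(data)"


def bench(stmt, setup="pass", repeat=5, number=1000):
    """Return the best observed time per execution of stmt, in seconds."""
    times = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return min(times) / number


def compare_membership():
    """Time a worst-case membership test on a list versus a set."""
    list_time = bench("999 in data", setup=SETUP)
    set_time = bench("999 in lookup", setup=SETUP)
    return list_time, set_time
```

Micro-benchmark numbers only rank alternatives on one machine. Confirm any win with a load test against realistic traffic.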

Quick Reference

Performance Checklist

□ Profiled to find actual bottlenecks
□ Set measurable performance targets
□ Optimized hot paths first
□ Added appropriate caching
□ Minimized I/O operations
□ Used connection pooling
□ Indexed frequently queried columns
□ Implemented lazy loading where appropriate
□ Set up performance monitoring
□ Established performance regression tests

Common Performance Wins

| Problem              | Solution                    |
|----------------------|-----------------------------|
| N+1 queries          | Eager loading, batch        |
| Slow queries         | Add indexes, optimize SQL   |
| Large payloads       | Pagination, compression     |
| Repeated computation | Caching, memoization        |
| Synchronous waits    | Async, parallel execution   |
| Cold starts          | Warmup, connection pools    |
| Large bundles        | Code splitting, tree shake  |
| Slow images          | Lazy load, WebP, CDN        |

The Performance Mantras

"Measure first, optimize second"
"The fastest code is code that doesn't run"
"Cache invalidation is hard; TTL is your friend"
"Profile in production, not just development"
"Optimize for the common case"

"Performance is not about making things fast. It's about removing what makes things slow."
