Back to list
aiskillstore

backend-hang-debug

by aiskillstore

Security-audited skills for Claude, Codex & Claude Code. One-click install, quality verified.

102🍴 3📅 Jan 23, 2026

SKILL.md


name: backend-hang-debug description: Diagnose and fix FastAPI hangs caused by blocking ThreadPoolExecutor shutdown in the news stream route; includes py-spy capture and non-blocking executor pattern.

Backend Hang Debug

Purpose

  • Detect and resolve event-loop hangs where the FastAPI app stops responding (e.g., curl http://localhost:8000/ times out) due to synchronous executor shutdown in the SSE news stream.
  • Provide a repeatable triage flow using py-spy to capture live stacks and pinpoint blocking code.

Scope

  • Backend: backend/app/api/routes/stream.py (news stream), backend/app/services/rss_ingestion.py (RSS workers), startup processes.
  • Tooling: py-spy for live stack dumps; curl with timeouts for smoke tests.

Quick Triage

  1. Reproduce hang: curl -m 5 http://localhost:8000/ and curl -m 5 http://localhost:8000/health; note timeouts.
  2. Process check: ss -tlnp | grep 8000 to confirm listener; ls /proc/$(pgrep -f "uvicorn app.main")/fd | wc -l to rule out FD leak.
  3. Stack capture (inside backend venv): uv pip install py-spy then sudo /home/bender/classwork/Thesis/backend/.venv/bin/py-spy dump --pid $(pgrep -f "uvicorn app.main") (and worker pid if multiprocess). Look for ThreadPoolExecutor.shutdown in api/routes/stream.py frames.

Fix Pattern (non-blocking executor)

  • Replace synchronous context manager with ThreadPoolExecutor(...): inside event_generator with a long-lived executor plus explicit non-blocking shutdown:
    • Create executor outside the context manager.
    • On client disconnect, cancel pending futures instead of awaiting shutdown.
    • In finally, call executor.shutdown(wait=False, cancel_futures=True).
  • Rationale: context manager calls shutdown(wait=True), blocking the event loop if RSS worker threads hang on network I/O.

Implementation Steps

  1. Update stream executor usage in backend/app/api/routes/stream.py:
    • Instantiate executor = concurrent.futures.ThreadPoolExecutor(max_workers=5).
    • Dispatch work via loop.run_in_executor(executor, _process_source_with_debug, ...).
    • On disconnect, cancel() pending futures.
    • In finally, executor.shutdown(wait=False, cancel_futures=True).
  2. Keep RSS executor as-is (rss_ingestion.py) since it runs in background threads, but ensure request timeouts remain reasonable (currently 60s per RSS requests.get).
  3. Retest:
    • Restart uvicorn; curl -m 5 http://localhost:8000/health should respond.
    • Start a stream request and abort the client; server must stay responsive.
    • Re-run py-spy dump to verify no ThreadPoolExecutor.shutdown(wait=True) frames in main thread.

Verification Checklist

  • curl -m 5 http://localhost:8000/ returns a response (no hang).
  • curl -m 5 http://localhost:8000/health succeeds.
  • Aborting /news/stream does not freeze subsequent requests.
  • py-spy dump shows event loop not blocked on ThreadPoolExecutor.shutdown.
  • Frontend no longer stalls waiting on root/health while backend is busy with streams.

Notes & Future Hardening

  • Consider adding request timeout middleware to fail fast on slow handlers.
  • Add per-source network timeouts and shorter retries for RSS feeds to reduce long-lived threads.
  • If multi-worker uvicorn is used, run py-spy on each worker pid when diagnosing hangs.

Score

Total Score

60/100

Based on repository quality metrics

SKILL.md

SKILL.mdファイルが含まれている

+20
LICENSE

ライセンスが設定されている

0/10
説明文

100文字以上の説明がある

0/10
人気

GitHub Stars 100以上

+5
最近の活動

1ヶ月以内に更新

+10
フォーク

10回以上フォークされている

0/5
Issue管理

オープンIssueが50未満

+5
言語

プログラミング言語が設定されている

+5
タグ

1つ以上のタグが設定されている

+5

Reviews

💬

Reviews coming soon