
trust-safety
by SylphxAI
AI development platform with MEP architecture - stop writing prompts, start building with 90% less typing
---
name: trust-safety
description: Trust and safety - abuse prevention, rate limiting. Use when fighting bad actors.
---
# Trust & Safety Guideline
## Tech Stack
- Analytics: PostHog
- Database: Neon (Postgres)
- Workflows: Upstash Workflows + QStash
## Non-Negotiables
- All enforcement actions must be auditable (who/when/why)
- Appeals process must exist for affected users
- Graduated response levels must be defined (warn → restrict → suspend → ban)
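The graduated ladder and audit fields above can be sketched as data. This is a minimal illustration, not a prescribed schema; the names (`EnforcementLevel`, `EnforcementAction`, `escalate`) are hypothetical.

```typescript
// Graduated enforcement levels, lowest to highest severity.
type EnforcementLevel = "warn" | "restrict" | "suspend" | "ban";

const LADDER: EnforcementLevel[] = ["warn", "restrict", "suspend", "ban"];

// Every enforcement action records who/when/why, so it can be audited
// and surfaced during an appeal.
interface EnforcementAction {
  userId: string;          // who was actioned
  level: EnforcementLevel; // what was done
  reason: string;          // why (human-readable, shown on appeal)
  actor: string;           // who/what acted, e.g. "auto:rate-limit" or "admin:jane"
  createdAt: Date;         // when
}

// Move one rung up the ladder; a ban is terminal.
function escalate(current: EnforcementLevel): EnforcementLevel {
  const i = LADDER.indexOf(current);
  return LADDER[Math.min(i + 1, LADDER.length - 1)];
}
```

In a Neon/Postgres setup, `EnforcementAction` would map to an append-only table so the audit trail cannot be silently rewritten.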
## Context
Trust & safety is about protecting users, both from each other and from malicious actors. Every platform eventually attracts abuse; the question is whether you're prepared for it or scrambling to react.
Consider: what would a bad actor try to do? How would we detect it? How would we respond? And what about the false positives, the innocent users caught by automated systems? A good T&S system is effective against abuse and fair to legitimate users.
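Detection often starts with per-actor rate limits. Below is a minimal in-memory sliding-window sketch; in this stack production traffic would be limited against shared state (e.g. Upstash Redis) rather than process memory, and the class and method names are illustrative assumptions.

```typescript
// Sliding-window rate limiter: allow at most `limit` hits per `windowMs`
// for each key (user id, IP, API token, ...).
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if this request is allowed, false if the key is over limit.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only timestamps still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false; // over limit: a candidate for the warn/restrict ladder
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

Repeated `false` results for the same key are a signal to feed into graduated enforcement rather than an immediate ban, which keeps false positives cheap to reverse.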
## Driving Questions
- What would a motivated bad actor try to do on this platform?
- How would we detect coordinated abuse or bot networks?
- What happens when automated moderation gets it wrong?
- How do affected users appeal decisions, and is it fair?
- What abuse patterns exist that we haven't addressed?
- What would make users trust that we're protecting them?
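One way to make the appeals question concrete is to model the appeal lifecycle as a tiny state machine, so every decision is an explicit, auditable transition. The state names and transitions here are assumptions for illustration, not a prescribed design.

```typescript
// Possible states of an appeal against an enforcement action.
type AppealState = "open" | "under_review" | "upheld" | "overturned";

// Which transitions are legal from each state.
const TRANSITIONS: Record<AppealState, AppealState[]> = {
  open: ["under_review"],
  under_review: ["upheld", "overturned"],
  upheld: [],      // enforcement stands
  overturned: [],  // enforcement reversed; the action must be rolled back
};

// Apply a transition, rejecting anything outside the defined edges.
function transition(from: AppealState, to: AppealState): AppealState {
  if (!TRANSITIONS[from].includes(to)) {
    throw new Error(`invalid appeal transition: ${from} -> ${to}`);
  }
  return to;
}
```

Each applied transition would be logged alongside the original `who/when/why` audit record, so reviewers and affected users see the same history.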

