prowler-api by prowler-cloud

Prowler is the world’s most widely used open-source cloud security platform that automates security and compliance across any cloud environment.

Stars: 12,764 · Forks: 1,941 · Updated: Jan 23, 2026

SKILL.md


---
name: prowler-api
description: >
  Prowler API patterns: RLS, RBAC, providers, Celery tasks.
  Trigger: When working in api/ on models/serializers/viewsets/filters/tasks
  involving tenant isolation (RLS), RBAC, or provider lifecycle.
license: Apache-2.0
metadata:
  author: prowler-cloud
  version: "1.2.0"
scope: [root, api]
auto_invoke: "Creating/modifying models, views, serializers"
allowed-tools: Read, Edit, Write, Glob, Grep, Bash, WebFetch, WebSearch, Task
---

When to Use

Use this skill for Prowler-specific patterns:

  • Row-Level Security (RLS) / tenant isolation
  • RBAC permissions and role checks
  • Provider lifecycle and validation
  • Celery tasks with tenant context
  • Multi-database architecture (4-database setup)

For generic DRF patterns (ViewSets, Serializers, Filters, JSON:API), use django-drf skill.


Critical Rules

  • ALWAYS use rls_transaction(tenant_id) when querying outside ViewSet context
  • ALWAYS use get_role() before checking permissions (returns FIRST role only)
  • ALWAYS use @set_tenant then @handle_provider_deletion decorator order
  • ALWAYS use explicit through models for M2M relationships (required for RLS)
  • NEVER access Provider.objects without RLS context in Celery tasks
  • NEVER bypass RLS by using raw SQL or connection.cursor()
  • NEVER use Django's default M2M - RLS requires through models with tenant_id

Note: rls_transaction() accepts both UUID objects and strings - it converts internally via str(value).
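Conceptually, rls_transaction() sets the PostgreSQL api.tenant_id setting that the RLS policies read, scoping every query in the block to one tenant. A simplified, self-contained sketch of that contract (not the real implementation in api/db_utils.py, which operates on the Django connection; the FakeCursor exists only so the sketch runs without PostgreSQL):

```python
import uuid
from contextlib import contextmanager

class FakeCursor:
    """Stand-in for a DB cursor so the sketch runs without PostgreSQL."""
    def __init__(self):
        self.executed = []

    def execute(self, sql, params=None):
        self.executed.append((sql, params))

@contextmanager
def rls_transaction_sketch(cursor, tenant_id):
    # str() is why both UUID objects and plain strings are accepted
    cursor.execute(
        "SELECT set_config('api.tenant_id', %s, true)", [str(tenant_id)]
    )
    yield cursor  # queries issued inside the block are now tenant-scoped

tenant = uuid.UUID("a1b2c3d4-e5f6-4789-8abc-1234567890ab")
cursor = FakeCursor()
with rls_transaction_sketch(cursor, tenant):
    pass  # ORM queries here would see only this tenant's rows
```

The third argument to set_config (true) makes the setting transaction-local, so the tenant scope cannot leak into other requests sharing the connection.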


Architecture Overview

4-Database Architecture

| Database | Alias | Purpose | RLS |
|---|---|---|---|
| default | prowler_user | Standard API queries | Yes |
| admin | admin | Migrations, auth bypass | No |
| replica | prowler_user | Read-only queries | Yes |
| admin_replica | admin | Admin read replica | No |

# When to use admin (bypasses RLS)
from api.db_router import MainRouter
User.objects.using(MainRouter.admin_db).get(id=user_id)  # Auth lookups

# Standard queries use default (RLS enforced)
Provider.objects.filter(connected=True)  # Requires rls_transaction context

RLS Transaction Flow

Request → Authentication → BaseRLSViewSet.initial()
                                    │
                                    ├─ Extract tenant_id from JWT
                                    ├─ SET api.tenant_id = 'uuid' (PostgreSQL)
                                    └─ All queries now tenant-scoped

Implementation Checklist

When implementing Prowler-specific API features:

| # | Pattern | Reference | Key Points |
|---|---|---|---|
| 1 | RLS Models | api/rls.py | Inherit RowLevelSecurityProtectedModel, add constraint |
| 2 | RLS Transactions | api/db_utils.py | Use rls_transaction(tenant_id) context manager |
| 3 | RBAC Permissions | api/rbac/permissions.py | get_role(), get_providers(), Permissions enum |
| 4 | Provider Validation | api/models.py | validate_<provider>_uid() methods on Provider model |
| 5 | Celery Tasks | tasks/tasks.py, api/decorators.py, config/celery.py | Task definitions, decorators (@set_tenant, @handle_provider_deletion), RLSTask base |
| 6 | RLS Serializers | api/v1/serializers.py | Inherit RLSSerializer to auto-inject tenant_id |
| 7 | Through Models | api/models.py | ALL M2M must use explicit through with tenant_id |

Full file paths: See references/file-locations.md


Decision Trees

Which Base Model?

Tenant-scoped data       → RowLevelSecurityProtectedModel
Global/shared data       → models.Model + BaseSecurityConstraint (rare)
Partitioned time-series  → PostgresPartitionedModel + RowLevelSecurityProtectedModel
Soft-deletable           → Add is_deleted + ActiveProviderManager

Which Manager?

Normal queries           → Model.objects (excludes deleted)
Include deleted records  → Model.all_objects
Celery task context      → Must use rls_transaction() first

Which Database?

Standard API queries     → default (automatic via ViewSet)
Read-only operations     → replica (automatic for GET in BaseRLSViewSet)
Auth/admin operations    → MainRouter.admin_db
Cross-tenant lookups     → MainRouter.admin_db (use sparingly!)

Celery Task Decorator Order?

@shared_task(base=RLSTask, name="...", queue="...")
@set_tenant                    # First: sets tenant context
@handle_provider_deletion      # Second: handles deleted providers
def my_task(tenant_id, provider_id):
    pass

RLS Model Pattern

from uuid import uuid4

from django.db import models

from api.rls import RowLevelSecurityProtectedModel, RowLevelSecurityConstraint

class MyModel(RowLevelSecurityProtectedModel):
    # tenant FK inherited from parent
    id = models.UUIDField(primary_key=True, default=uuid4, editable=False)
    name = models.CharField(max_length=255)
    inserted_at = models.DateTimeField(auto_now_add=True, editable=False)
    updated_at = models.DateTimeField(auto_now=True, editable=False)

    class Meta(RowLevelSecurityProtectedModel.Meta):
        db_table = "my_models"
        constraints = [
            RowLevelSecurityConstraint(
                field="tenant_id",
                name="rls_on_%(class)s",
                statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
            ),
        ]

    class JSONAPIMeta:
        resource_name = "my-models"

M2M Relationships (MUST use through models)

class Resource(RowLevelSecurityProtectedModel):
    tags = models.ManyToManyField(
        ResourceTag,
        through="ResourceTagMapping",  # REQUIRED for RLS
    )

class ResourceTagMapping(RowLevelSecurityProtectedModel):
    # Through model MUST have tenant_id for RLS
    resource = models.ForeignKey(Resource, on_delete=models.CASCADE)
    tag = models.ForeignKey(ResourceTag, on_delete=models.CASCADE)

    class Meta(RowLevelSecurityProtectedModel.Meta):
        constraints = [
            RowLevelSecurityConstraint(
                field="tenant_id",
                name="rls_on_%(class)s",
                statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
            ),
        ]

Async Task Response Pattern (202 Accepted)

For long-running operations, return 202 with task reference:

@action(detail=True, methods=["post"], url_name="connection")
def connection(self, request, pk=None):
    with transaction.atomic():
        task = check_provider_connection_task.delay(
            provider_id=pk, tenant_id=self.request.tenant_id
        )
    prowler_task = Task.objects.get(id=task.id)
    serializer = TaskSerializer(prowler_task)
    return Response(
        data=serializer.data,
        status=status.HTTP_202_ACCEPTED,
        headers={"Content-Location": reverse("task-detail", kwargs={"pk": prowler_task.id})}
    )

Providers (11 Supported)

| Provider | UID Format | Example |
|---|---|---|
| AWS | 12 digits | 123456789012 |
| Azure | UUID v4 | a1b2c3d4-e5f6-... |
| GCP | 6-30 chars, lowercase, letter start | my-gcp-project |
| M365 | Valid domain | contoso.onmicrosoft.com |
| Kubernetes | 2-251 chars | arn:aws:eks:... |
| GitHub | 1-39 chars | my-org |
| IaC | Git URL | https://github.com/user/repo.git |
| Oracle Cloud | OCID format | ocid1.tenancy.oc1.. |
| MongoDB Atlas | 24-char hex | 507f1f77bcf86cd799439011 |
| Alibaba Cloud | 16 digits | 1234567890123456 |

Adding new provider: Add to ProviderChoices enum + create validate_<provider>_uid() staticmethod.
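For example, a validator for the 12-digit AWS format in the table above might look like this (hypothetical sketch: the real validate_<provider>_uid() staticmethods live on the Provider model and raise framework validation errors rather than ValueError):

```python
import re

# Exactly 12 ASCII digits, per the AWS row in the UID format table.
AWS_UID_RE = re.compile(r"^[0-9]{12}$")

def validate_aws_uid(uid: str) -> None:
    """Reject anything that is not exactly 12 digits."""
    if not AWS_UID_RE.fullmatch(uid):
        raise ValueError("AWS provider UID must be exactly 12 digits")
```

A new provider would pair a method like this with its entry in the ProviderChoices enum so serializers can dispatch on the provider type.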


RBAC Permissions

| Permission | Controls |
|---|---|
| MANAGE_USERS | User CRUD, role assignments |
| MANAGE_ACCOUNT | Tenant settings |
| MANAGE_BILLING | Billing/subscription |
| MANAGE_PROVIDERS | Provider CRUD |
| MANAGE_INTEGRATIONS | Integration config |
| MANAGE_SCANS | Scan execution |
| UNLIMITED_VISIBILITY | See all providers (bypasses provider_groups) |

RBAC Visibility Pattern

def get_queryset(self):
    user_role = get_role(self.request.user)
    if user_role.unlimited_visibility:
        return Model.objects.filter(tenant_id=self.request.tenant_id)
    else:
        # Filter by provider_groups assigned to role
        return Model.objects.filter(provider__in=get_providers(user_role))

Celery Queues

| Queue | Purpose |
|---|---|
| scans | Prowler scan execution |
| overview | Dashboard aggregations (severity, attack surface) |
| compliance | Compliance report generation |
| integrations | External integrations (Jira, S3, Security Hub) |
| deletion | Provider/tenant deletion (async) |
| backfill | Historical data backfill operations |
| scan-reports | Output generation (CSV, JSON, HTML, PDF) |

Task Composition (Canvas)

Use Celery's Canvas primitives for complex workflows:

| Primitive | Use For |
|---|---|
| chain() | Sequential execution: A → B → C |
| group() | Parallel execution: A, B, C simultaneously |
| Combined | Chain with nested groups for complex workflows |

Note: Use .si() (immutable signature) when a task should not receive the previous task's result; use .s() when it should.

Examples: See assets/celery_patterns.py for chain, group, and combined patterns.


Beat Scheduling (Periodic Tasks)

| Operation | Key Points |
|---|---|
| Create schedule | IntervalSchedule.objects.get_or_create(every=24, period=HOURS) |
| Create periodic task | Use task name (not function), kwargs=json.dumps(...) |
| Delete scheduled task | PeriodicTask.objects.filter(name=...).delete() |
| Avoid race conditions | Use countdown=5 to wait for DB commit |

Examples: See assets/celery_patterns.py for schedule_provider_scan pattern.


Advanced Task Patterns

@set_tenant Behavior

| Mode | tenant_id in kwargs | tenant_id passed to function |
|---|---|---|
| @set_tenant (default) | Popped (removed) | NO - function doesn't receive it |
| @set_tenant(keep_tenant=True) | Read but kept | YES - function receives it |
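The kwargs handling described above can be modelled with a plain decorator. This is a behavioural sketch only, showing the pop-vs-keep contract; the real @set_tenant in api/decorators.py additionally enters the RLS context:

```python
import functools

def set_tenant_sketch(func=None, *, keep_tenant=False):
    """Pop tenant_id from kwargs unless keep_tenant=True (sketch)."""
    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            if keep_tenant:
                tenant_id = kwargs["tenant_id"]      # read but kept
            else:
                tenant_id = kwargs.pop("tenant_id")  # popped (removed)
            # The real decorator would SET api.tenant_id = tenant_id here.
            del tenant_id
            return f(*args, **kwargs)
        return wrapper
    return decorator if func is None else decorator(func)

@set_tenant_sketch
def default_task(**kwargs):
    return kwargs  # tenant_id is gone

@set_tenant_sketch(keep_tenant=True)
def keeping_task(**kwargs):
    return kwargs  # tenant_id still present
```

This is why a task decorated with plain @set_tenant must not declare tenant_id in its signature expecting to receive it.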

Key Patterns

| Pattern | Description |
|---|---|
| bind=True | Access self.request.id, self.request.retries |
| get_task_logger(__name__) | Proper logging in Celery tasks |
| SoftTimeLimitExceeded | Catch to save progress before hard kill |
| countdown=30 | Defer execution by N seconds |
| eta=datetime(...) | Execute at specific time |

Examples: See assets/celery_patterns.py for all advanced patterns.


Celery Configuration

| Setting | Value | Purpose |
|---|---|---|
| BROKER_VISIBILITY_TIMEOUT | 86400 (24h) | Prevent re-queue for long tasks |
| CELERY_RESULT_BACKEND | django-db | Store results in PostgreSQL |
| CELERY_TASK_TRACK_STARTED | True | Track when tasks start |
| soft_time_limit | Task-specific | Raises SoftTimeLimitExceeded |
| time_limit | Task-specific | Hard kill (SIGKILL) |

Full config: See assets/celery_patterns.py and actual files at config/celery.py, config/settings/celery.py.


UUIDv7 for Partitioned Tables

Finding and ResourceFindingMapping use UUIDv7 for time-based partitioning:

from uuid6 import uuid7
from api.uuid_utils import uuid7_start, uuid7_end, datetime_to_uuid7

# Partition-aware filtering
start = uuid7_start(datetime_to_uuid7(date_from))
end = uuid7_end(datetime_to_uuid7(date_to), settings.FINDINGS_TABLE_PARTITION_MONTHS)
queryset.filter(id__gte=start, id__lt=end)

Why UUIDv7? Time-ordered UUIDs enable PostgreSQL to prune partitions during range queries.
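The pruning works because of UUIDv7's layout (RFC 9562): the top 48 bits hold a Unix timestamp in milliseconds, so IDs compare in generation order. A stdlib-only illustration of that property (the real helpers are uuid7_start/uuid7_end in api/uuid_utils.py):

```python
import uuid

def uuid7_lower_bound(unix_ms: int) -> uuid.UUID:
    """A UUID that sorts below every real UUIDv7 for this millisecond:
    timestamp in bits 127..80, version nibble set to 7, the rest zeroed."""
    return uuid.UUID(int=(unix_ms << 80) | (0x7 << 76))

earlier = uuid7_lower_bound(1_700_000_000_000)
later = uuid7_lower_bound(1_700_000_060_000)  # one minute later
# IDs generated later always compare greater, so `id >= start AND id < end`
# behaves like a time-range filter and lets PostgreSQL skip partitions.
```

Random UUIDv4 primary keys would scatter rows across all partitions, forcing every range query to scan every partition.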


Batch Operations with RLS

from api.db_utils import batch_delete, create_objects_in_batches, update_objects_in_batches

# Delete in batches (RLS-aware)
batch_delete(tenant_id, queryset, batch_size=1000)

# Bulk create with RLS
create_objects_in_batches(tenant_id, Finding, objects, batch_size=500)

# Bulk update with RLS
update_objects_in_batches(tenant_id, Finding, objects, fields=["status"], batch_size=500)

Security Patterns

Full examples: See assets/security_patterns.py

Tenant Isolation Summary

| Pattern | Rule |
|---|---|
| RLS in ViewSets | Automatic via BaseRLSViewSet - tenant_id from JWT |
| RLS in Celery | MUST use @set_tenant + rls_transaction(tenant_id) |
| Cross-tenant validation | Defense-in-depth: verify obj.tenant_id == request.tenant_id |
| Never trust user input | Use request.tenant_id from JWT, never request.data.get("tenant_id") |
| Admin DB bypass | Only for cross-tenant admin ops - exposes ALL tenants' data |
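The defense-in-depth check from the tenant isolation rules can be factored into a tiny guard (an illustrative helper, not an existing Prowler function):

```python
def assert_same_tenant(obj_tenant_id, request_tenant_id) -> None:
    """Belt-and-braces check on top of RLS: refuse to serve an object whose
    tenant differs from the JWT-derived request tenant. str() normalises
    UUID objects and strings before comparing."""
    if str(obj_tenant_id) != str(request_tenant_id):
        raise PermissionError("cross-tenant access blocked")
```

In a ViewSet this would run as assert_same_tenant(obj.tenant_id, request.tenant_id) before returning the object, so even an RLS misconfiguration cannot leak another tenant's data.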

Celery Task Security Summary

| Pattern | Rule |
|---|---|
| Named tasks only | NEVER use dynamic task names from user input |
| Validate arguments | Check UUID format before database queries |
| Safe queuing | Use transaction.on_commit() to enqueue AFTER commit |
| Modern retries | Use autoretry_for, retry_backoff, retry_jitter |
| Time limits | Set soft_time_limit and time_limit to prevent hung tasks |
| Idempotency | Use update_or_create or idempotency keys |
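The "validate arguments" rule can be as small as checking UUID shape before any query runs (illustrative helper; the name is not from the Prowler codebase):

```python
import uuid

def require_uuid(value) -> uuid.UUID:
    """Raise early on malformed IDs instead of passing them to the ORM."""
    try:
        return uuid.UUID(str(value))
    except ValueError as exc:
        raise ValueError(f"invalid UUID argument: {value!r}") from exc
```

Calling this at the top of a task body turns a garbage provider_id into an immediate, loggable failure rather than a confusing database error mid-task.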

Quick Reference

# Safe task queuing - task only enqueued after transaction commits
with transaction.atomic():
    provider = Provider.objects.create(**data)
    transaction.on_commit(
        lambda: verify_provider_connection.delay(
            tenant_id=str(request.tenant_id),
            provider_id=str(provider.id)
        )
    )

# Modern retry pattern
@shared_task(
    base=RLSTask,
    bind=True,
    autoretry_for=(ConnectionError, TimeoutError, OperationalError),
    retry_backoff=True,
    retry_backoff_max=600,
    retry_jitter=True,
    max_retries=5,
    soft_time_limit=300,
    time_limit=360,
)
@set_tenant
def sync_provider_data(self, tenant_id, provider_id):
    with rls_transaction(tenant_id):
        # ... task logic
        pass

# Idempotent task - safe to retry
@shared_task(base=RLSTask, acks_late=True)
@set_tenant
def process_finding(tenant_id, finding_uid, data):
    with rls_transaction(tenant_id):
        Finding.objects.update_or_create(uid=finding_uid, defaults=data)

Production Deployment Checklist

Full settings: See references/production-settings.md

Run before every production deployment:

cd api && poetry run python src/backend/manage.py check --deploy

Critical Settings

| Setting | Production Value | Risk if Wrong |
|---|---|---|
| DEBUG | False | Exposes stack traces, settings, SQL queries |
| SECRET_KEY | Env var, rotated | Session hijacking, CSRF bypass |
| ALLOWED_HOSTS | Explicit list | Host header attacks |
| SECURE_SSL_REDIRECT | True | Credentials sent over HTTP |
| SESSION_COOKIE_SECURE | True | Session cookies over HTTP |
| CSRF_COOKIE_SECURE | True | CSRF tokens over HTTP |
| SECURE_HSTS_SECONDS | 31536000 (1 year) | Downgrade attacks |
| CONN_MAX_AGE | 60 or higher | Connection pool exhaustion |
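In settings-module form, the table above translates to something like this (a sketch of env-driven values using standard Django setting names; Prowler's actual config/ layout and variable names may differ):

```python
import os

DEBUG = os.environ.get("DJANGO_DEBUG", "False") == "True"  # False in prod
SECRET_KEY = os.environ["DJANGO_SECRET_KEY"]               # fail fast if unset
ALLOWED_HOSTS = os.environ.get("DJANGO_ALLOWED_HOSTS", "").split(",")

SECURE_SSL_REDIRECT = True
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True
SECURE_HSTS_SECONDS = 31536000  # 1 year

# Persist DB connections to avoid pool exhaustion (merged with the full
# DATABASES config, which also carries engine and credentials).
DATABASES = {"default": {"CONN_MAX_AGE": 60}}
```

Indexing os.environ directly for SECRET_KEY makes a missing secret a startup error instead of a silently insecure default.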

Commands

# Development
cd api && poetry run python src/backend/manage.py runserver
cd api && poetry run python src/backend/manage.py shell

# Celery
cd api && poetry run celery -A config.celery worker -l info -Q scans,overview
cd api && poetry run celery -A config.celery beat -l info

# Testing
cd api && poetry run pytest -x --tb=short

# Production checks
cd api && poetry run python src/backend/manage.py check --deploy

Resources

Local References

  • Generic DRF Patterns: Use django-drf skill
  • API Testing: Use prowler-test-api skill

Prerequisite: Install Context7 MCP server for up-to-date documentation lookup.

When implementing or debugging Prowler-specific patterns, query these libraries via mcp_context7_query-docs:

| Library | Context7 ID | Use For |
|---|---|---|
| Celery | /websites/celeryq_dev_en_stable | Task patterns, queues, error handling |
| django-celery-beat | /celery/django-celery-beat | Periodic task scheduling |
| Django | /websites/djangoproject_en_5_2 | Models, ORM, constraints, indexes |

Example queries:

mcp_context7_query-docs(libraryId="/websites/celeryq_dev_en_stable", query="shared_task decorator retry patterns")
mcp_context7_query-docs(libraryId="/celery/django-celery-beat", query="periodic task database scheduler")
mcp_context7_query-docs(libraryId="/websites/djangoproject_en_5_2", query="model constraints CheckConstraint UniqueConstraint")

Note: Use mcp_context7_resolve-library-id first if you need to find the correct library ID.

Reviews coming soon