# databricks-jobs

by mats16 (Claude Code on Databricks, updated January 23, 2026)

## SKILL.md


---
name: databricks-jobs
description: |
  Databricks Jobs management, monitoring, and debugging. Use for job status
  checks, run history, failure investigation, job execution, and historical
  analysis. Triggers: job status, run history, job failure, debug job, run job,
  cancel job, repair run, job metrics, failure analysis. Real-time operations
  use the CLI (databricks jobs). Historical analysis uses System Tables
  (system.lakeflow.*).
metadata:
  version: 1.0.0
---

# Databricks Jobs

## Quick Reference

Extract `job_id` and `run_id` from URLs first (see Extracting IDs from URLs).

| Operation | Tool | Command/Table |
|---|---|---|
| Real-time status | CLI | `databricks jobs get-run <run_id>` |
| List runs | CLI | `databricks jobs list-runs --job-id <job_id>` |
| Run job | CLI | `databricks jobs run-now <job_id>` |
| Cancel run | CLI | `databricks jobs cancel-run <run_id>` |
| Repair failed | CLI | `databricks jobs repair-run <run_id> --rerun-all-failed-tasks` |
| Historical analysis | SQL | `system.lakeflow.job_run_timeline` |

## Extracting IDs from URLs

```
https://<host>/jobs/<job_id>
https://<host>/jobs/<job_id>/runs/<run_id>
```

Example: `https://example.cloud.databricks.com/jobs/987402714328091/runs/304618225028273`

- `job_id`: `987402714328091`
- `run_id`: `304618225028273`
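
The extraction can be scripted. A minimal `sed`-based sketch, using the example URL above (the variable names are illustrative, not part of the skill):

```shell
# Extract job_id and run_id from a Databricks job-run URL (sketch).
url="https://example.cloud.databricks.com/jobs/987402714328091/runs/304618225028273"

# job_id: the digits following /jobs/
job_id=$(printf '%s\n' "$url" | sed -n 's#.*/jobs/\([0-9]*\).*#\1#p')
# run_id: the digits following /runs/ (empty if the URL has no run segment)
run_id=$(printf '%s\n' "$url" | sed -n 's#.*/runs/\([0-9]*\).*#\1#p')

echo "job_id=$job_id run_id=$run_id"
```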

## CLI Command Syntax

**Critical:** Some commands use positional args; others require flags.

| Command | Syntax | Example |
|---|---|---|
| `jobs get` | Positional | `databricks jobs get 123` |
| `jobs get-run` | Positional | `databricks jobs get-run 456` |
| `jobs get-run-output` | Positional | `databricks jobs get-run-output 456` |
| `jobs list-runs` | Flag required | `databricks jobs list-runs --job-id 123` |
| `jobs run-now` | Positional | `databricks jobs run-now 123` |
| `jobs cancel-run` | Positional | `databricks jobs cancel-run 456` |
| `jobs repair-run` | Positional | `databricks jobs repair-run 456 --rerun-all-failed-tasks` |

### Common Mistakes

| Wrong | Correct |
|---|---|
| `databricks jobs get --job-id 123` | `databricks jobs get 123` |
| `databricks jobs get-run --run-id 456` | `databricks jobs get-run 456` |
| `databricks jobs list-runs 123` | `databricks jobs list-runs --job-id 123` |

## Core Workflows

### Check Run Status

```bash
databricks jobs get-run <run_id>
```

Key fields:
- `state.result_state` - Result (SUCCESS, FAILED, TIMED_OUT, CANCELED)
- `state.state_message` - Error message
- `tasks[]` - Status of each task

### Investigate Failed Run

```bash
# Step 1: Get run overview
databricks jobs get-run <run_id> -o json | jq '.state'

# Step 2: Find failed tasks
databricks jobs get-run <run_id> -o json | jq '.tasks[] | select(.state.result_state == "FAILED") | {task_key, state}'

# Step 3: Get error details
databricks jobs get-run-output <run_id> -o json | jq '{error, error_trace}'

# Step 4: Repair (rerun failed tasks)
databricks jobs repair-run <run_id> --rerun-all-failed-tasks
```

For detailed troubleshooting: See [Troubleshooting Guide](references/troubleshooting.md)

### List and Search Jobs

```bash
# List all jobs
databricks jobs list
# Search by name
databricks jobs list -o json | jq '.jobs[] | select(.settings.name | contains("keyword"))'

# Get job definition
databricks jobs get <job_id>
```

### List Runs

```bash
# All runs for a job
databricks jobs list-runs --job-id <job_id>
# Active runs only
databricks jobs list-runs --job-id <job_id> --active-only
```

### Execute Jobs

```bash
# Run immediately
databricks jobs run-now <job_id>
# Run with parameters
databricks jobs run-now <job_id> --notebook-params '{"param1": "value1"}'
databricks jobs run-now <job_id> --python-params '["arg1", "arg2"]'
databricks jobs run-now <job_id> --jar-params '["arg1", "arg2"]'
```
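
After triggering a run, `state.result_state` is absent until the run reaches a terminal state, so scripts typically poll. A minimal sketch, assuming the `databricks` CLI is authenticated and `jq` is installed (`get_run_json` is a wrapper introduced here for illustration, not a CLI command):

```shell
# Poll a run until it reports a terminal result_state (sketch).
get_run_json() {
  # Thin wrapper around the CLI call; override or stub as needed.
  databricks jobs get-run "$1" -o json
}

poll_run_state() {
  run_id="$1"
  while true; do
    # result_state only appears once the run has finished.
    state=$(get_run_json "$run_id" | jq -r '.state.result_state // empty')
    if [ -n "$state" ]; then
      echo "$state"
      return 0
    fi
    sleep 10
  done
}
```

Usage: `poll_run_state <run_id>` prints the terminal state (e.g. `SUCCESS` or `FAILED`) once the run completes.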

### Cancel and Repair

```bash
# Cancel a run
databricks jobs cancel-run <run_id>

# Rerun all failed tasks
databricks jobs repair-run <run_id> --rerun-all-failed-tasks
# Rerun specific tasks
databricks jobs repair-run <run_id> --rerun-tasks '["task_key1", "task_key2"]'
```

### Multi-task Job Investigation

```bash
# List task definitions
databricks jobs get <job_id> -o json | jq '.settings.tasks[] | {task_key, description}'

# Check task states in a run
databricks jobs get-run <run_id> -o json | jq '.tasks[] | {task_key, state, start_time, end_time}'

# Get task dependencies
databricks jobs get <job_id> -o json | jq '.settings.tasks[] | {task_key, depends_on}'
```

### Check Schedule and Triggers

```bash
# Schedule settings
databricks jobs get <job_id> -o json | jq '.settings.schedule'

# Trigger settings (file arrival, etc.)
databricks jobs get <job_id> -o json | jq '.settings.trigger'
```

## System Tables (Historical Analysis)

Use for trend analysis and metrics. Data may be delayed by several hours.

| Table | Purpose |
|---|---|
| `system.lakeflow.jobs` | Job definitions |
| `system.lakeflow.job_run_timeline` | Run history |
| `system.lakeflow.job_task_run_timeline` | Task run details |

### Quick Queries

```sql
-- Failed runs in the last 24 hours
SELECT j.name, r.run_id, r.result_state, r.period_start_time
FROM system.lakeflow.job_run_timeline r
JOIN system.lakeflow.jobs j USING (job_id)
WHERE r.result_state = 'FAILED'
  AND r.period_start_time >= CURRENT_DATE - INTERVAL 1 DAY
ORDER BY r.period_start_time DESC;

-- Run history for a specific job
SELECT run_id, result_state, period_start_time,
  TIMESTAMPDIFF(MINUTE, period_start_time, period_end_time) AS duration_min
FROM system.lakeflow.job_run_timeline
WHERE job_id = <job_id>  -- Use job_id extracted from URL
ORDER BY period_start_time DESC
LIMIT 20;
```
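
The same tables support aggregate metrics. A sketch of a per-job failure-rate query over the last 7 days, assuming the columns used above (verify against the system table schema in your workspace before relying on it):

```sql
-- Failure rate per job over the last 7 days (sketch)
SELECT j.name,
       COUNT(*) AS total_runs,
       SUM(CASE WHEN r.result_state = 'FAILED' THEN 1 ELSE 0 END) AS failed_runs,
       ROUND(100.0 * SUM(CASE WHEN r.result_state = 'FAILED' THEN 1 ELSE 0 END)
             / COUNT(*), 1) AS failure_pct
FROM system.lakeflow.job_run_timeline r
JOIN system.lakeflow.jobs j USING (job_id)
WHERE r.period_start_time >= CURRENT_DATE - INTERVAL 7 DAY
  AND r.result_state IS NOT NULL  -- terminal runs only
GROUP BY j.name
ORDER BY failure_pct DESC;
```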

For comprehensive historical analysis queries: See System Tables Reference

## result_state Reference

| State | Description |
|---|---|
| `SUCCESS` | Completed successfully |
| `FAILED` | Failed with error |
| `TIMED_OUT` | Exceeded timeout |
| `CANCELED` | Canceled by user/system |
| `RUNNING` | Currently executing |
| `PENDING` | Waiting to run |
| `SKIPPED` | Skipped (dependency failed) |
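
In CI scripts it is common to map these states to exit codes. A minimal sketch (the mapping is a convention chosen here, not part of the CLI):

```shell
# Map a result_state to an exit code for scripting (sketch).
state_to_exit_code() {
  case "$1" in
    SUCCESS)           echo 0 ;;
    FAILED|TIMED_OUT)  echo 1 ;;
    CANCELED)          echo 2 ;;
    # RUNNING, PENDING, SKIPPED, or anything unexpected: not a terminal success
    *)                 echo 3 ;;
  esac
}
```

Usage: `exit "$(state_to_exit_code "$state")"` at the end of a monitoring script.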

## References
