stata-mcp

Name: stata-mcp
Rating: 75
Author: tmonk

A lightweight Model Context Protocol (MCP) server for Stata. Execute commands, inspect data, retrieve stored results (r()/e()), and view graphs in your chat interface. Built for economists who want to integrate LLM assistance into their Stata workflow.

⭐ 15 · 🍴 1 · 📅 Jan 23, 2026

Tags: ai, econometrics, economics, llm, mcp, mcp-server, stata

SKILL.md

name: stata-mcp
description: Run or debug Stata workflows through the local io.github.tmonk/mcp-stata server. Use when users mention Stata commands, .do files, r()/e() results, dataset inspection, Stata graph exports, or data browsing with sorting/filtering.

Stata MCP Skill

Instructions

Ensure the stata MCP server is registered (see project README for config) and request it if not already active.
When the user asks for Stata work:
- Use run_command for ad-hoc syntax (trace=True for call stacks, raw=True for plain output).
- Use load_data before analyses that require datasets.
- Use get_data, describe, codebook, or get_variable_list to inspect data.
- Use run_do_file for provided .do scripts.
- Use export_graph/export_graphs_all for visualization requests.
- Use get_help when the user wants Stata documentation.
- Use get_stored_results to return r()/e() scalars/macros after commands for validation.
- Use read_log to tail or retrieve output from long-running commands.
- Use get_ui_channel to obtain a localhost HTTP endpoint for high-volume data browsing.
Surface rc/stderr info back to the user, referencing r()/e() codes.
If Stata isn't auto-discovered, remind the user to set STATA_PATH (examples in README).

Tool quick reference

Command Execution

run_command(code, echo=True, as_json=True, trace=False, raw=False, max_output_lines=None): Run Stata syntax.
- code: The Stata command(s) to execute.
- echo: Include the command itself in output (default: True).
- as_json: Return JSON envelope with rc/stdout/stderr/error (default: True).
- trace: Enable set trace on for deeper error diagnostics (default: False).
- raw: Return plain stdout/error message instead of JSON (default: False).
- max_output_lines: Truncate output to this many lines (default: None for no truncation).
- Note: Always writes output to a temporary log file and emits a notifications/logMessage with {"event":"log_path","path":"..."} so the client can tail it locally.
run_do_file(path, echo=True, as_json=True, trace=False, raw=False, max_output_lines=None): Execute .do files.
- path: Path to the .do file.
- echo: Include commands in output (default: True).
- as_json: Return JSON envelope (default: True).
- trace: Enable trace mode for debugging (default: False).
- raw: Return plain output instead of JSON (default: False).
- max_output_lines: Truncate output to this many lines (default: None).
- Note: Always writes output to a temporary log file and emits incremental notifications/progress when the client provides a progress token/callback.
read_log(path, offset=0, max_bytes=65536): Read a slice of a previously-provided log file.
- path: Path to the log file (from notifications/logMessage).
- offset: Byte offset to start reading from (default: 0).
- max_bytes: Maximum bytes to read (default: 65536).
- Returns JSON: path, offset, next_offset, data.

Data Loading & Inspection

load_data(source, clear=True, as_json=True, raw=False, max_output_lines=None): Load data using sysuse/webuse/use heuristics.
- source: Dataset name, URL, or file path (e.g., "auto", "webuse nlsw88", "/path/to/file.dta").
- clear: Append , clear to replace existing data (default: True).
- as_json: Return JSON envelope (default: True).
- raw: Return plain output (default: False).
- max_output_lines: Truncate output to this many lines (default: None).
- Note: After loading, use the UI channel for advanced filtering/sorting at scale.
get_data(start=0, count=50): Retrieve a slice of the active dataset as JSON.
- start: Zero-based index of first observation (default: 0).
- count: Number of observations to retrieve (default: 50, max: 500).
- Note: For advanced sorting/filtering at scale, use the UI channel endpoints (see get_ui_channel()).
describe(): Return variable descriptions, storage types, and labels.
get_variable_list(): Return JSON list of all variables with names, labels, and types.
codebook(variable, as_json=True, trace=False, raw=False, max_output_lines=None): Return codebook/summary for a specific variable.
- variable: Variable name to describe.
- as_json: Return JSON envelope (default: True).
- trace: Enable trace mode (default: False).
- raw: Return plain output (default: False).
- max_output_lines: Truncate output to this many lines (default: None).
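
Because get_data returns at most 500 observations per call, reading a larger dataset over MCP means paging with start and count. The following is a minimal sketch of that loop. It assumes get_data returns a list of row records, which the reference above does not specify, and fake_get_data is a hypothetical stand-in for the real tool.

```python
from typing import Callable, Dict, Iterator, List

MAX_COUNT = 500  # Documented per-call ceiling for get_data.


def iter_rows(get_data: Callable[..., List[Dict]],
              page_size: int = 200) -> Iterator[Dict]:
    """Walk the active dataset page by page until a short page signals the end."""
    page_size = min(page_size, MAX_COUNT)
    start = 0
    while True:
        rows = get_data(start=start, count=page_size)
        yield from rows
        if len(rows) < page_size:
            break  # A short (or empty) page means the data is exhausted.
        start += page_size


# Hypothetical stand-in: a 1,234-observation dataset held in memory.
_DATA = [{"obs": i} for i in range(1234)]


def fake_get_data(start: int = 0, count: int = 50) -> List[Dict]:
    return _DATA[start:start + min(count, MAX_COUNT)]


# A requested page size of 1000 is clipped to the 500-row ceiling.
rows = list(iter_rows(fake_get_data, page_size=1000))
```

For anything beyond a few pages, the UI channel is likely the better fit, since it avoids sending large payloads over MCP.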

Graph Management

list_graphs(): List all graphs in Stata's memory with active graph marked.
- Note: Graphs are automatically cached during command execution for instant exports.
export_graph(graph_name=None, format="pdf"): Export a stored graph to file.
- graph_name: Name of graph to export (from list_graphs); if None, exports active graph.
- format: Output format, either "pdf" (default) or "png". Use "png" to view plots directly.
export_graphs_all(): Export all graphs in memory. Returns file paths.

Help & Results

get_help(topic, plain_text=False): Return Stata help text.
- topic: Command or help topic (e.g., "regress", "graph").
- plain_text: Return plain text instead of Markdown (default: False).
get_stored_results(): Return current r() and e() results as JSON after a command.

UI Data Browser

get_ui_channel(): Return a short-lived localhost HTTP endpoint + bearer token for the UI-only data browser.
- Returns JSON with baseUrl, token, expiresAt, and capabilities.
- Intended for the VS Code extension UI to browse data at high volume (paging, filtering, sorting) without sending large payloads over MCP.
- Loopback only (binds to 127.0.0.1), requires bearer auth.
- Key endpoints (all require Authorization: Bearer <token> header):
  - GET /v1/dataset: Dataset identity and state
  - GET /v1/vars: Variable metadata
  - POST /v1/page: Page data with optional sorting (sortBy parameter)
  - POST /v1/arrow: Binary Arrow IPC stream
  - POST /v1/views: Create filtered view
  - POST /v1/views/:viewId/page: Page within filtered view (supports sorting)
  - POST /v1/views/:viewId/arrow: Arrow stream from filtered view
  - DELETE /v1/views/:viewId: Delete view
  - POST /v1/filters/validate: Validate filter expression
- Sorting: Use sortBy array in page requests (e.g., ["price"] for ascending, ["-price"] for descending, ["foreign", "-price"] for multi-level)
- Filtering: Filter expressions use Python-style comparison and boolean operators (==, !=, <, >, and, or); Stata-style &/| are also accepted
- Server limits: maxLimit=500, maxVars=32767, maxChars=500, maxRequestBytes=1000000, maxArrowLimit=1000000
- Dataset tracking: datasetId used for cache invalidation; changing dataset invalidates view handles
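
To make the sortBy convention concrete, here is a local illustration of how a spec such as ["foreign", "-price"] orders rows. It is only a sketch of the semantics described above, not the server's implementation. The apply_sort_by helper and the sample rows are hypothetical, and the handling of missing values is not covered.

```python
from typing import Dict, List


def apply_sort_by(rows: List[Dict], sort_by: List[str]) -> List[Dict]:
    """Sort rows by a sortBy-style spec; a leading '-' means descending."""
    result = list(rows)
    # Apply keys from last to first. Python's sort is stable, so keys that
    # appear earlier in the spec end up taking precedence.
    for key in reversed(sort_by):
        descending = key.startswith("-")
        name = key[1:] if descending else key
        result.sort(key=lambda row: row[name], reverse=descending)
    return result


# Made-up rows loosely modelled on the auto dataset's variables.
rows = [
    {"make": "A", "foreign": 0, "price": 4000},
    {"make": "B", "foreign": 1, "price": 6000},
    {"make": "C", "foreign": 0, "price": 9000},
    {"make": "D", "foreign": 1, "price": 3000},
]

# Domestic cars first, most expensive first within each group.
ordered = [r["make"] for r in apply_sort_by(rows, ["foreign", "-price"])]
```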

Cancellation

Clients may cancel an in-flight request by sending the MCP notification notifications/cancelled with params.requestId set to the original tool call ID.
Pass a _meta.progressToken when invoking the tool if you want progress updates (optional).
Cancellation is best-effort and depends on Stata surfacing BreakError.
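
As a sketch of what the cancellation message looks like on the wire, the snippet below builds the JSON-RPC notification described above. The cancelled_notification helper is hypothetical; the optional reason field comes from the MCP specification rather than from this skill.

```python
import json
from typing import Optional, Union


def cancelled_notification(request_id: Union[str, int],
                           reason: Optional[str] = None) -> str:
    """Serialize an MCP notifications/cancelled message for a pending request."""
    params = {"requestId": request_id}
    if reason is not None:
        params["reason"] = reason
    # Notifications carry no "id" field, so the server sends no response.
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "notifications/cancelled",
        "params": params,
    })


# Cancel the tool call that was issued with request ID 7.
message = cancelled_notification(7, reason="user pressed stop")
```

Most MCP client libraries send this notification for you when a pending call is aborted, so building it by hand is mainly useful for debugging.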

Error Reporting

All tools executing Stata commands support JSON envelopes (as_json=true) containing:
- rc: Return code from r()/c(rc)
- stdout: Standard output
- stderr: Standard error (captures "red text")
- message: Error message
- line: Line number (when Stata reports it)
- command: The command that was executed
- log_path: Path to log file for streaming (when applicable)
- snippet: Excerpt of error output
Stata-specific error codes (r(XXX)) are parsed and preserved.
Use trace=true to enable set trace on for detailed diagnostics of errors raised inside programs.
Set the MCP_STATA_LOGLEVEL environment variable (e.g., DEBUG, INFO) to control server logging.
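
One way to surface these fields to a user is to collapse the envelope into a short status line. The sketch below assumes the fields sit at the top level of the parsed JSON and that rc is 0 on success, following the Stata convention. If the server nests them under an error key, the lookups would need adjusting. The summarize_envelope helper and the example values are hypothetical.

```python
from typing import Dict


def summarize_envelope(envelope: Dict) -> str:
    """Turn a JSON envelope into a one-line status for the user."""
    rc = envelope.get("rc", 0)
    if rc == 0:
        return "OK"
    parts = [f"Stata error r({rc})"]
    if envelope.get("line") is not None:
        parts.append(f"at line {envelope['line']}")
    if envelope.get("command"):
        parts.append(f"while running `{envelope['command']}`")
    text = " ".join(parts)
    # Prefer the parsed message, then fall back to rawer fields.
    detail = (envelope.get("message") or envelope.get("snippet")
              or envelope.get("stderr"))
    return f"{text}: {detail}" if detail else text


# Field names come from the list above; the values are made up.
example = {
    "rc": 111,
    "stdout": "",
    "stderr": "variable mpgg not found",
    "message": "variable mpgg not found",
    "command": "regress price mpgg",
}
summary = summarize_envelope(example)
```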

MCP Resources

The server exposes these resources for MCP clients:

stata://data/summary → summarize
stata://data/metadata → describe
stata://graphs/list → graph list
stata://variables/list → variable list
stata://results/stored → stored r()/e() results
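
For clients that speak MCP directly, a resource is fetched with a resources/read request carrying its URI. The sketch below serializes such a request. The short keys in STATA_RESOURCES and the read_resource_request helper are hypothetical conveniences; only the URIs come from the list above.

```python
import json

# URIs copied from the resource list above; the short keys are illustrative.
STATA_RESOURCES = {
    "summary": "stata://data/summary",
    "metadata": "stata://data/metadata",
    "graphs": "stata://graphs/list",
    "variables": "stata://variables/list",
    "results": "stata://results/stored",
}


def read_resource_request(request_id: int, name: str) -> str:
    """Serialize an MCP resources/read request for one of the Stata resources."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": STATA_RESOURCES[name]},
    })


# Ask for the stored r()/e() results.
request_json = read_resource_request(1, "results")
```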

Graph review workflow

Call list_graphs() to see available plots and identify the active graph.
Use export_graphs_all() to fetch file paths for every graph; view them directly in the client.
For a single plot, call export_graph(graph_name="GraphName", format="png") to get a viewable file.
Compare the rendered PNGs to the user spec (titles, axes labels, legends, colors, filters); state whether the graph matches and what to change.

Examples

Run a regression

# Load sample data and run regression
load_data("auto")
run_command("regress price mpg")
get_stored_results()  # Retrieve coefficients and statistics

Export a histogram

# Create and export a graph
run_command("histogram price")
list_graphs()  # Confirm graph exists
export_graph(graph_name="Graph", format="png")  # Export for viewing

Debug a do-file

run_do_file("/path/to/analysis.do", trace=True)

Inspect data structure

load_data("nlsw88", clear=True)
describe()
get_variable_list()
codebook("wage")
get_data(start=0, count=10)

Read log output from long-running command

# After run_command emits a log_path notification
read_log("/tmp/stata_log_abc123.log", offset=0)
# Continue from the next_offset returned by the previous call (4096 is illustrative)
read_log("/tmp/stata_log_abc123.log", offset=4096)

Advanced data browsing with sorting and filtering

# Get UI channel for high-volume data operations
get_ui_channel()  # Returns baseUrl, token, expiresAt

# Example UI channel usage (requires HTTP client):
# POST {baseUrl}/v1/page with Authorization: Bearer {token}
# Body: {"datasetId":"...","offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["-price"]}

# Create filtered view for price < 5000
# POST {baseUrl}/v1/views
# Body: {"datasetId":"...","frame":"default","filterExpr":"price < 5000"}

# Page through filtered view with sorting
# POST {baseUrl}/v1/views/{viewId}/page
# Body: {"offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["-price"]}
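
The commented HTTP calls above can be sketched with Python's standard library. This example only builds the request object and does not send it, so it runs without a live server. The build_page_request helper, the port, the token, and the datasetId values are hypothetical, and the Content-Type header is an assumption about what the server expects.

```python
import json
import urllib.request
from typing import Dict


def build_page_request(channel: Dict, body: Dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/page request for the UI channel."""
    return urllib.request.Request(
        url=channel["baseUrl"].rstrip("/") + "/v1/page",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {channel['token']}",
            "Content-Type": "application/json",  # Assumed, not documented.
        },
        method="POST",
    )


# Hypothetical channel values; real ones come from get_ui_channel().
channel = {"baseUrl": "http://127.0.0.1:8765", "token": "example-token"}
body = {"datasetId": "example-id", "offset": 0, "limit": 50,
        "vars": ["price", "mpg"], "sortBy": ["-price"]}
request = build_page_request(channel, body)
```

Passing the request to urllib.request.urlopen would send it. The response format is not described here, so parse it defensively, and request a fresh channel once expiresAt has passed.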

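Score

Total Score

75/100

Based on repository quality metrics

✓ SKILL.md: Includes a SKILL.md file (+20)
✓ LICENSE: A license is set (+10)
✓ Description: The description is at least 100 characters long (+10)
○ Popularity: 100 or more GitHub stars (0/15)
✓ Recent activity: Updated within the last month (+10)
○ Forks: Forked 10 or more times (0/5)
✓ Issue management: Fewer than 50 open issues
✓ Language: A programming language is set
✓ Tags: At least one tag is set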
