usgs-data-download

by benchflow-ai

SkillsBench evaluates how well skills work and how effective agents are at using them

251 stars, 170 forks, Jan 23, 2026

SKILL.md


name: usgs-data-download
description: Download water level data from USGS using the dataretrieval package. Use when accessing real-time or historical streamflow data, downloading gage height or discharge measurements, or working with USGS station IDs.
license: MIT

USGS Data Download Guide

Overview

This guide covers downloading water level data from USGS using the dataretrieval Python package. USGS maintains thousands of stream gages across the United States, most of which record water levels at roughly 15-minute intervals.

Installation

pip install dataretrieval

The nwis module of dataretrieval is a reliable, straightforward way to access gage height data:

from dataretrieval import nwis

# Get instantaneous values (15-min intervals)
df, meta = nwis.get_iv(
    sites='<station_id>',
    start='<start_date>',
    end='<end_date>',
    parameterCd='00065'
)

# Get daily values (00060 = discharge; daily gage height is often unavailable)
df, meta = nwis.get_dv(
    sites='<station_id>',
    start='<start_date>',
    end='<end_date>',
    parameterCd='00060'
)

# Get site information
info, meta = nwis.get_info(sites='<station_id>')

Parameter Codes

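| Code | Parameter | Unit | Description |
|-------|-------------|------|-------------|
| 00065 | Gage height | feet | Water level above the gage datum |
| 00060 | Discharge | cfs (cubic feet per second) | Volumetric streamflow rate |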

nwis Module Functions

| Function | Description | Data Frequency |
|----------|-------------|----------------|
| nwis.get_iv() | Instantaneous values | ~15 minutes |
| nwis.get_dv() | Daily values | Daily |
| nwis.get_info() | Site information | N/A |
| nwis.get_stats() | Statistical summaries | N/A |
| nwis.get_discharge_peaks() | Annual peak discharge | Annual |

Returned DataFrame Structure

The DataFrame has a datetime index and these columns:

| Column | Description |
|--------|-------------|
| site_no | Station ID |
| 00065 | Water level value |
| 00065_cd | Quality code (can ignore) |

Downloading Multiple Stations

from dataretrieval import nwis

station_ids = ['<id_1>', '<id_2>', '<id_3>']
all_data = {}

for site_id in station_ids:
    try:
        df, meta = nwis.get_iv(
            sites=site_id,
            start='<start_date>',
            end='<end_date>',
            parameterCd='00065'
        )
        if len(df) > 0:
            all_data[site_id] = df
    except Exception as e:
        print(f"Failed to download {site_id}: {e}")

print(f"Successfully downloaded: {len(all_data)} stations")
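Once several stations have been downloaded, it is often convenient to stack them into a single DataFrame. The following is a minimal sketch, not part of dataretrieval: `combine_stations` is a hypothetical helper that assumes `all_data` is a dict of station ID to DataFrame, as built in the loop above, and that each frame has a datetime index.

```python
import pandas as pd


def combine_stations(all_data):
    """Stack per-station DataFrames into one frame keyed by station ID.

    Hypothetical helper: `all_data` maps station ID -> DataFrame.
    """
    if not all_data:
        # Nothing was downloaded; return an empty frame instead of raising
        return pd.DataFrame()
    # Passing a dict to concat uses its keys as an outer index level
    return pd.concat(all_data, names=["station_id", "datetime"])
```

Rows for a single station can then be selected with `combined.loc['<station_id>']`.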

Extracting the Value Column

# Find the gage height column (excludes quality code column)
gage_col = [c for c in df.columns if '00065' in str(c) and '_cd' not in str(c)]

if gage_col:
    water_levels = df[gage_col[0]]
    print(water_levels.head())
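The same column lookup can be wrapped in a small reusable function. This is a sketch under the assumption that value columns contain the parameter code and quality-code columns contain `_cd`, as described above; `find_value_column` is a hypothetical name, not a dataretrieval function.

```python
def find_value_column(columns, parameter_code="00065"):
    """Return the first column holding values for parameter_code, or None.

    Hypothetical helper: skips quality-code columns (those containing '_cd').
    """
    for col in columns:
        name = str(col)
        if parameter_code in name and "_cd" not in name:
            return col
    return None
```

With a downloaded frame, `find_value_column(df.columns)` returns the column to read, and a `None` result signals that the requested parameter is missing for that station.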

Common Issues

| Issue | Cause | Solution |
|-------|-------|----------|
| Empty DataFrame | Station has no data for date range | Try different dates or use get_iv() |
| get_dv() returns empty | No daily gage height data | Use get_iv() and aggregate |
| Connection error | Network issue | Wrap in try/except, retry |
| Rate limited | Too many requests | Add delays between requests |
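When get_dv() returns nothing for gage height, the instantaneous values can be aggregated to daily values locally. The sketch below assumes the frame has a datetime index and a value column named `00065`; `daily_mean` is a hypothetical helper, and a simple mean may not match how USGS computes its published daily values.

```python
import pandas as pd


def daily_mean(df, value_col="00065"):
    """Aggregate instantaneous readings to one mean value per calendar day.

    Hypothetical helper: assumes `df` has a DatetimeIndex.
    """
    # Coerce to numeric so stray non-numeric entries become NaN
    values = pd.to_numeric(df[value_col], errors="coerce")
    # "D" bins by calendar day in the time zone of the index
    return values.resample("D").mean()
```

If the index is time-zone aware, the day boundaries follow that time zone, so converting to local time before aggregating may be appropriate.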

Best Practices

  • Always wrap API calls in try/except for failed downloads
  • Check len(df) > 0 before processing
  • Station IDs for stream gages are typically 8-digit strings, often with leading zeros (e.g., '04119000'); pass them as strings so the zeros are preserved
  • Use get_iv() for gage height, as daily data is often unavailable
  • Filter columns to exclude quality code columns (_cd)
  • Break up large requests into smaller time periods to avoid timeouts
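Two of the practices above, splitting long periods and retrying failed calls, can be sketched with the standard library alone. Both helpers are hypothetical and not part of dataretrieval: `split_date_range` assumes ISO `YYYY-MM-DD` date strings, and `fetch_with_retry` takes any zero-argument callable, so the actual `nwis` call is supplied by the caller. The 30-day chunk size and the delay values are illustrative assumptions, not USGS recommendations.

```python
import time
from datetime import date, timedelta


def split_date_range(start, end, days=30):
    """Yield (chunk_start, chunk_end) ISO date strings covering start..end inclusive."""
    current = date.fromisoformat(start)
    last = date.fromisoformat(end)
    while current <= last:
        chunk_end = min(current + timedelta(days=days - 1), last)
        yield current.isoformat(), chunk_end.isoformat()
        current = chunk_end + timedelta(days=1)


def fetch_with_retry(fetch, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on failure wait base_delay * 2**attempt, then try again."""
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Usage sketch (needs network access and dataretrieval installed):
# from dataretrieval import nwis
# for chunk_start, chunk_end in split_date_range("2024-01-01", "2024-06-30"):
#     df, meta = fetch_with_retry(lambda: nwis.get_iv(
#         sites="<station_id>", start=chunk_start, end=chunk_end,
#         parameterCd="00065"))
```

Passing the fetch as a callable keeps the retry logic independent of any particular `nwis` function, and the growing delay between attempts also spaces out requests when the service is rate limiting.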
