
bib-cite-web
by Mearman
Plugin marketplace distributing extensions that add skills, commands, hooks and custom agents to the code environment.
SKILL.md
name: bib-cite-web description: Create bibliography citations from web page URLs with automatic Wayback Machine archival and metadata extraction. Use when the user asks to cite a website, create a citation for a URL, archive and cite a web page, or generate a bibliography entry from a web address.
Web Page Citation Creator
Create bibliography citations from web page URLs with automatic archival snapshot and metadata extraction.
Features
- Wayback Machine Integration: Automatically submits URLs to the Internet Archive for preservation
- Metadata Extraction: Extracts title, author, description, site name, and publish date from semantic HTML
- Multiple Formats: Outputs citations in BibTeX or CSL JSON format
- Smart Citation Keys: Generates citation keys from domain + author + year
Usage
npx tsx plugins/bib/scripts/cite-web.ts <url>
npx tsx plugins/bib/scripts/cite-web.ts <url> --format=bibtex
npx tsx plugins/bib/scripts/cite-web.ts <url> --no-wayback
npx tsx plugins/bib/scripts/cite-web.ts <url> --output=citations.bib
Metadata Extraction
The script extracts metadata from semantic HTML tags:
Title
<title>tag- Open Graph:
<meta property="og:title"> - Twitter Card:
<meta name="twitter:title"> - Standard:
<meta name="title">
Author
<meta name="author">- Open Graph:
<meta property="og:author">or<meta property="article:author"> - Twitter Card:
<meta name="twitter:creator">
Description
<meta name="description">- Open Graph:
<meta property="og:description"> - Twitter Card:
<meta name="twitter:description">
Site Name
- Open Graph:
<meta property="og:site_name"> <meta name="application-name">
Published Date
- Open Graph:
<meta property="article:published_time"> <meta name="publish-date">or<meta name="date">
Arguments
- Positional argument: URL to cite
--file <path>: Read URL from file (uses first line)--format <format>: Output format (default: bibtex)bibtexorbib: BibTeX formatcsl,json, orcsl-json: CSL JSON format
--no-wayback: Skip Wayback Machine submission (faster, but no archive)--output <file>: Write output to file (default: stdout)
Output Formats
BibTeX
@online{smithexample2024,
author = {John Smith},
title = {Example Article Title},
url = {https://example.com/article},
urldate = {2024-03-15},
year = {2024}
}
CSL JSON
[
{
"id": "smithexample2024",
"type": "webpage",
"title": "Example Article Title",
"author": [{"literal": "John Smith"}],
"URL": "https://example.com/article",
"accessed": {"date-parts": [[2024, 3, 15]]},
"archive-url": "https://web.archive.org/web/20240315123456/https://example.com/article"
}
]
Examples
Basic citation
npx tsx plugins/bib/scripts/cite-web.ts "https://example.com/article"
Output:
@online{example2024,
title = {Example Article Title},
url = {https://example.com/article},
urldate = {2024-03-15}
}
With Wayback archival
npx tsx plugins/bib/scripts/cite-web.ts "https://blog.example.com/post"
Output includes archive URL:
@online{example2024,
title = {Blog Post Title},
url = {https://blog.example.com/post},
urldate = {2024-03-15},
note = {Archived at https://web.archive.org/web/20240315123456/...}
}
CSL JSON format
npx tsx plugins/bib/scripts/cite-web.ts "https://docs.example.com" --format=csl
Skip archival (faster)
npx tsx plugins/bib/scripts/cite-web.ts "https://example.com" --no-wayback
Save to file
npx tsx plugins/bib/scripts/cite-web.ts "https://example.com" --output=citations.bib
Batch processing
# Create file with URLs (one per line)
echo "https://example.com/article1" > urls.txt
# Cite each URL
while read url; do
npx tsx plugins/bib/scripts/cite-web.ts "$url" >> citations.bib
done < urls.txt
Citation Key Generation
Citation keys are automatically generated from:
- Domain name:
example.com→example - Author (if available):
John Smith→smith - Year: Archive date or publish date or current year
Examples:
https://blog.example.com/postby John Smith (2024) →smithexample2024https://example.com/article(no author, 2023) →example2023
Wayback Machine Integration
By default, the script submits URLs to the Internet Archive's Wayback Machine for preservation:
- Submission: Sends URL to
https://web.archive.org/save/<url> - Archive URL: Extracts the permanent archive URL from response
- Archive Date: Records the snapshot timestamp
- Fallback: If submission fails, continues without archive
The archive URL is included in the citation:
- BibTeX: In
notefield or customarchiveurl/archivedatefields - CSL JSON: In
archive-urlfield
Skip archival with --no-wayback for faster execution when archiving isn't needed.
Error Handling
The script handles various error scenarios:
- Invalid URL: Validates URL format before processing
- Fetch failures: Reports HTTP errors with status codes
- Missing metadata: Falls back to "Untitled" for missing titles
- Wayback failures: Continues without archive if submission fails
- No author: Omits author field if not found
Errors are written to stderr, while citations are written to stdout (or file).
Limitations
- JavaScript-heavy sites: May not extract metadata from dynamically rendered content
- Paywalls: Cannot access content behind authentication
- Rate limiting: Wayback Machine may rate-limit submissions
- No PDF support: Only HTML pages (use separate tool for PDFs)
- Simple parsing: Uses regex matching, not full DOM parsing
For complex pages or JavaScript-rendered content, consider:
- Using
--no-waybackto skip archival - Manually editing the citation after generation
- Using browser developer tools to inspect metadata tags
Related Skills
- bib-create: Create bibliography entries interactively
- bib-read: View existing bibliography entries
- bib-convert: Convert between bibliography formats
- wayback-submit: Submit URLs to Wayback Machine without citation generation
スコア
総合スコア
リポジトリの品質指標に基づく評価
SKILL.mdファイルが含まれている
ライセンスが設定されている
100文字以上の説明がある
GitHub Stars 100以上
3ヶ月以内に更新
10回以上フォークされている
オープンIssueが50未満
プログラミング言語が設定されている
1つ以上のタグが設定されている
レビュー
レビュー機能は近日公開予定です

