
bib-cite-web
by Mearman
Plugin marketplace distributing extensions that add skills, commands, hooks and custom agents to the code environment.
SKILL.md
name: bib-cite-web
description: Create bibliography citations from web page URLs with automatic Wayback Machine archival and metadata extraction. Use when the user asks to cite a website, create a citation for a URL, archive and cite a web page, or generate a bibliography entry from a web address.
Web Page Citation Creator
Create bibliography citations from web page URLs with automatic archival snapshot and metadata extraction.
Features
- Wayback Machine Integration: Automatically submits URLs to the Internet Archive for preservation
- Metadata Extraction: Extracts title, author, description, site name, and publish date from semantic HTML
- Multiple Formats: Outputs citations in BibTeX or CSL JSON format
- Smart Citation Keys: Generates citation keys from author + domain + year (for example, smithexample2024)
Usage
npx tsx plugins/bib/scripts/cite-web.ts <url>
npx tsx plugins/bib/scripts/cite-web.ts <url> --format=bibtex
npx tsx plugins/bib/scripts/cite-web.ts <url> --no-wayback
npx tsx plugins/bib/scripts/cite-web.ts <url> --output=citations.bib
Metadata Extraction
The script extracts metadata from semantic HTML tags:
Title
- <title> tag
- Open Graph: <meta property="og:title">
- Twitter Card: <meta name="twitter:title">
- Standard: <meta name="title">
Author
- <meta name="author">
- Open Graph: <meta property="og:author"> or <meta property="article:author">
- Twitter Card: <meta name="twitter:creator">
Description
- <meta name="description">
- Open Graph: <meta property="og:description">
- Twitter Card: <meta name="twitter:description">
Site Name
- Open Graph: <meta property="og:site_name">
- <meta name="application-name">
Published Date
- Open Graph: <meta property="article:published_time">
- <meta name="publish-date"> or <meta name="date">
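The lookup described above can be sketched as follows. This is a minimal illustration, not the actual cite-web.ts code: the names `PageMetadata`, `parseAttributes`, `collectMeta`, `firstOf`, and `extractMetadata` are hypothetical, and the sketch assumes that the order in which tags are listed is also the lookup priority, which the documentation does not state.

```typescript
// Hypothetical names throughout; the real cite-web.ts may be organised differently.
interface PageMetadata {
  title?: string;
  author?: string;
  description?: string;
  siteName?: string;
  published?: string;
}

// Parse quoted attributes (name="value" or name='value') from a single tag.
function parseAttributes(tag: string): Record<string, string | undefined> {
  const attrs: Record<string, string | undefined> = {};
  const attrPattern = /([a-zA-Z:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
  for (const m of tag.matchAll(attrPattern)) {
    attrs[m[1].toLowerCase()] = m[2] ?? m[3] ?? "";
  }
  return attrs;
}

// Map each <meta> key (property or name) to its first non-empty content value.
function collectMeta(html: string): Map<string, string> {
  const meta = new Map<string, string>();
  for (const tag of html.match(/<meta\b[^>]*>/gi) ?? []) {
    const attrs = parseAttributes(tag);
    const key = (attrs["property"] ?? attrs["name"])?.toLowerCase();
    const content = attrs["content"]?.trim();
    if (key && content && !meta.has(key)) meta.set(key, content);
  }
  return meta;
}

// Return the value of the first key that is present.
function firstOf(meta: Map<string, string>, keys: string[]): string | undefined {
  for (const key of keys) {
    const value = meta.get(key);
    if (value) return value;
  }
  return undefined;
}

function extractMetadata(html: string): PageMetadata {
  const meta = collectMeta(html);
  const titleTag = /<title[^>]*>([^<]*)<\/title>/i.exec(html)?.[1]?.trim();
  return {
    // Assumption: the listed order is the lookup priority.
    title: titleTag || firstOf(meta, ["og:title", "twitter:title", "title"]),
    author: firstOf(meta, ["author", "og:author", "article:author", "twitter:creator"]),
    description: firstOf(meta, ["description", "og:description", "twitter:description"]),
    siteName: firstOf(meta, ["og:site_name", "application-name"]),
    published: firstOf(meta, ["article:published_time", "publish-date", "date"]),
  };
}
```

Because the matching is regex-based, a `>` inside an attribute value or an unquoted attribute would be missed, which is consistent with the "Simple parsing" limitation noted later in this document.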
Arguments
- Positional argument: URL to cite
- --file <path>: Read URL from file (uses first line)
- --format <format>: Output format (default: bibtex)
  - bibtex or bib: BibTeX format
  - csl, json, or csl-json: CSL JSON format
- --no-wayback: Skip Wayback Machine submission (faster, but no archive)
- --output <file>: Write output to file (default: stdout)
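To make the argument handling concrete, here is one way such options could be parsed. It is a hedged sketch only: `CliOptions`, `normalizeFormat`, and `parseCliArgs` are hypothetical names, and accepting both `--name=value` and `--name value` is an assumption based on the two spellings that appear in this document.

```typescript
// Hypothetical sketch; not the parser used by cite-web.ts.
type OutputFormat = "bibtex" | "csl-json";

interface CliOptions {
  url?: string;
  file?: string;
  format: OutputFormat;
  wayback: boolean;
  output?: string;
}

// Collapse the documented aliases onto the two supported formats.
function normalizeFormat(value: string): OutputFormat {
  switch (value.toLowerCase()) {
    case "bibtex":
    case "bib":
      return "bibtex";
    case "csl":
    case "json":
    case "csl-json":
      return "csl-json";
    default:
      throw new Error(`Unknown format: ${value}`);
  }
}

function parseCliArgs(argv: string[]): CliOptions {
  // Defaults follow the documentation: BibTeX output, Wayback submission enabled.
  const options: CliOptions = { format: "bibtex", wayback: true };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--no-wayback") {
      options.wayback = false;
    } else if (arg.startsWith("--")) {
      // Assumption: both --name=value and --name value are accepted.
      const eq = arg.indexOf("=");
      const name = eq === -1 ? arg.slice(2) : arg.slice(2, eq);
      const value = eq === -1 ? argv[++i] : arg.slice(eq + 1);
      if (value === undefined) throw new Error(`Missing value for --${name}`);
      if (name === "file") options.file = value;
      else if (name === "format") options.format = normalizeFormat(value);
      else if (name === "output") options.output = value;
      else throw new Error(`Unknown option: --${name}`);
    } else {
      options.url = arg; // positional argument: the URL to cite
    }
  }
  return options;
}
```

For example, `parseCliArgs(process.argv.slice(2))` would turn the command-line arguments into a typed options object before any fetching starts.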
Output Formats
BibTeX
@online{smithexample2024,
  author = {John Smith},
  title = {Example Article Title},
  url = {https://example.com/article},
  urldate = {2024-03-15},
  year = {2024}
}
CSL JSON
[
  {
    "id": "smithexample2024",
    "type": "webpage",
    "title": "Example Article Title",
    "author": [{"literal": "John Smith"}],
    "URL": "https://example.com/article",
    "accessed": {"date-parts": [[2024, 3, 15]]},
    "archive-url": "https://web.archive.org/web/20240315123456/https://example.com/article"
  }
]
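As an illustration of how one citation record could be rendered into both formats shown above, consider the sketch below. The `Citation` interface and the `toBibtex` and `toCslJson` functions are hypothetical and are not taken from cite-web.ts. The sketch does not escape BibTeX special characters, which a real implementation would likely need to do.

```typescript
// Hypothetical record shape; field names are assumptions for illustration.
interface Citation {
  key: string;
  title: string;
  url: string;
  accessed: string; // ISO date, e.g. "2024-03-15"
  author?: string;
  year?: number;
  archiveUrl?: string;
}

// Render an @online entry, omitting fields that have no value.
function toBibtex(c: Citation): string {
  const fields: Array<[string, string | undefined]> = [
    ["author", c.author],
    ["title", c.title],
    ["url", c.url],
    ["urldate", c.accessed],
    ["year", c.year?.toString()],
    ["note", c.archiveUrl ? `Archived at ${c.archiveUrl}` : undefined],
  ];
  const body = fields
    .filter(([, value]) => value !== undefined)
    .map(([name, value]) => `  ${name} = {${value}}`)
    .join(",\n");
  return `@online{${c.key},\n${body}\n}`;
}

// Build a single CSL JSON item; callers wrap items in an array for output.
function toCslJson(c: Citation): Record<string, unknown> {
  const item: Record<string, unknown> = {
    id: c.key,
    type: "webpage",
    title: c.title,
    URL: c.url,
    accessed: { "date-parts": [c.accessed.split("-").map(Number)] },
  };
  if (c.author) item["author"] = [{ literal: c.author }];
  if (c.archiveUrl) item["archive-url"] = c.archiveUrl;
  return item;
}
```

Calling `JSON.stringify([toCslJson(citation)], null, 2)` would produce an array in the same shape as the CSL JSON example.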
Examples
Basic citation
npx tsx plugins/bib/scripts/cite-web.ts "https://example.com/article"
Output:
@online{example2024,
  title = {Example Article Title},
  url = {https://example.com/article},
  urldate = {2024-03-15}
}
With Wayback archival
npx tsx plugins/bib/scripts/cite-web.ts "https://blog.example.com/post"
Output includes archive URL:
@online{example2024,
  title = {Blog Post Title},
  url = {https://blog.example.com/post},
  urldate = {2024-03-15},
  note = {Archived at https://web.archive.org/web/20240315123456/...}
}
CSL JSON format
npx tsx plugins/bib/scripts/cite-web.ts "https://docs.example.com" --format=csl
Skip archival (faster)
npx tsx plugins/bib/scripts/cite-web.ts "https://example.com" --no-wayback
Save to file
npx tsx plugins/bib/scripts/cite-web.ts "https://example.com" --output=citations.bib
Batch processing
# Create file with URLs (one per line)
echo "https://example.com/article1" > urls.txt
# Cite each URL
while IFS= read -r url; do
  npx tsx plugins/bib/scripts/cite-web.ts "$url" >> citations.bib
done < urls.txt
Citation Key Generation
Citation keys are automatically generated from:
- Domain name: example.com → example
- Author (if available): John Smith → smith
- Year: Archive date, or publish date, or current year
Examples:
- https://blog.example.com/post by John Smith (2024) → smithexample2024
- https://example.com/article (no author, 2023) → example2023
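The key scheme in these examples can be approximated with a few lines of TypeScript. This is a sketch under assumptions: `domainLabel`, `authorLabel`, `pickYear`, and `citationKey` are hypothetical names, the domain part is taken as the second-to-last hostname label (which would give the wrong result for suffixes such as co.uk), the author part is the last word of the name, and the year sources are treated as a fallback chain.

```typescript
// Hypothetical helpers; cite-web.ts may derive each part differently.

// "blog.example.com" -> "example" (assumption: second-to-last hostname label).
function domainLabel(url: string): string {
  const labels = new URL(url).hostname.toLowerCase().split(".");
  const label = labels.length >= 2 ? labels[labels.length - 2] : labels[0];
  return label.replace(/[^a-z0-9]/g, "");
}

// "John Smith" -> "smith" (assumption: the last word is the family name).
function authorLabel(author?: string): string {
  if (!author) return "";
  const words = author.trim().split(/\s+/);
  return words[words.length - 1].toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Assumption: archive date wins, then publish date, then the current date.
function pickYear(archiveDate?: Date, publishDate?: Date, now: Date = new Date()): number {
  return (archiveDate ?? publishDate ?? now).getUTCFullYear();
}

function citationKey(url: string, author: string | undefined, year: number): string {
  return `${authorLabel(author)}${domainLabel(url)}${year}`;
}
```

With these helpers, `citationKey("https://blog.example.com/post", "John Smith", 2024)` yields `smithexample2024`, matching the first example.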
Wayback Machine Integration
By default, the script submits URLs to the Internet Archive's Wayback Machine for preservation:
- Submission: Sends URL to https://web.archive.org/save/<url>
- Archive URL: Extracts the permanent archive URL from the response
- Archive Date: Records the snapshot timestamp
- Fallback: If submission fails, continues without archive
The archive URL is included in the citation:
- BibTeX: In note field or custom archiveurl/archivedate fields
- CSL JSON: In archive-url field
Skip archival with --no-wayback for faster execution when archiving isn't needed.
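The submission and fallback steps might look roughly like the sketch below. `ArchiveInfo`, `parseArchiveUrl`, and `submitToWayback` are hypothetical names. Reading the snapshot address from `response.url` after redirects is an assumption, since the documentation only says the archive URL is extracted from the response. The global `fetch` used here requires Node 18 or later.

```typescript
// Hypothetical sketch; the real script may read the archive URL differently.
interface ArchiveInfo {
  archiveUrl: string;
  archiveDate: string; // ISO date derived from the snapshot timestamp
}

// Extract the YYYYMMDDhhmmss timestamp from a snapshot URL such as
// https://web.archive.org/web/20240315123456/https://example.com/article
function parseArchiveUrl(archiveUrl: string): ArchiveInfo | undefined {
  const match = /^https?:\/\/web\.archive\.org\/web\/(\d{4})(\d{2})(\d{2})\d{6}\//.exec(archiveUrl);
  if (!match) return undefined;
  const [, year, month, day] = match;
  return { archiveUrl, archiveDate: `${year}-${month}-${day}` };
}

// Submit a URL for archival; resolve to undefined on any failure so that
// citation generation can continue without an archive link.
async function submitToWayback(url: string): Promise<ArchiveInfo | undefined> {
  try {
    const response = await fetch(`https://web.archive.org/save/${url}`);
    if (!response.ok) return undefined;
    // Assumption: after redirects, response.url is the snapshot URL.
    return parseArchiveUrl(response.url);
  } catch {
    return undefined;
  }
}
```

Returning `undefined` instead of throwing keeps the fallback behaviour simple: the caller only adds archive fields when a result is present.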
Error Handling
The script handles various error scenarios:
- Invalid URL: Validates URL format before processing
- Fetch failures: Reports HTTP errors with status codes
- Missing metadata: Falls back to "Untitled" for missing titles
- Wayback failures: Continues without archive if submission fails
- No author: Omits author field if not found
Errors are written to stderr, while citations are written to stdout (or file).
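A few of these checks can be sketched as small helpers. The names `validateUrl`, `titleOrFallback`, and `fetchHtml` are hypothetical, and restricting input to http and https is an assumption rather than documented behaviour.

```typescript
// Hypothetical helpers illustrating the error-handling rules above.

// Reject strings that do not parse as http(s) URLs before doing any work.
function validateUrl(input: string): URL {
  let parsed: URL;
  try {
    parsed = new URL(input);
  } catch {
    throw new Error(`Invalid URL: ${input}`);
  }
  // Assumption: only web pages served over http or https are supported.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported URL scheme: ${parsed.protocol}`);
  }
  return parsed;
}

// Missing or blank titles fall back to "Untitled".
function titleOrFallback(title?: string): string {
  const trimmed = title?.trim();
  return trimmed ? trimmed : "Untitled";
}

// Fetch a page and surface HTTP failures with their status code.
async function fetchHtml(url: URL): Promise<string> {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`HTTP ${response.status} ${response.statusText} for ${url.href}`);
  }
  return response.text();
}
```

A caller could catch these errors at the top level and print the message with `console.error`, which writes to stderr and leaves stdout free for the citation itself.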
Limitations
- JavaScript-heavy sites: May not extract metadata from dynamically rendered content
- Paywalls: Cannot access content behind authentication
- Rate limiting: Wayback Machine may rate-limit submissions
- No PDF support: Only HTML pages (use separate tool for PDFs)
- Simple parsing: Uses regex matching, not full DOM parsing
For complex pages or JavaScript-rendered content, consider:
- Using --no-wayback to skip archival
- Manually editing the citation after generation
- Using browser developer tools to inspect metadata tags
Related Skills
- bib-create: Create bibliography entries interactively
- bib-read: View existing bibliography entries
- bib-convert: Convert between bibliography formats
- wayback-submit: Submit URLs to Wayback Machine without citation generation

