AI Tools
v1.0.2
active

superfetch

io.github.j0hanz/superfetch

Intelligent web content fetcher MCP server that converts HTML to clean, AI-readable Markdown

Documentation

superFetch MCP Server

A Model Context Protocol (MCP) server that fetches web pages, extracts readable content with Mozilla Readability, and returns AI-friendly Markdown.

Quick Start | Tool | Resources | Configuration | Security | Development

Published to MCP Registry - Search for io.github.j0hanz/superfetch


Caution: This server can access URLs on behalf of AI assistants. Built-in SSRF protection blocks private IP ranges and cloud metadata endpoints, but exercise caution when deploying in sensitive environments.

Features

| Feature | Description |
| --- | --- |
| Smart extraction | Mozilla Readability with quality gates to strip boilerplate when it improves results |
| Clean Markdown | Markdown output with optional YAML frontmatter (title + source) |
| Raw content handling | Preserves raw markdown/text, detects common text extensions, and rewrites GitHub/GitLab/Bitbucket/Gist URLs to raw |
| Built-in caching | In-memory cache with TTL, max keys, and resource subscriptions |
| Resilient fetching | Redirect handling with validation, timeouts, and response size limits |
| Security first | URL validation plus SSRF/DNS/IP blocklists |
| HTTP mode | Static token or OAuth auth, session management, rate limiting, host/origin validation |

Quick Start

Add superFetch to your MCP client configuration - no installation required.

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "superFetch": {
      "command": "npx",
      "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
    }
  }
}

VS Code

Add to .vscode/mcp.json in your workspace:

{
  "servers": {
    "superFetch": {
      "command": "npx",
      "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
    }
  }
}

With Custom Configuration

Add environment variables in your MCP client config under env. See Configuration or CONFIGURATION.md for all available options and presets.
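
As a rough illustration, the sketch below builds a client entry that carries an env block. The variable names come from the Configuration section; the values, the McpServerEntry type, and the buildSuperFetchEntry helper are assumptions made for this example. Values are written as strings because they are passed to the server process as environment variables.

```typescript
// Hypothetical shape of one entry under "mcpServers" in a client config.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

// Build the superFetch entry, attaching env only when overrides are given.
function buildSuperFetchEntry(env: Record<string, string> = {}): McpServerEntry {
  const entry: McpServerEntry = {
    command: "npx",
    args: ["-y", "@j0hanz/superfetch@latest", "--stdio"],
  };
  if (Object.keys(env).length > 0) {
    entry.env = env;
  }
  return entry;
}

// Illustrative values only; both variables are listed under Configuration.
const clientConfig = {
  mcpServers: {
    superFetch: buildSuperFetchEntry({ CACHE_TTL: "7200", LOG_LEVEL: "debug" }),
  },
};

// Paste the printed JSON into the client's configuration file.
console.log(JSON.stringify(clientConfig, null, 2));
```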

Cursor

  1. Open Cursor Settings
  2. Go to Features > MCP Servers
  3. Click "+ Add new global MCP server"
  4. Add this configuration:
{
  "mcpServers": {
    "superFetch": {
      "command": "npx",
      "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
    }
  }
}

Tip (Windows): If you encounter issues, try: cmd /c "npx -y @j0hanz/superfetch@latest --stdio"

Codex IDE

Add to your ~/.codex/config.toml file:

Basic Configuration:

[mcp_servers.superfetch]
command = "npx"
args = ["-y", "@j0hanz/superfetch@latest", "--stdio"]

With Environment Variables: See CONFIGURATION.md for examples.

Access config file: Click the gear icon -> "Codex Settings > Open config.toml"

Documentation: Codex MCP Guide

Cline (VS Code Extension)

Open the Cline MCP settings file:

macOS:

code ~/Library/Application\ Support/Code/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json

Windows:

code %APPDATA%\Code\User\globalStorage\saoudrizwan.claude-dev\settings\cline_mcp_settings.json

Add the configuration:

{
  "mcpServers": {
    "superFetch": {
      "command": "npx",
      "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"],
      "disabled": false,
      "autoApprove": []
    }
  }
}

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "superFetch": {
      "command": "npx",
      "args": ["-y", "@j0hanz/superfetch@latest", "--stdio"]
    }
  }
}

Claude Desktop (Config File Locations)

macOS:

# Open config file
open -e "$HOME/Library/Application Support/Claude/claude_desktop_config.json"

# Or with VS Code
code "$HOME/Library/Application Support/Claude/claude_desktop_config.json"

Windows:

code %APPDATA%\Claude\claude_desktop_config.json

Installation (Alternative)

Global Installation

npm install -g @j0hanz/superfetch

# Run in stdio mode
superfetch --stdio

# Run HTTP server (requires auth token)
superfetch

From Source

git clone https://github.com/j0hanz/super-fetch-mcp-server.git
cd super-fetch-mcp-server
npm install
npm run build

Running the Server

stdio Mode (direct MCP integration)

node dist/index.js --stdio

HTTP Mode (default)

HTTP mode requires authentication. By default it binds to 127.0.0.1. To listen on all interfaces, set HOST=0.0.0.0 or HOST=:: and configure OAuth (remote bindings require OAuth). Other non-loopback HOST values are rejected.

API_KEY=supersecret npx -y @j0hanz/superfetch@latest
# Server runs at http://127.0.0.1:3000

Windows (PowerShell):

$env:API_KEY = "supersecret"
npx -y @j0hanz/superfetch@latest

For multiple static tokens, set ACCESS_TOKENS (comma/space separated).

Auth is required for /mcp and /mcp/downloads via Authorization: Bearer <token> (static mode also accepts X-API-Key).

Endpoints:

  • GET /health (no auth; returns status, name, version, uptime)
  • POST /mcp (auth required)
  • GET /mcp (auth required; SSE stream; requires Accept: text/event-stream)
  • DELETE /mcp (auth required)
  • GET /mcp/downloads/:namespace/:hash (auth required)

Sessions are managed via the mcp-session-id header (see HTTP Mode Details).
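
The header rules above could be captured in a small client-side helper. This is a sketch only: buildAuthHeaders is a hypothetical name, and the X-API-Key branch applies only when the server runs in static token mode.

```typescript
// Build request headers for the authenticated endpoints (/mcp and /mcp/downloads).
// Set useApiKeyHeader only against a server in static token mode, where
// X-API-Key is accepted as an alternative to the Bearer header.
function buildAuthHeaders(token: string, useApiKeyHeader = false): Record<string, string> {
  if (token.trim() === "") {
    throw new Error("A non-empty token is required in HTTP mode");
  }
  return useApiKeyHeader
    ? { "X-API-Key": token }
    : { Authorization: `Bearer ${token}` };
}

console.log(buildAuthHeaders("supersecret"));
console.log(buildAuthHeaders("supersecret", true));
```

GET /health needs neither header, so a liveness probe can skip this helper entirely.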


Available Tools

Tool Response Notes

The tool returns structuredContent with url, an optional title, and markdown when inline content is available. On errors, structuredContent carries error in place of markdown.

The response includes:

  • a text block containing JSON of structuredContent
  • a resource block embedding markdown when inline content is available (always in stdio mode)
  • when content exceeds the inline limit and cache is enabled, a resource_link block pointing to superfetch://cache/... (inline markdown may be omitted)
  • error responses set isError: true and return structuredContent with error and url
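
A client might walk these blocks roughly as follows. The type definitions are trimmed to the fields this sketch reads, and readFetchUrlResult and its preference order (embedded resource, then resource_link, then structuredContent.markdown) are assumptions, not behavior the server prescribes.

```typescript
// Content block shapes, reduced to what this sketch needs.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "resource"; resource: { uri: string; mimeType?: string; text?: string } }
  | { type: "resource_link"; uri: string; name?: string };

interface FetchUrlResult {
  content: ContentBlock[];
  structuredContent?: { url: string; title?: string; markdown?: string; error?: string };
  isError?: boolean;
}

type Outcome =
  | { kind: "inline"; markdown: string }
  | { kind: "cached"; uri: string }
  | { kind: "error"; message: string };

function readFetchUrlResult(result: FetchUrlResult): Outcome {
  if (result.isError) {
    return { kind: "error", message: result.structuredContent?.error ?? "unknown error" };
  }
  // Prefer the embedded resource block, which carries the markdown itself.
  for (const block of result.content) {
    if (block.type === "resource" && block.resource.text !== undefined) {
      return { kind: "inline", markdown: block.resource.text };
    }
  }
  // Large content with cache enabled: hand back the superfetch://cache/... link.
  for (const block of result.content) {
    if (block.type === "resource_link") {
      return { kind: "cached", uri: block.uri };
    }
  }
  // Fall back to the markdown field of structuredContent, if present.
  const markdown = result.structuredContent?.markdown;
  if (markdown !== undefined) {
    return { kind: "inline", markdown };
  }
  return { kind: "error", message: "response contained no markdown or resource link" };
}
```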

fetch-url

Fetches a webpage and converts it to clean Markdown format with optional frontmatter.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| url | string | required | URL to fetch |

Example structuredContent:

{
  "url": "https://example.com/docs",
  "title": "Documentation",
  "markdown": "---\ntitle: Documentation\n---\n\n# Getting Started\n\nWelcome..."
}

Error response:

{
  "url": "https://example.com/broken",
  "error": "Failed to fetch: 404 Not Found"
}

Large Content Handling

  • Inline markdown is capped at 20,000 characters (maxInlineContentChars).
  • Stdio mode: full markdown is embedded as a resource block.
  • HTTP mode: if content exceeds the inline limit and cache is enabled, the response includes a resource_link to superfetch://cache/... (no embedded markdown). If cache is disabled, the inline markdown is truncated with ...[truncated].
  • Upstream fetch size is capped at 10 MB of HTML; larger responses fail.
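
The delivery rules above can be read as a small decision function. The sketch below is an illustration rather than the server's code: chooseDelivery and truncateInline are hypothetical names, and it assumes the truncation marker is appended after the first 20,000 characters (the server may count the marker toward the limit instead).

```typescript
const MAX_INLINE_CHARS = 20_000; // inline limit stated above
const TRUNCATION_MARKER = "...[truncated]"; // marker text stated above

type Transport = "stdio" | "http";
type Delivery = "inline" | "embedded-resource" | "resource-link" | "truncated-inline";

// Decide how markdown of a given length reaches the client.
function chooseDelivery(length: number, transport: Transport, cacheEnabled: boolean): Delivery {
  if (transport === "stdio") return "embedded-resource"; // full markdown, always
  if (length <= MAX_INLINE_CHARS) return "inline";
  return cacheEnabled ? "resource-link" : "truncated-inline";
}

// Assumption: keep the first `limit` characters, then append the marker.
function truncateInline(markdown: string, limit: number = MAX_INLINE_CHARS): string {
  if (markdown.length <= limit) return markdown;
  return markdown.slice(0, limit) + TRUNCATION_MARKER;
}

console.log(chooseDelivery(50_000, "http", true));
console.log(chooseDelivery(50_000, "http", false));
```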

Resources

| URI | Description |
| --- | --- |
| superfetch://cache/{namespace}/{urlHash} | Cached content entry (namespace: markdown) |

Resource listings enumerate cached entries, and subscriptions notify clients when cache entries update.


Download Endpoint (HTTP Mode)

When running in HTTP mode, cached content can be downloaded directly. Downloads are available only when cache is enabled.

Endpoint

GET /mcp/downloads/:namespace/:hash

  • namespace: markdown
  • Auth required (Authorization: Bearer <token>; in static token mode, X-API-Key is accepted)

Response Headers

| Header | Value |
| --- | --- |
| Content-Type | text/markdown; charset=utf-8 |
| Content-Disposition | attachment; filename="<name>" |
| Cache-Control | private, max-age=<CACHE_TTL> |

Example Usage

curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:3000/mcp/downloads/markdown/abc123.def456 \
  -o article.md

Error Responses

| Status | Code | Description |
| --- | --- | --- |
| 400 | BAD_REQUEST | Invalid namespace or hash format |
| 404 | NOT_FOUND | Content not found or expired |
| 503 | SERVICE_UNAVAILABLE | Download service disabled |
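
A client that receives a superfetch://cache/... link may want to turn it into a download URL. The sketch below assumes that the {urlHash} segment of the resource URI is the same value the download endpoint expects as :hash; the README does not state this mapping, so treat it as unverified. parseCacheUri and downloadUrl are hypothetical helpers.

```typescript
interface CacheRef {
  namespace: string;
  hash: string;
}

// Split superfetch://cache/{namespace}/{urlHash} into its two segments.
function parseCacheUri(uri: string): CacheRef | null {
  const match = /^superfetch:\/\/cache\/([^/]+)\/([^/]+)$/.exec(uri);
  if (match === null) return null;
  const namespace = match[1];
  const hash = match[2];
  if (namespace === undefined || hash === undefined) return null;
  return { namespace, hash };
}

// Assumption: the cache hash doubles as the :hash path parameter.
function downloadUrl(baseUrl: string, ref: CacheRef): string {
  const base = baseUrl.replace(/\/+$/, "");
  return `${base}/mcp/downloads/${encodeURIComponent(ref.namespace)}/${encodeURIComponent(ref.hash)}`;
}

const ref = parseCacheUri("superfetch://cache/markdown/abc123.def456");
if (ref !== null) {
  console.log(downloadUrl("http://localhost:3000", ref));
}
```

A 404 from the resulting URL would mean the entry expired or was never cached, and a 503 would mean the cache is disabled.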

Configuration

Set environment variables in your MCP client env or in the shell before starting the server.

Core Server Settings

| Variable | Default | Description |
| --- | --- | --- |
| HOST | 127.0.0.1 | HTTP bind address |
| PORT | 3000 | HTTP server port (1024-65535) |
| USER_AGENT | superFetch-MCP/2.0 | User-Agent header for outgoing requests |
| CACHE_ENABLED | true | Enable response caching |
| CACHE_TTL | 3600 | Cache TTL in seconds (60-86400) |
| LOG_LEVEL | info | debug, info, warn, error |
| ALLOWED_HOSTS | (empty) | Additional allowed Host/Origin values (comma/space separated) |
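
The defaults and ranges in the table can be checked before launching the server. This is a minimal sketch: readIntInRange and readCoreSettings are hypothetical helpers, and throwing on an out-of-range value is an assumption (the server may handle invalid values differently).

```typescript
type Env = Record<string, string | undefined>;

// Parse an integer variable, applying the documented default and range.
function readIntInRange(env: Env, key: string, fallback: number, min: number, max: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < min || value > max) {
    throw new RangeError(`${key} must be an integer in [${min}, ${max}], got "${raw}"`);
  }
  return value;
}

// Defaults and ranges are taken from the table above.
function readCoreSettings(env: Env) {
  return {
    host: env["HOST"] ?? "127.0.0.1",
    port: readIntInRange(env, "PORT", 3000, 1024, 65535),
    cacheTtlSeconds: readIntInRange(env, "CACHE_TTL", 3600, 60, 86400),
  };
}

console.log(readCoreSettings({ PORT: "8080", CACHE_TTL: "600" }));
```

In a Node.js script, passing process.env as the argument would check the current environment.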

Auth (HTTP Mode)

| Variable | Default | Description |
| --- | --- | --- |
| AUTH_MODE | auto | static or oauth. Auto-selects OAuth if any OAUTH URL set |
| ACCESS_TOKENS | (empty) | Comma/space-separated static bearer tokens |
| API_KEY | (empty) | Adds a static bearer token and enables X-API-Key header |
Static mode requires at least one token (ACCESS_TOKENS or API_KEY).
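
The selection rules could be sketched as below. This is an interpretation, not the server's code: resolveAuth is a hypothetical name, and the list of variables that count as "an OAuth URL" is an assumption based on the OAUTH_*_URL names documented in the OAuth section.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: any of these being non-empty triggers OAuth in auto mode.
const OAUTH_URL_VARS = [
  "OAUTH_ISSUER_URL",
  "OAUTH_AUTHORIZATION_URL",
  "OAUTH_TOKEN_URL",
  "OAUTH_INTROSPECTION_URL",
];

// ACCESS_TOKENS is comma/space separated.
function splitTokens(value: string | undefined): string[] {
  return (value ?? "").split(/[\s,]+/).filter((token) => token.length > 0);
}

function resolveAuth(env: Env): { mode: "static" | "oauth"; tokens: string[] } {
  const explicit = env["AUTH_MODE"];
  const hasOauthUrl = OAUTH_URL_VARS.some((key) => (env[key] ?? "") !== "");
  const mode =
    explicit === "static" || explicit === "oauth" ? explicit : hasOauthUrl ? "oauth" : "static";

  const tokens = splitTokens(env["ACCESS_TOKENS"]);
  const apiKey = (env["API_KEY"] ?? "").trim();
  if (apiKey !== "") tokens.push(apiKey); // API_KEY adds one more static token

  if (mode === "static" && tokens.length === 0) {
    throw new Error("Static mode requires ACCESS_TOKENS or API_KEY");
  }
  return { mode, tokens: mode === "static" ? tokens : [] };
}

console.log(resolveAuth({ API_KEY: "supersecret" }));
```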

OAuth (HTTP Mode)

Required when AUTH_MODE=oauth (or auto-selected by presence of OAuth URLs):

| Variable | Default | Description |
| --- | --- | --- |
| OAUTH_ISSUER_URL | - | OAuth issuer |
| OAUTH_AUTHORIZATION_URL | - | Authorization endpoint |
| OAUTH_TOKEN_URL | - | Token endpoint |
| OAUTH_INTROSPECTION_URL | - | Introspection endpoint |

Optional:

| Variable | Default | Description |
| --- | --- | --- |
| OAUTH_REVOCATION_URL | - | Revocation endpoint |
| OAUTH_REGISTRATION_URL | - | Dynamic client registration endpoint |
| OAUTH_RESOURCE_URL | http://<host>:<port>/mcp | Protected resource URL |
| OAUTH_REQUIRED_SCOPES | (empty) | Required scopes (comma/space separated) |
| OAUTH_CLIENT_ID | - | Client ID for introspection |
| OAUTH_CLIENT_SECRET | - | Client secret for introspection |
| OAUTH_INTROSPECTION_TIMEOUT_MS | 5000 | Introspection timeout (1000-30000) |

Fixed Limits (Not Configurable via env)

  • Fetch timeout: 15 seconds
  • Max redirects: 5
  • Max HTML response size: 10 MB
  • Inline markdown limit: 20,000 characters
  • Cache max entries: 100
  • Session TTL: 30 minutes
  • Session init timeout: 10 seconds
  • Max sessions: 200
  • Rate limit: 100 req/min per IP (60s window)

See CONFIGURATION.md for preset examples and quick-start snippets.


HTTP Mode Details

HTTP mode uses the MCP Streamable HTTP transport. The workflow is:

  1. POST /mcp with an initialize request and no mcp-session-id header.
  2. The server returns mcp-session-id in the response headers.
  3. Use that header for subsequent POST /mcp, GET /mcp, and DELETE /mcp requests.

If the mcp-protocol-version header is missing, the server defaults it to 2025-03-26. Supported versions are 2025-03-26 and 2025-11-25.

GET /mcp and DELETE /mcp require mcp-session-id. A POST /mcp that has no mcp-session-id header and is not an initialize request returns 400.
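
The session handshake can be sketched on the client side as follows. The HttpClient type is a hypothetical stand-in for a fetch-like function so the sketch can be exercised without a network (the global fetch in Node.js 18+ has a compatible shape), and the clientInfo values are placeholders. The Accept header on POST follows the MCP Streamable HTTP transport, which asks clients to list both JSON and SSE.

```typescript
type HttpResponse = { status: number; headers: { get(name: string): string | null } };
type HttpClient = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string },
) => Promise<HttpResponse>;

// JSON-RPC initialize request; clientInfo values are placeholders.
function buildInitializeRequest(protocolVersion = "2025-03-26") {
  return {
    jsonrpc: "2.0",
    id: 1,
    method: "initialize",
    params: {
      protocolVersion,
      capabilities: {},
      clientInfo: { name: "example-client", version: "0.0.0" },
    },
  };
}

// Steps 1 and 2: POST /mcp without mcp-session-id, then read the header back.
async function openSession(http: HttpClient, baseUrl: string, token: string): Promise<string> {
  const response = await http(`${baseUrl}/mcp`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify(buildInitializeRequest()),
  });
  const sessionId = response.headers.get("mcp-session-id");
  if (response.status >= 400 || sessionId === null) {
    throw new Error(`initialize failed with status ${response.status}`);
  }
  return sessionId;
}

// Step 3: headers for follow-up POST, GET, and DELETE requests.
function sessionHeaders(token: string, sessionId: string): Record<string, string> {
  return {
    Authorization: `Bearer ${token}`,
    "mcp-session-id": sessionId,
    "mcp-protocol-version": "2025-03-26",
  };
}
```

Sending DELETE /mcp with the same session headers ends the session explicitly, which frees a slot before the 30-minute session TTL expires.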

Additional HTTP transport notes:

  • GET /mcp requires Accept: text/event-stream (otherwise 406).
  • JSON-RPC batch requests are not supported (400).

If the server reaches its session cap (200), it evicts the oldest session when possible; otherwise it returns a 503.

Host and Origin headers are always validated. Allowed values include loopback hosts, the configured HOST (if not a wildcard), and any entries in ALLOWED_HOSTS. When binding to 0.0.0.0 or ::, set ALLOWED_HOSTS to the hostnames clients will send.


Security

SSRF Protection

Blocked destinations include:

  • Loopback and unspecified addresses (127.0.0.0/8, ::1, 0.0.0.0, ::)
  • Private/ULA ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, fc00::/7)
  • Link-local and shared address space (169.254.0.0/16, 100.64.0.0/10, fe80::/10)
  • Multicast/reserved ranges (224.0.0.0/4, 240.0.0.0/4, ff00::/8)
  • IPv6 transition ranges (64:ff9b::/96, 64:ff9b:1::/48, 2001::/32, 2002::/16)
  • Cloud metadata endpoints (AWS/GCP/Azure/Alibaba) like 169.254.169.254, metadata.google.internal, metadata.azure.com, 100.100.100.200, instance-data
  • Internal suffixes such as .local and .internal

Hostnames are resolved via DNS before fetching, and the request is blocked if any resolved IP falls in a blocked range.
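
To make the range check concrete, here is a minimal sketch covering only the IPv4 ranges listed above. It is not the server's implementation: the function names are hypothetical, IPv6 handling is omitted, and anything that does not parse as dotted-quad IPv4 is treated as blocked.

```typescript
// IPv4 ranges copied from the list above; 0.0.0.0 is written as a /32.
const BLOCKED_IPV4_CIDRS = [
  "127.0.0.0/8", "0.0.0.0/32", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
  "169.254.0.0/16", "100.64.0.0/10", "224.0.0.0/4", "240.0.0.0/4",
];

// Convert dotted-quad text to an unsigned 32-bit value, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// Compare network prefixes by dividing away the host portion.
function inCidr(ip: number, cidr: string): boolean {
  const [base, bitsText] = cidr.split("/");
  const baseInt = ipv4ToInt(base ?? "");
  const bits = Number(bitsText);
  if (baseInt === null || !Number.isInteger(bits) || bits < 0 || bits > 32) return false;
  const hostSize = 2 ** (32 - bits);
  return Math.floor(ip / hostSize) === Math.floor(baseInt / hostSize);
}

function isBlockedIpv4(ip: string): boolean {
  const value = ipv4ToInt(ip);
  if (value === null) return true; // fail closed on anything unparseable
  return BLOCKED_IPV4_CIDRS.some((cidr) => inCidr(value, cidr));
}

// Mirror the DNS rule: one blocked address blocks the whole request.
function anyResolvedAddressBlocked(resolved: string[]): boolean {
  return resolved.some(isBlockedIpv4);
}
```

Note that the Alibaba metadata address 100.100.100.200 is already covered by the shared address space range 100.64.0.0/10, and 169.254.169.254 by the link-local range.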

URL Validation

  • Only http and https URLs
  • No embedded credentials in URLs
  • Max URL length: 2048 characters
  • Hostnames ending in .local or .internal are rejected
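
These checks can be approximated with the standard URL class. The sketch below is illustrative: checkUrl is a hypothetical name, the reason strings are invented for the example, and stripping a trailing dot from the hostname is an extra precaution the README does not mention.

```typescript
const MAX_URL_LENGTH = 2048;
const BLOCKED_SUFFIXES = [".local", ".internal"];

// Returns null when the URL passes every check, otherwise a reason string.
function checkUrl(input: string): string | null {
  if (input.length > MAX_URL_LENGTH) return "URL exceeds 2048 characters";

  let parsed: URL;
  try {
    parsed = new URL(input);
  } catch {
    return "not a valid absolute URL";
  }

  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return "only http and https are allowed";
  }
  if (parsed.username !== "" || parsed.password !== "") {
    return "embedded credentials are not allowed";
  }

  // Normalize case and a trailing dot before the suffix check.
  const host = parsed.hostname.toLowerCase().replace(/\.$/, "");
  if (BLOCKED_SUFFIXES.some((suffix) => host.endsWith(suffix))) {
    return "internal hostnames are rejected";
  }
  return null;
}

console.log(checkUrl("https://example.com/docs"));
console.log(checkUrl("http://printer.local/status"));
```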

Host/Origin Validation (HTTP Mode)

  • Host header must match loopback, configured HOST (if not a wildcard), or ALLOWED_HOSTS
  • Origin header (when present) is validated against the same allow-list
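
One way to picture the allow-list is sketched below. It is an interpretation under stated assumptions: the helper names are hypothetical, ALLOWED_HOSTS entries are treated as bare hostnames without ports, and the loopback set is assumed to be localhost, 127.0.0.1, and ::1.

```typescript
const LOOPBACK_HOSTS = ["localhost", "127.0.0.1", "::1"]; // assumed loopback set
const WILDCARD_HOSTS = ["0.0.0.0", "::"];

// Strip the port from a Host value, handling bracketed IPv6 literals.
function hostnameOf(hostHeader: string): string {
  const value = hostHeader.trim().toLowerCase();
  if (value.startsWith("[")) {
    const end = value.indexOf("]");
    return end === -1 ? value : value.slice(1, end);
  }
  const colon = value.lastIndexOf(":");
  return colon === -1 ? value : value.slice(0, colon);
}

// Loopback hosts, plus HOST when it is not a wildcard, plus ALLOWED_HOSTS.
function buildAllowList(configuredHost: string, allowedHosts: string): Set<string> {
  const extra = allowedHosts
    .split(/[\s,]+/)
    .filter((entry) => entry.length > 0)
    .map((entry) => entry.toLowerCase());
  const allow = new Set<string>([...LOOPBACK_HOSTS, ...extra]);
  if (!WILDCARD_HOSTS.includes(configuredHost)) {
    allow.add(configuredHost.toLowerCase());
  }
  return allow;
}

// Host must match; Origin is checked against the same list only when present.
function isRequestAllowed(hostHeader: string, originHeader: string | undefined, allow: Set<string>): boolean {
  if (!allow.has(hostnameOf(hostHeader))) return false;
  if (originHeader === undefined) return true;
  try {
    return allow.has(hostnameOf(new URL(originHeader).host));
  } catch {
    return false; // malformed Origin values are rejected
  }
}
```

With HOST=0.0.0.0 and ALLOWED_HOSTS=api.example.com, a request with Host: api.example.com:3000 would pass, while the wildcard address itself never enters the list.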

Rate Limiting

Rate limiting applies to /mcp and /mcp/downloads (100 req/min per IP, 60s window). OPTIONS requests are not rate-limited.
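
For intuition, a per-IP limiter with these numbers might look like the sketch below. A fixed 60-second window is assumed here; the README gives only the limit and the window length, so the server may count differently (for example with a sliding window). FixedWindowLimiter is a hypothetical class.

```typescript
const WINDOW_MS = 60_000; // 60s window
const MAX_REQUESTS = 100; // per IP, per window

class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();

  // Returns true when the request may proceed. `now` is injectable for tests.
  allow(ip: string, method: string, now: number = Date.now()): boolean {
    if (method === "OPTIONS") return true; // OPTIONS requests are exempt

    const current = this.windows.get(ip);
    if (current === undefined || now - current.start >= WINDOW_MS) {
      // First request from this IP, or the previous window has elapsed.
      this.windows.set(ip, { start: now, count: 1 });
      return true;
    }
    current.count += 1;
    return current.count <= MAX_REQUESTS;
  }
}

const limiter = new FixedWindowLimiter();
console.log(limiter.allow("203.0.113.7", "POST"));
```

A production limiter would also need to evict idle entries so the map does not grow without bound.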


Development

Scripts

| Command | Description |
| --- | --- |
| npm run dev | Development server with hot reload |
| npm run build | Compile TypeScript |
| npm start | Production server |
| npm run lint | Run ESLint |
| npm run lint:fix | Auto-fix lint issues |
| npm run type-check | TypeScript type checking |
| npm run format | Format with Prettier |
| npm test | Run Node test runner (builds dist) |
| npm run test:coverage | Run tests with experimental coverage |
| npm run knip | Find unused exports/dependencies |
| npm run knip:fix | Auto-fix unused code |
| npm run inspector | Launch MCP Inspector |

Note: Tests run via node --test with --experimental-transform-types to execute .ts test files. Node will emit an experimental warning.

Tech Stack

| Category | Technology |
| --- | --- |
| Runtime | Node.js >=20.12 |
| Language | TypeScript 5.9 |
| MCP SDK | @modelcontextprotocol/sdk ^1.25.2 |
| Content Extraction | @mozilla/readability ^0.6.0 |
| HTML Parsing | linkedom ^0.18.12 |
| Markdown | Turndown ^7.2.2 |
| HTTP | Express ^5.2.1, undici ^6.23.0 |
| Validation | Zod ^4.3.5 |

Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Ensure linting passes: npm run lint
  4. Run tests: npm test
  5. Commit changes: git commit -m 'Add amazing feature'
  6. Push: git push origin feature/amazing-feature
  7. Open a Pull Request

For examples of other MCP servers, see: github.com/modelcontextprotocol/servers