AI Tools
v1.0.3
active

protein-mcp-server

io.github.cyanheads/protein-mcp-server

MCP Server for 3D protein structural data retrieval & analysis from RCSB PDB, PDBe, and UniProt.

Documentation

protein-mcp-server

A powerful Model Context Protocol server providing programmatic access to 3D protein structural data from RCSB PDB, PDBe, and UniProt. Features multi-provider orchestration, comprehensive structural analysis tools, and full observability. Built for performance and scalability, with native support for serverless deployment (Cloudflare Workers).



🛠️ Tools Overview & Roadmap

This server provides six tools for searching, retrieving, comparing, and analyzing protein structure data. Four are in testing and two are still in development.

| Tool Name | Status | Description |
| --- | --- | --- |
| protein_search_structures | Testing | Searches for protein structures using keywords, filters, pagination, and sorting. |
| protein_get_structure | Testing | Fetches one or more protein structures by their PDB IDs, returning either full data or concise summaries. |
| protein_find_similar | Testing | Finds proteins with similar sequence or structure. |
| protein_track_ligands | Testing | Finds protein structures containing specific ligands, cofactors, or drugs. |
| protein_compare_structures | 🟡 In Development | Performs a detailed side-by-side comparison of 2-10 protein structures. |
| protein_analyze_collection | 🟡 In Development | Performs statistical analysis on the protein structure database. |

protein_search_structures

Search and discover protein structures from the Protein Data Bank (PDB) using a wide range of criteria.

Key Features:

  • Free-text search for protein names, keywords, or PDB IDs.
  • Filter by source organism, experimental method, and resolution.
  • Pagination support for navigating large result sets.
  • Returns rich metadata including title, organism, method, and resolution.

Example Use Cases:

  • "Find all human kinase structures with resolution better than 2.0 Å"
  • "Show me all cryo-EM structures of the SARS-CoV-2 spike protein"
  • "List structures of hemoglobin from Escherichia coli"
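
As a rough sketch of how a client might invoke this tool, the snippet below builds an MCP `tools/call` request for the first use case. The JSON-RPC envelope follows the MCP specification; the argument names (`query`, `organism`, `maxResolution`, `limit`, `offset`) are assumptions for illustration, so check the tool's published input schema for the real ones.

```typescript
// JSON-RPC 2.0 request shape used by MCP's `tools/call` method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical argument names; the server's actual input schema may differ.
interface SearchArgs {
  query: string;
  organism?: string;
  experimentalMethod?: string;
  maxResolution?: number; // upper bound in Å (lower values mean better resolution)
  limit?: number; // page size
  offset?: number; // number of results to skip
}

// Wrap the search arguments in a tools/call request for this tool.
function buildSearchRequest(id: number, args: SearchArgs): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "protein_search_structures", arguments: { ...args } },
  };
}

// "Find all human kinase structures with resolution better than 2.0 Å"
const request = buildSearchRequest(1, {
  query: "kinase",
  organism: "Homo sapiens",
  maxResolution: 2.0,
  limit: 25,
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library would send this request for you; the sketch only shows which pieces of the natural-language query map to structured filters.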

protein_get_structure

Retrieve detailed information for specific protein structures by their PDB ID.

Key Features:

  • Fetch single or multiple structures by their 4-character PDB ID.
  • Choose between different data formats: mmCIF (default), PDB, PDBML, or JSON.
  • Selectively include or exclude 3D coordinates, experimental data, and functional annotations.
  • Provides access to atomic coordinates, chain information, and experimental details like R-factors and unit cell parameters.

Example Use Cases:

  • "Get the full structure data for PDB ID 1ABC in mmCIF format"
  • "Show me the metadata and chain information for 2GBP, but exclude the coordinates"
  • "What were the experimental method and resolution for structure 6M0J?"
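
Because the tool accepts one or more 4-character PDB IDs, a caller may want to clean up a list of IDs before sending it. The following is a minimal sketch, assuming the classic PDB ID format of one digit followed by three alphanumeric characters; `normalizePdbIds` is a hypothetical helper, not part of this server.

```typescript
// Classic PDB IDs: one digit followed by three alphanumeric characters (assumed format).
const PDB_ID_PATTERN = /^[0-9][A-Za-z0-9]{3}$/;

// Trim, upper-case, de-duplicate, and separate well-formed IDs from everything else.
function normalizePdbIds(raw: string[]): { valid: string[]; invalid: string[] } {
  const valid: string[] = [];
  const invalid: string[] = [];
  for (const entry of raw) {
    const id = entry.trim().toUpperCase();
    if (PDB_ID_PATTERN.test(id)) {
      if (!valid.includes(id)) valid.push(id); // skip duplicates
    } else {
      invalid.push(entry); // keep the original text for error reporting
    }
  }
  return { valid, invalid };
}

const checked = normalizePdbIds([" 2gbp", "6M0J", "2GBP", "hemoglobin"]);
console.log(checked.valid); // well-formed, upper-cased, de-duplicated IDs
console.log(checked.invalid); // entries that do not look like PDB IDs
```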

protein_compare_structures

Compare and contrast multiple protein structures to analyze conformational changes and structural relationships.

Key Features:

  • Side-by-side comparison of 2 to 10 structures.
  • Utilizes standard alignment algorithms like CEAlign and TM-Align.
  • Calculates key metrics including RMSD, TM-score, and sequence identity.
  • Can optionally generate a visualization script for PyMOL or ChimeraX.

Example Use Cases:

  • "Compare the active site conformations of HIV protease in structures 1HVR and 1HVS"
  • "Align structures 2GBP and 3AXO and report the RMSD"
  • "Analyze the conformational differences between the open and closed states of a protein"
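
To make the RMSD metric concrete, here is a small sketch that computes it for two coordinate sets that have already been superimposed and paired atom-for-atom. It is not the server's implementation: alignment algorithms such as CEAlign and TM-Align first find the superposition and the atom pairing, which this example takes as given.

```typescript
type Vec3 = [number, number, number];

// RMSD = sqrt( (1/N) * sum_i |a_i - b_i|^2 ) over N paired atoms.
// Assumes both sets are already superimposed and listed in matching order.
function rmsd(a: Vec3[], b: Vec3[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Coordinate sets must be non-empty and of equal length");
  }
  const sumSquares = a.reduce((acc, p, i) => {
    const q = b[i] as Vec3;
    const dx = p[0] - q[0];
    const dy = p[1] - q[1];
    const dz = p[2] - q[2];
    return acc + dx * dx + dy * dy + dz * dz;
  }, 0);
  return Math.sqrt(sumSquares / a.length);
}

// Two atoms, each displaced by a vector of length 5 Å (3-4-5 triangle).
const reference: Vec3[] = [[0, 0, 0], [1, 1, 1]];
const displaced: Vec3[] = [[3, 4, 0], [4, 5, 1]];
console.log(rmsd(reference, displaced)); // 5
```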

protein_find_similar

Discover structurally or sequentially related proteins based on a query.

Key Features:

  • Similarity search by sequence (like BLAST) or structure (like DALI).
  • Use a PDB ID, a FASTA sequence, or raw structure data as the query.
  • Set thresholds for sequence identity, E-value, TM-score, or RMSD to refine results.
  • Identifies homologous proteins, recognizes structural folds, and supports evolutionary analysis.

Example Use Cases:

  • "Find proteins structurally similar to PDB ID 1ABC"
  • "What proteins have a sequence identity greater than 90% to this FASTA sequence?"
  • "Discover other proteins with a similar fold to my query structure"
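
The sequence identity threshold can be illustrated with a short sketch. It assumes the two sequences are already aligned (equal length, `-` for gaps) and counts identity over columns where neither sequence has a gap; search tools differ in how they choose the denominator, so treat this as one possible definition rather than the one this server uses.

```typescript
const GAP = "-";

// Fraction of gap-free alignment columns where both residues match (0 to 1).
function sequenceIdentity(alignedA: string, alignedB: string): number {
  if (alignedA.length !== alignedB.length) {
    throw new Error("Aligned sequences must have equal length");
  }
  let compared = 0;
  let matches = 0;
  for (let i = 0; i < alignedA.length; i++) {
    const a = alignedA.charAt(i).toUpperCase();
    const b = alignedB.charAt(i).toUpperCase();
    if (a === GAP || b === GAP) continue; // skip gapped columns
    compared++;
    if (a === b) matches++;
  }
  return compared === 0 ? 0 : matches / compared;
}

// Three of four residues match, so identity is 0.75 and a 90% threshold is not met.
const identity = sequenceIdentity("ACDE", "ACDF");
console.log(identity, identity > 0.9); // 0.75 false
```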

protein_track_ligands

Identify protein structures that bind to specific small molecules, such as drugs, inhibitors, or cofactors.

Key Features:

  • Search for ligands by common name, chemical ID, or SMILES string.
  • Filter results by the bound protein's name, organism, or experimental method.
  • Optionally include details of the binding site, including interacting residues.
  • Essential for drug discovery, pharmacology, and molecular docking workflows.

Example Use Cases:

  • "Find all human protein structures that bind to ATP"
  • "Show me structures of Cyclin-dependent kinase 2 in complex with an inhibitor"
  • "What are the binding site residues for glucose in hexokinase?"

protein_analyze_collection

Perform statistical analysis on the entire Protein Data Bank to uncover trends and distributions.

Key Features:

  • Aggregate data based on fold classification, function, organism, or experimental method.
  • Apply filters to narrow the analysis to specific subsets of the database.
  • Group results by a secondary dimension (e.g., year) to visualize trends over time.

Example Use Cases:

  • "What are the most common structural folds found in membrane proteins?"
  • "Show a yearly trend of the number of structures determined by cryo-EM"
  • "Which organisms are most represented in the PDB for the years 2020-2023?"
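
The idea of aggregating by one dimension and grouping by a second can be sketched as a nested count. The records below are made-up placeholders, not PDB statistics, and `countByMethodAndYear` is a hypothetical helper shown only to clarify the shape of the result.

```typescript
interface Entry {
  method: string; // primary dimension, e.g. experimental method
  year: number; // secondary dimension
}

// Count entries per method, then per year within each method.
function countByMethodAndYear(entries: Entry[]): Map<string, Map<number, number>> {
  const result = new Map<string, Map<number, number>>();
  for (const { method, year } of entries) {
    const byYear = result.get(method) ?? new Map<number, number>();
    byYear.set(year, (byYear.get(year) ?? 0) + 1);
    result.set(method, byYear);
  }
  return result;
}

// Placeholder records for illustration only.
const sample: Entry[] = [
  { method: "cryo-EM", year: 2020 },
  { method: "cryo-EM", year: 2021 },
  { method: "cryo-EM", year: 2021 },
  { method: "X-ray", year: 2021 },
];

const trend = countByMethodAndYear(sample);
console.log(trend.get("cryo-EM")); // Map(2) { 2020 => 1, 2021 => 2 }
```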

✨ Features

This server is built on the mcp-ts-template and inherits its rich feature set:

  • Declarative Tools: Define agent capabilities in single, self-contained files. The framework handles registration, validation, and execution.
  • Robust Error Handling: A unified McpError system ensures consistent, structured error responses.
  • Pluggable Authentication: Secure your server with zero-fuss support for none, jwt, or oauth modes.
  • Abstracted Storage: Swap storage backends (in-memory, filesystem, Supabase, Cloudflare KV/R2) without changing business logic.
  • Full-Stack Observability: Deep insights with structured logging (Pino) and optional, auto-instrumented OpenTelemetry for traces and metrics.
  • Dependency Injection: Built with tsyringe for a clean, decoupled, and testable architecture.
  • Edge-Ready: Write code once and run it seamlessly on your local machine or at the edge on Cloudflare Workers.

🚀 Getting Started

MCP Client Settings/Configuration

Add the following to your MCP Client configuration file (e.g., cline_mcp_settings.json).

{
  "mcpServers": {
    "protein-mcp-server": {
      "command": "bunx",
      "args": ["protein-mcp-server@latest"],
      "env": {
        "MCP_LOG_LEVEL": "info"
      }
    }
  }
}

Prerequisites

Installation

  1. Clone the repository:

    git clone https://github.com/cyanheads/protein-mcp-server.git

  2. Navigate into the directory:

    cd protein-mcp-server

  3. Install dependencies:

    bun install

⚙️ Configuration

All configuration is centralized and validated at startup in src/config/index.ts. Key environment variables in your .env file include:

| Variable | Description | Default |
| --- | --- | --- |
| MCP_TRANSPORT_TYPE | The transport to use: stdio or http. | http |
| MCP_HTTP_PORT | The port for the HTTP server. | 3010 |
| MCP_AUTH_MODE | Authentication mode: none, jwt, or oauth. | none |
| STORAGE_PROVIDER_TYPE | Storage backend: in-memory, filesystem, supabase, cloudflare-kv, r2. | in-memory |
| PROTEIN_PRIMARY_PROVIDER | The primary data source for protein data. | rcsb |
| OTEL_ENABLED | Set to true to enable OpenTelemetry. | false |
| LOG_LEVEL | The minimum level for logging. | info |
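
To show what startup validation with these defaults can look like, here is a dependency-free sketch covering a few of the variables. It does not reproduce src/config/index.ts, which uses Zod; `loadConfig` and `oneOf` are hypothetical names, and only the variable names, allowed values, and defaults come from the table above.

```typescript
type TransportType = "stdio" | "http";
type AuthMode = "none" | "jwt" | "oauth";

interface ServerConfig {
  transport: TransportType;
  httpPort: number;
  authMode: AuthMode;
  otelEnabled: boolean;
}

// Return the fallback when unset, the value when allowed, and throw otherwise.
function oneOf<T extends string>(
  name: string,
  value: string | undefined,
  allowed: readonly T[],
  fallback: T,
): T {
  if (value === undefined || value === "") return fallback;
  if ((allowed as readonly string[]).includes(value)) return value as T;
  throw new Error(`${name} must be one of: ${allowed.join(", ")}`);
}

// Parse and validate a subset of the environment, applying the documented defaults.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const port = Number(env.MCP_HTTP_PORT ?? "3010");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("MCP_HTTP_PORT must be an integer between 1 and 65535");
  }
  return {
    transport: oneOf("MCP_TRANSPORT_TYPE", env.MCP_TRANSPORT_TYPE, ["stdio", "http"] as const, "http"),
    httpPort: port,
    authMode: oneOf("MCP_AUTH_MODE", env.MCP_AUTH_MODE, ["none", "jwt", "oauth"] as const, "none"),
    otelEnabled: env.OTEL_ENABLED === "true",
  };
}

// With nothing set, every field falls back to its default.
const defaults = loadConfig({});
console.log(defaults);
```

Failing fast on an invalid value at startup keeps misconfiguration from surfacing later as a confusing runtime error.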

▶️ Running the Server

Local Development

  • Build and run the production version:

    # One-time build
    bun rebuild
    
    # Run the built server
    bun start:http
    # or
    bun start:stdio
    
  • Run checks and tests:

    bun devcheck # Lints, formats, type-checks, and more
    bun test # Runs the test suite
    

Cloudflare Workers

  1. Build the Worker bundle:

    bun build:worker

  2. Run locally with Wrangler:

    bun deploy:dev

  3. Deploy to Cloudflare:

    bun deploy:prod

Note: The wrangler.toml file is pre-configured to enable nodejs_compat for best results.

📂 Project Structure

| Directory | Purpose & Contents |
| --- | --- |
| src/mcp-server/tools/definitions | Your tool definitions (*.tool.ts). This is where you add new capabilities. |
| src/mcp-server/resources/definitions | Your resource definitions (*.resource.ts). This is where you add new data sources. |
| src/services/protein | Orchestration and provider logic for protein data sources (RCSB, PDBe). |
| src/storage | The StorageService abstraction and all storage provider implementations. |
| src/container | Dependency injection container registrations and tokens. |
| src/utils | Core utilities for logging, error handling, performance, security, and telemetry. |
| src/config | Environment variable parsing and validation with Zod. |
| tests/ | Unit and integration tests, mirroring the src/ directory structure. |

🧑‍💻 Agent Development Guide

For a strict set of rules when using this template with an AI agent, please refer to AGENTS.md. Key principles include:

  • Logic Throws, Handlers Catch: Never use try/catch in your tool/resource logic. Throw an McpError instead.
  • Use Elicitation for Missing Input: If a tool requires user input that wasn't provided, use the elicitInput function from the SdkContext to ask the user for it.
  • Pass the Context: Always pass the RequestContext object through your call stack.
  • Use the Barrel Exports: Register new tools and resources only in the index.ts barrel files.
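
The "Logic Throws, Handlers Catch" principle can be sketched as follows. `SketchMcpError`, `summarizeLogic`, and `handle` are stand-ins written for this example; the template's real McpError class, its error codes, and its handler wiring may look different.

```typescript
// Stand-in for the template's McpError; the real constructor signature may differ.
class SketchMcpError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.name = "SketchMcpError";
    this.code = code;
  }
}

// Tool logic: no try/catch here, it simply throws a structured error on bad input.
function summarizeLogic(input: { pdbId?: string }): string {
  if (!input.pdbId) {
    throw new SketchMcpError("INVALID_PARAMS", "pdbId is required");
  }
  return `Summary requested for ${input.pdbId.toUpperCase()}`;
}

type ToolResult = { isError: boolean; text: string };

// Handler: the single place where errors are caught and turned into responses.
function handle<I>(logic: (input: I) => string, input: I): ToolResult {
  try {
    return { isError: false, text: logic(input) };
  } catch (err) {
    if (err instanceof SketchMcpError) {
      return { isError: true, text: `${err.code}: ${err.message}` };
    }
    return { isError: true, text: "INTERNAL_ERROR: unexpected failure" };
  }
}

console.log(handle(summarizeLogic, { pdbId: "6m0j" }));
console.log(handle(summarizeLogic, {}));
```

Keeping the catch in one place means every tool reports failures in the same structured shape, and the logic functions stay easy to unit test.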

🤝 Contributing

Issues and pull requests are welcome! If you plan to contribute, please run the local checks and tests before submitting your PR.

bun run devcheck
bun test

📜 License

This project is licensed under the Apache 2.0 License. See the LICENSE file for details.

NPM
protein-mcp-server
Install Command
npm install protein-mcp-server
Runtime: bun