AI Tools
v0.1.0-alpha.3
active

codeweaver

com.knitli/codeweaver

Semantic code search built for AI agents. Hybrid, AST-aware, context for 166 languages.


CodeWeaver

The missing abstraction layer between AI and your code


Installation • Features • How It Works • Documentation • Contributing


🎯 What is CodeWeaver?

CodeWeaver gives both humans and AI a deep, structural understanding of your project: not just text search, but real context, including symbols, blocks, relationships, and intent. MCP is just the delivery mechanism; CodeWeaver is the capability.

If you want AI that actually knows your code instead of guessing, this is the foundation.

⚠️ Alpha Release: CodeWeaver is in active development. Use it, break it, shape it, help make it better.


🔍 Why CodeWeaver Exists

The Problems

| Problem | Impact |
|---|---|
| 🔴 Poor Context = Poor Results | Agents are better at generating new code than understanding existing structure |
| 💸 Massive Inefficiency | Agents read the same huge files repeatedly (50%+ context waste is common) |
| 🔧 Wrong Abstraction | Tools built for humans, not for how agents actually work |
| 🔒 No Ownership | Existing solutions locked into specific IDEs or agent clients like Claude Code |

The result: Shallow, inconsistent, fragile context. And you don't control it.

CodeWeaver's Approach

✅ One focused capability: Structural + semantic code understanding
✅ Hybrid search built for code, not text
✅ Works offline, airgapped, or degraded
✅ Deploy it however you want
✅ One great tool instead of 30 mediocre ones

📖 Read the detailed rationale →


🚀 Getting Started

Quick Install

Using the CLI with uv:

# Add CodeWeaver to your project
uv add --prerelease allow --dev code-weaver

# Initialize config and MCP setup
cw init

# Verify setup
cw doctor

# Start the server
cw server

📝 Note: cw init defaults to CodeWeaver's recommended profile, which requires:

🐳 Prefer Docker? See Docker setup guide →

MCP Configuration

CodeWeaver uses stdio transport by default, which proxies to the HTTP backend daemon. Start the daemon first with codeweaver start; MCP clients can then connect via stdio.

cw init will add CodeWeaver to your project's .mcp.json:

{
  "mcpServers": {
    "codeweaver": {
      "type": "stdio",
      "cmd": "uv",
      "args": ["run", "codeweaver", "server"],
      "env": {"SOME_API_KEY_FOR_PROVIDERS": "value"}
    }
  }
}

To connect to the daemon over HTTP directly instead, the entry looks like this:

{
  "mcpServers": {
    "codeweaver": {
      "type": "http",
      "url": "http://127.0.0.1:9328/mcp"
    }
  }
}
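
If you manage .mcp.json yourself, the merge that an init step has to perform can be sketched in a few lines of Python. This is an illustration only, not how cw init is implemented: the helper names add_server and update_file, and the pre-existing "other-tool" entry, are made up for the example. The CodeWeaver entry is copied from the stdio configuration above, minus the env block.

```python
import json
from pathlib import Path

# Entry copied from the stdio example above; the env block is omitted here.
CODEWEAVER_STDIO = {
    "type": "stdio",
    "cmd": "uv",
    "args": ["run", "codeweaver", "server"],
}


def add_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of an .mcp.json document with one server added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_file(path: Path, name: str, entry: dict) -> None:
    """Read path if it exists, merge the entry, and write the result back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_server(current, name, entry), indent=2) + "\n")


# Hypothetical existing config with an unrelated server already registered.
existing = {"mcpServers": {"other-tool": {"type": "http", "url": "http://127.0.0.1:8000/mcp"}}}
updated = add_server(existing, "codeweaver", CODEWEAVER_STDIO)
print(json.dumps(updated, indent=2))
```

Other servers already present in the file are preserved, and the input dictionary is left untouched.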


✨ Features

🧠 Smart Search

  • Hybrid search (sparse + dense)
  • AST-level understanding
  • Semantic relationships
  • Context-aware chunking
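
To make "AST-level" chunking concrete, here is a minimal sketch that splits a Python file into one chunk per top-level function or class. It uses Python's standard ast module as a stand-in; CodeWeaver's own parsers and chunk format are not described in this README, so the Chunk type and chunk_python function are hypothetical.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    kind: str   # "function" or "class"
    name: str
    start: int  # 1-based first line
    end: int    # 1-based last line, inclusive
    text: str


def chunk_python(source: str) -> list[Chunk]:
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue  # imports, assignments, etc. are skipped in this sketch
        # Start at the first decorator, if any, so the chunk is self-contained.
        start = min([node.lineno] + [d.lineno for d in node.decorator_list])
        end = node.end_lineno
        chunks.append(Chunk(kind, node.name, start, end, "\n".join(lines[start - 1:end])))
    return chunks


SAMPLE = '''import os

def login(user, password):
    token = issue_token(user)
    return token

class Session:
    def close(self):
        pass
'''

for chunk in chunk_python(SAMPLE):
    print(chunk.kind, chunk.name, chunk.start, chunk.end)
```

Unlike fixed-size text windows, each chunk here is a complete definition, which is the property that makes retrieved fragments usable without reading the whole file.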

🌐 Language Support

  • 26 languages with full AST/semantic
  • 166+ languages with intelligent chunking
  • Family heuristics for smart parsing

🔄 Resilient & Offline

  • Automatic fallback to local models
  • Works offline/airgapped
  • Health monitoring with graceful degradation
  • Better degraded than others' primary mode
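
The fallback idea can be sketched as an ordered provider chain: try the preferred embedding provider, and drop to a local one if it is unreachable. Everything below is hypothetical (the ProviderUnavailable error, the embed_with_fallback helper, and the two toy embedders); it shows the general pattern, not CodeWeaver's internals.

```python
from collections.abc import Callable, Sequence

Embedder = Callable[[str], list[float]]


class ProviderUnavailable(Exception):
    """Raised by a provider that cannot be reached (hypothetical error type)."""


def embed_with_fallback(
    text: str, providers: Sequence[tuple[str, Embedder]]
) -> tuple[str, list[float]]:
    """Try each provider in order; return (name, vector) from the first that works."""
    errors = []
    for name, embed in providers:
        try:
            return name, embed(text)
        except ProviderUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def remote_embed(text: str) -> list[float]:
    # Stand-in for a cloud API call; here it always fails, as if offline.
    raise ProviderUnavailable("network unreachable")


def local_embed(text: str) -> list[float]:
    # Stand-in for a local model: a toy 2-dimensional vector.
    return [float(len(text)), float(text.count(" "))]


used, vector = embed_with_fallback(
    "find auth code", [("remote", remote_embed), ("local", local_embed)]
)
print(used, vector)
```

Returning the provider name alongside the vector lets the caller record that it is running in degraded mode.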

⚙️ Flexible Configuration

  • ~15 config sources (TOML/YAML/JSON)
  • Cloud secret stores (AWS/Azure/GCP)
  • Hierarchical merging
  • Environment overrides
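
Hierarchical merging usually means that later sources override earlier ones key by key, without discarding sibling keys. A minimal sketch of that behavior follows; the layer names and config keys are invented for illustration and are not CodeWeaver's actual schema or precedence order.

```python
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Merge override into base recursively; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve(*layers: dict[str, Any]) -> dict[str, Any]:
    """Merge layers from lowest to highest precedence."""
    result: dict[str, Any] = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Hypothetical layers; the key names are made up for this example.
defaults = {"server": {"host": "127.0.0.1", "port": 9328}, "telemetry": {"enabled": True}}
project_file = {"server": {"port": 9400}}
environment = {"telemetry": {"enabled": False}}

config = resolve(defaults, project_file, environment)
print(config)
```

Note that the project file changes only server.port; server.host survives from the defaults, which is the difference between a deep merge and a plain dictionary update.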

🔌 Provider Support

🛠️ Developer Experience

  • Live indexing with file watching
  • Low CPU overhead
  • Full CLI (cw / codeweaver)
  • Health, metrics, status endpoints

πŸ—οΈ How It Works

CodeWeaver combines AST-level understanding, semantic relationships, and hybrid embeddings (sparse + dense) to deliver both contextual and literal understanding of your codebase.

The goal: give AI the fragments it should see, not whatever it can grab.
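
One common way to combine a sparse (keyword) ranking with a dense (embedding) ranking is reciprocal rank fusion (RRF). The README does not say which fusion method CodeWeaver uses, so treat the sketch below as an example of the general technique with made-up file names, not a description of the implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one list, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending, breaking ties by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from a sparse and a dense retriever for the same query.
sparse_hits = ["auth/login.py", "auth/tokens.py", "docs/auth.md"]
dense_hits = ["auth/tokens.py", "auth/session.py", "auth/login.py"]

fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
for doc_id, score in fused:
    print(f"{score:.4f}  {doc_id}")
```

Documents that rank well in both lists rise to the top, while a document found by only one retriever still appears, just lower down.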

Architecture Highlights

┌─────────────────────────────────────────────────────────┐
│                    Your Codebase                        │
└────────────────┬────────────────────────────────────────┘
                 │
                 ▼
        ┌─────────────────┐
        │  Live Indexing  │ ← AST parsing + semantic analysis
        └────────┬────────┘
                 │
                 ▼
    ┌─────────────────────────┐
    │   Hybrid Vector Store   │ ← Sparse + Dense embeddings
    └────────┬────────────────┘
             │
             ▼
    ┌───────────────────┐
    │  Reranking Layer  │ ← Relevance optimization (heuristic and reranking model)
    └────────┬──────────┘
             │
             ▼
    ┌───────────────────┐
    │   MCP Interface   │ ← Simple "find_code" tool (`find_code("authentication api")`)
    └────────┬──────────┘
             │
             ▼
       ┌─────────┐
       │   AI    │
       └─────────┘

CLI Commands

cw start     # Start daemon in background (or --foreground)
cw stop      # Stop the daemon
cw server    # Run the MCP server (stdio by default)
cw doctor    # Full setup diagnostic
cw index     # Run indexing without server
cw init      # Set up MCP + config
cw list      # List providers, models, capabilities
cw status    # Live server status, health, index state
cw search    # Test the search engine
cw config    # View resolved configuration

Running as a System Service

Install CodeWeaver to start automatically on login:

cw init service          # Install and enable (systemd/launchd)
cw init service --uninstall  # Remove the service

📖 Full CLI Guide →


📊 Current Status (Alpha)

Stability Snapshot: Strong Core, Prickly Edges

| Component | Status | Notes |
|---|---|---|
| 🔄 Live indexing & file watching | ⭐⭐⭐⭐ | Runs continuously; reliable |
| 🌳 AST-based chunking | ⭐⭐⭐⭐ | Full semantic/AST for 26 languages |
| 📏 Context-aware chunking | ⭐⭐⭐⭐ | 166+ languages, heuristic AST-lite |
| 🔌 Provider integration | ⭐⭐⭐ | Voyage/FastEmbed reliable, others vary |
| 🛡️ Automatic fallback | ⭐⭐⭐ | Seamless offline/degraded mode |
| 💻 CLI | ⭐⭐⭐⭐ | Core commands fully wired and tested |
| 🐳 Docker build | ⭐⭐⭐ | Skip local Qdrant setup entirely |
| 🔗 MCP interface | ⭐⭐⭐ | Core ops reliable, some edge cases |
| 🌐 HTTP endpoints | ⭐⭐⭐ | Health, metrics, state, versions stable |

Legend: ⭐⭐⭐⭐ = solid | ⭐⭐⭐ = works with quirks | ⭐⭐ = experimental | ⭐ = chaos gremlin


🗺️ Roadmap

The enhancement issues describe detailed plans. Short version:

  • 📚 Way better docs – comprehensive guides and tutorials
  • 🤖 AI-powered context curation – agents identify purpose and intent
  • 🔧 Data provider integration – Tavily, DuckDuckGo, Context7, and more
  • 💉 True DI system – replace existing registry
  • 🕸️ Advanced orchestration – integrate pydantic-graph

What Will Stay: One Tool

One tool. We give AI agents one simple tool: find_code.

Agents just need to explain what they need. No complex schemas. No novella-length prompts.
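
At the protocol level, calling a single tool is one JSON-RPC request. The sketch below builds an MCP tools/call request for find_code. The tools/call method and its name/arguments parameters come from the MCP specification; the argument name "query" is an assumption, so check the schema the server actually publishes.

```python
import json
from itertools import count

_request_ids = count(1)


def find_code_request(query: str) -> dict:
    """Build an MCP tools/call request for the find_code tool.

    The argument name "query" is an assumption, not taken from CodeWeaver's schema.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": "find_code", "arguments": {"query": query}},
    }


request = find_code_request("authentication api")
print(json.dumps(request, indent=2))
```

In practice an MCP client library sends this for you; the point is that the agent supplies a plain description of what it needs and nothing more.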


📚 Documentation

For Users

For Developers

Product Philosophy


🤝 Contributing

PRs, issues, weird edge cases, feature requests: all welcome!

This is still early, and the best time to help shape the direction.

How to Contribute

  1. 🍴 Fork the repository
  2. 🌿 Create a feature branch
  3. ✨ Make your changes
  4. ✅ Add tests if applicable
  5. 📝 Update documentation
  6. 🚀 Submit a PR

You'll need to agree to our Contributor License Agreement.

Found a Bug?

πŸ› Report it here – include as much detail as possible!


πŸ”— Links

Project

Company

Support the Project

We're a one-person company at the moment... and make no money... if you like CodeWeaver and want to keep it going, please consider sponsoring me 😄


📦 Package Info

  • Python package: code-weaver 👈❗ note the hyphen
  • CLI commands: cw / codeweaver
  • Python requirement: ≥3.12 (tested on 3.12, 3.13, 3.14)
  • Entry point: codeweaver.cli.app:main

📄 License

Licensed under MIT OR Apache 2.0: you choose! Some vendored code is Apache 2.0 only and some is MIT only. Everything is permissively licensed.

The project follows the REUSE specification. Every file has detailed licensing information, and we regularly generate a software bill of materials.


📊 Telemetry

By default, CodeWeaver collects anonymized telemetry to help improve the project. See the implementation or read the README.

Opt out: export CODEWEAVER__TELEMETRY__DISABLE_TELEMETRY=true

Opt in to detailed feedback (helps us improve): export CODEWEAVER__TELEMETRY__TOOLS_OVER_PRIVACY=true
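
The double underscores in these variable names look like a nesting delimiter, a convention used by settings libraries such as pydantic-settings. Assuming that is how they are read (the README does not confirm it), the mapping from environment variables to nested config keys can be sketched as follows. The function names and the lowercase key normalization are illustrative choices.

```python
from typing import Any

PREFIX = "CODEWEAVER__"


def parse_value(raw: str) -> Any:
    """Map common true/false spellings to bool; leave anything else as a string."""
    lowered = raw.strip().lower()
    if lowered in {"1", "true", "yes", "on"}:
        return True
    if lowered in {"0", "false", "no", "off"}:
        return False
    return raw


def env_to_overrides(environ: dict[str, str]) -> dict[str, Any]:
    """Turn CODEWEAVER__A__B=value entries into {"a": {"b": value}}."""
    overrides: dict[str, Any] = {}
    for key, raw in environ.items():
        if not key.startswith(PREFIX):
            continue  # unrelated variables are ignored
        parts = [p.lower() for p in key[len(PREFIX):].split("__") if p]
        if not parts:
            continue
        node = overrides
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = parse_value(raw)
    return overrides


sample_env = {
    "CODEWEAVER__TELEMETRY__DISABLE_TELEMETRY": "true",
    "PATH": "/usr/bin",
}
print(env_to_overrides(sample_env))
```

Passing a plain dictionary instead of reading os.environ directly keeps the function easy to test; in real use you would call env_to_overrides(dict(os.environ)).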

📋 See our privacy policy


⚠️ API Stability

Warning: The API will change. Our priority right now is giving you and your coding agent an awesome tool.

To deliver on that, we can't get locked into API contracts while we're in alpha. We also want you to be able to extend and build on CodeWeaver, once we get to stable releases.


Built with ❤️ by Knitli


PYPI
code-weaver
Install Command
pip install code-weaver
Runtime: uvx
OCI
docker.io/knitli/codeweaver:0.1.0-alpha.3
Install Command
docker pull docker.io/knitli/codeweaver:0.1.0-alpha.3
Runtime: docker