SHIPPED: CRM + BASEBALL EXAMPLES

Mycelium

Query-driven. Agent-native. Growing.

Open-source framework for agent-native data networks: scoped ecosystems where a supervisor routes lookups to specialist agents. MCP and CLI; CRM and Lahman baseball reference examples; custom networks via network create.

What ships today

Open-source toolkit for data your AI agents can query and enrich. Define a domain, connect an MCP client, ask for attributes — specialists research on miss and cache what they learn. Two full reference networks ship today: a person CRM and historic baseball stats.

814 automated tests on the framework repo. Early prototype — expect rough edges.

Ask, don’t ingest

Look up a contact, player, or team with simple key–value fields. No SQL and no public write API — you query, the network assembles an answer from what it already knows (and what it can find).

Answers get smarter over time

Repeat questions hit cache and come back fast. The first time you ask for something missing, background agents search the web, store the result, and serve it on the next ask.

Two-step lookups

Step one finds the right record (or tells you what’s ambiguous). Step two returns the fields you wanted. Built for real agent flows — clarify, retry, or confirm before you commit.

Plug in your LLM client

MCP server with describe_network, query_entity, and health_check. Connect Claude Desktop or any MCP client; one long-lived process per network.

CRM examples

Seeded contact list, a blank network that grows from the first lookup, and a demo that quotes a price before running paid research.

Baseball example

Historic MLB data (Lahman): look up players and teams, pull career stats, compute batting and pitching numbers on demand, trace where a value came from.

Build your own network

Scaffold a new domain with network create — categories, specialists, optional seed file. Copy-paste walkthroughs in the repo; local admin UI to watch records fill in.

Try it

Clone the framework, bootstrap a reference network, and run the two-step query protocol.

bash
git clone https://github.com/myceliumdata/mycelium.git
cd mycelium && uv sync --all-extras
cp .env.example .env   # OPENAI_API_KEY required
./bin/refresh-example-network crm-seeded --yes
uv run mycelium query --network crm-seeded \
  --lookup-json '{"name":"Nichanan Kesonpat","employer":"1k(x)"}'
# copy delivery_id from JSON, then:
uv run mycelium query --network crm-seeded --delivery-id d_…

New contributors: onboarding example walkthroughs architecture

Example Command Story
crm-seeded ./bin/refresh-example-network crm-seeded --yes 15-person bootstrap → entities.json; fuzzy lookup + contact research
crm-empty ./bin/refresh-example-network crm-empty --yes No seed — empty registry until first query bind creates the row
crm-metering ./bin/refresh-example-network crm-metering --yes Priced research negotiation (quote_required → pay_quote → deliver)
baseball ./bin/refresh-example-network baseball --yes Lahman warehouse (~3–4 min); player + team; derive + provenance; gate 34/34

Custom network: network create <name> --root <path> [--seed <file>]--seed is optional; an empty registry plus first-query bind is valid.

How a query works

From network bootstrap to cached specialist results — same protocol across CRM and baseball.

1

Refresh or create a network

Copy an example (./bin/refresh-example-network) or run network create with optional --seed. crm-empty needs no fixture — first bind writes entities.json. Baseball bootstraps a full Lahman warehouse (~3–4 min).

2

Connect an agent (MCP)

One server entry per network. Call describe_network at connect time — response includes guide.md, ontology categories, record types, and framework usage policy.

3

Two-step query

Step 1: lookup JSON (or registry id) → delivery_id. Step 2: delivery_id → assembled results. Supervisor classifies fields, routes to specialists, runs research or derive on cache miss.

crm-seeded

Lookup Andrea Kalmans → step 2 requests email → contact specialist researches once → cached on repeat.

crm-empty

First bind for Paul Murphy @ Acme Corp creates the registry row. The network grows from zero without bootstrap people.

baseball

Lookup Hank Aaroncareer_hr from warehouse manifest or derive-on-miss → provenance shows computation lineage.

For agents (MCP)

Built for LLM client integrators. One long-lived MCP server per network; call describe_network at connect time. query_entity uses the two-step protocol — lookup-only, no caller ingest payload.

describe_networkquery_entityhealth_check

query_entity — step 1 (resolve)

json
{
  "lookup": {"employer": "645 Ventures"},
  "requested_attributes": ["email"],
  "thread_id": "optional-session-id"
}

query_entity — step 2 (deliver)

json
{
  "delivery_id": "d_abc123"
}

Step 2 response snippet

json
{
  "outcome": "assembled",
  "suggestions": [],
  "results": [
    {
      "id": "3fe6db14-a41d-50fe-9959-c5263dc5f53b",
      "name": "Andrea Kalmans",
      "employer": "Lontra Ventures",
      "email": "[email protected]"
    }
  ],
  "message": "Found record for Andrea Kalmans; assembled from registry and specialist contributions."
}

Branch on outcome after each step. Step 1: lookup_resolved, lookup_incomplete, lookup_suggested, quote_required. Step 2: found, assembled, not_found.

MCP setup + Claude Desktop snippet: docs/examples/getting-started

Under the hood

Supervisor routing, specialist research and derive, warehouse-backed stats, query-only public surface.

Entity registry and multi-record-type routing — canonical JSON identity

Registry + routing

Canonical entities.json (UUID + bind keys) — not SQLite. Supervisor routes by record type (person, player, team) and MVR lookup fields.

Specialist collectives: research, warehouse stats, and derive-on-miss

Specialist collectives

Generated agents per category. Research on miss (LLM + search); warehouse manifest stats; derive-on-miss with computation-centric provenance.

Structured agent API: two-step QueryResponse with outcomes and metering

Structured agent API

MCP and CLI return QueryResponse with outcome, delivery, suggestions, results, and message. Metering negotiation on crm-metering demo network.

Query graph (simplified)

CLI or MCP supervisor resolve / deliver invoke_specialists assemble_response
entities.json warehouse tables derive on miss
Direction: not all shipped yet

Vision

Today's data sources are rigid, manually maintained, and lag behind the speed of agentic AI systems.

Mycelium aims to change that.

Agents bind and enrich entities; operators steer via network guides and validation rules. CRM and baseball examples prove the pattern across person networks and warehouse-backed stats. The long-term goal is data infrastructure where AI agents take primary ownership of organization, quality, and evolution.

"Mycelium lets AIs discover, structure, update, and serve data in real time, with operators setting policy instead of hand-curating every field."

Future directions

  • More reference networks beyond CRM and baseball (e.g. fleet, agronomic via network create)
  • Schema evolution guided by operator ontology and validation rules
  • Inter-network handoff between scoped ecosystems

What we don't claim yet

  • · Turnkey networks for every domain without operator setup
  • · Public data-ingest API (returns as internal agent coordination later)
  • · Inter-network routing
  • · Fully autonomous schema evolution without operator ontology

Join

Open source under MIT. Clone, bootstrap a network, and query.

Star on GitHub →