LEANN Framework

The institutional memory of the DPO2U ecosystem.

To manage long-term contexts across 2,000+ knowledge documents, DPO2U relies on the LEANN Framework — a semantic search and vector indexing system. LEANN integrates deeply with the 0_Zettelkasten filesystem to enable natural language queries, autonomous map-of-content generation, and real-time context retrieval for all agents.

Index statistics

| Metric | Value |
|---|---|
| Indexed documents | 3,628 |
| Embedding model | all-MiniLM-L6-v2 (sentence-transformers) |
| Vector dimensions | 384 |
| Backend | HNSW (Hierarchical Navigable Small World), compact + recompute |
| Chunk size | 200 tokens, 50 token overlap |
| Index name | dpo2u-knowledge |
| Scope | All .md files in /root/DPO2U/ (excluding obsidian-vault/, leann-env/) |

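The chunk settings above (200-token chunks with a 50-token overlap) amount to a sliding window that advances 150 tokens at a time. The following is a minimal sketch of that idea, not LEANN's actual splitter: the function name `chunk_tokens` is hypothetical, and plain string tokens stand in for whatever tokenizer the index really uses.

```python
from typing import List


def chunk_tokens(tokens: List[str], size: int = 200, overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    stride = size - overlap  # 150 with the documented settings
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# Illustration only: synthetic tokens stand in for real tokenizer output.
tokens = [f"t{i}" for i in range(500)]
chunks = chunk_tokens(tokens)
```

With these settings, the last 50 tokens of each chunk reappear as the first 50 tokens of the next one, so a sentence that straddles a chunk boundary is still retrievable as a whole from at least one chunk.
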
Automatic reindexing

LEANN reindexes automatically via a cron job that runs every 15 minutes. When any agent edits MEMORY.md, a PostToolUse hook creates a flag file (06-Memory/.leann-reindex-needed) that triggers the next scheduled reindex. To reindex manually, run python3 build_memory_index.py --force.
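The flag-file handshake between the hook and the scheduled job can be sketched as follows. This is an illustration under stated assumptions, not the project's actual hook or cron script: the function names `request_reindex` and `run_if_flagged` are hypothetical, the flag path is resolved relative to the working directory, and the sketch assumes the scheduled job does nothing when no flag is present.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

# Flag path taken from the text; resolved relative to the working directory here.
FLAG = Path("06-Memory/.leann-reindex-needed")


def request_reindex(flag: Path = FLAG) -> None:
    """What a PostToolUse hook could do after MEMORY.md is edited: create the flag."""
    flag.parent.mkdir(parents=True, exist_ok=True)
    flag.touch()


def run_if_flagged(command: List[str], flag: Path = FLAG) -> bool:
    """What the scheduled job could do: run the reindex command only when flagged."""
    if not flag.exists():
        return False
    subprocess.run(command, check=True)  # raises if the reindex command fails
    flag.unlink()  # clear the flag only after a successful run
    return True
```

A scheduled job could then call something like `run_if_flagged([sys.executable, "build_memory_index.py"])`; whether the real job passes extra arguments is not stated in this page. Clearing the flag only after the command succeeds means a failed run is retried on the next schedule.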

What LEANN indexes

The dpo2u-knowledge index covers the full breadth of the DPO2U knowledge base:

| Content type | Location | Count | Purpose |
|---|---|---|---|
| Zettelkasten notes | 0_Zettelkasten/Permanent_Notes/ | 2,055+ | Atomic concepts, permanent notes, literature notes |
| Maps of Content | 0_Zettelkasten/Index/ | 136+ | Curated navigation hubs for topic clusters |
| Bridges | 0_Zettelkasten/Index/Bridges/ | 42 | Interdisciplinary connections (3+ domains) |
| Concept maps | 0_Zettelkasten/Concept_Maps/ | 50+ | Visual and textual concept relationships |
| Strategic docs | 00-META/ | 5+ | Whitepaper, PRDs, design docs |
| Milestones | 06-Memory/milestones/ | 30+ | Project progress markers |
| Agent/skill definitions | .claude/agents/, .claude/skills/ | 50+ | Agent configs and skill instructions |

Search capabilities

LEANN exposes search functionality through an MCP Server registered as leann-server in the Claude Code environment. Any agent can query it using natural language:

leann_search("blockchain governance compliance", index="dpo2u-knowledge", top_k=10)

MCP tools

| Tool | Purpose |
|---|---|
| leann_list | List all available indexes and their metadata |
| leann_search | Semantic search with configurable top_k and complexity |

Tuning search results

  • top_k — number of results (default 5, use 10-20 for comprehensive exploration)
  • complexity — search depth (16-32 for fast searches, 64+ for high precision)
  • show_metadata — include file paths and metadata in results
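The tuning guidance above can be captured in a small helper that assembles arguments for a leann_search call. This is only a sketch: `build_search_args`, the preset names, and the `query` key are hypothetical, while `index`, `top_k`, `complexity`, and `show_metadata` are the parameter names used on this page. It builds a plain dictionary and does not call the MCP server.

```python
from typing import Any, Dict

# Presets derived from the ranges in the text; the preset names are invented here.
PRESETS: Dict[str, Dict[str, int]] = {
    "fast": {"top_k": 5, "complexity": 32},       # default top_k, fast search depth
    "thorough": {"top_k": 20, "complexity": 64},  # broad exploration, high precision
}


def build_search_args(query: str, mode: str = "fast",
                      index: str = "dpo2u-knowledge",
                      show_metadata: bool = False) -> Dict[str, Any]:
    """Assemble keyword arguments for a search call from a named preset."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if mode not in PRESETS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(PRESETS)}")
    args: Dict[str, Any] = {"query": query, "index": index, "show_metadata": show_metadata}
    args.update(PRESETS[mode])
    return args


args = build_search_args("blockchain governance compliance", mode="thorough", show_metadata=True)
```

Keeping the presets in one place makes the speed versus precision trade-off explicit instead of scattering numeric values across agent prompts.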

The processing pipeline

When a new document enters the system (e.g., a philosophy book, a technical paper, a compliance regulation), LEANN orchestrates a processing pipeline that extracts structured data, generates interlinked notes, and updates the vault's indexes.

Deployed skills

The pipeline repurposes core global agents — no custom agents needed:

  • extract-philosophy — maps text into structured JSON schemas (arguments, quotes, connections) using LEANN for context-aware extraction
  • content-creator — transforms JSON into interconnected Markdown files based on Zettelkasten templates (atomic notes, literature notes, MOCs)
  • file-organizer — maintains system hygiene, creates automated indexes, and updates MOC links
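The data flow between the three skills can be sketched with plain functions. Everything here is a stand-in: the real skills are agent-driven, the function names are hypothetical, the sentence splitting is a toy substitute for extraction, and the `[[wikilink]]` syntax is an assumption about how notes link to each other. Only the schema keys (arguments, quotes, connections) come from the list above.

```python
from typing import Dict, List


def extract(text: str) -> Dict[str, List[str]]:
    """Stand-in for extract-philosophy: return the structured schema named in the text."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return {"arguments": sentences, "quotes": [], "connections": []}


def create_notes(extracted: Dict[str, List[str]], source_title: str) -> Dict[str, str]:
    """Stand-in for content-creator: one atomic Markdown note per argument."""
    notes: Dict[str, str] = {}
    for i, argument in enumerate(extracted["arguments"], start=1):
        filename = f"{source_title}-{i:02d}.md"
        notes[filename] = f"# {argument}\n\nSource: [[{source_title}]]\n"
    return notes


def update_moc(notes: Dict[str, str], moc_title: str) -> str:
    """Stand-in for file-organizer: build a map of content linking every note."""
    links = "\n".join(f"- [[{name[:-3]}]]" for name in sorted(notes))
    return f"# {moc_title}\n\n{links}\n"


extracted = extract("Truth is verifiable. Claims need evidence.")
notes = create_notes(extracted, "example-book")
moc = update_moc(notes, "Example Book MOC")
```

Each stage consumes only the previous stage's output, which is what allows general-purpose skills to be chained without a custom agent.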

Configuration profile

The model configuration uses local JSON profiles:

{
  "profile_name": "philosophy_and_compliance",
  "llm_config": {
    "provider": "openai-compat",
    "model": "flash-model",
    "temperature": 0.3
  }
}
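A profile like the one above could be parsed and checked before use. This is a minimal sketch, assuming the profile contains only the fields shown; the `LLMConfig` class, the `parse_profile` function, and the rule that temperature must be non-negative are illustrative choices, not part of LEANN.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LLMConfig:
    provider: str
    model: str
    temperature: float


def parse_profile(raw: str) -> Tuple[str, LLMConfig]:
    """Parse a JSON profile string and return (profile_name, llm_config)."""
    data = json.loads(raw)
    llm = data["llm_config"]  # KeyError here means the profile is malformed
    config = LLMConfig(
        provider=str(llm["provider"]),
        model=str(llm["model"]),
        temperature=float(llm["temperature"]),
    )
    if config.temperature < 0:
        raise ValueError("temperature must be non-negative")
    return str(data["profile_name"]), config


RAW_PROFILE = """
{
  "profile_name": "philosophy_and_compliance",
  "llm_config": {"provider": "openai-compat", "model": "flash-model", "temperature": 0.3}
}
"""
name, config = parse_profile(RAW_PROFILE)
```

Failing early on a missing key or an invalid temperature keeps a broken profile from reaching the pipeline.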

How agents use LEANN

Every agent follows a mandatory search-before-responding protocol: before answering any question about the project, infrastructure, past decisions, or vault knowledge, the agent must execute leann_search and use the results as context.

Examples:

  • "How does the self-funding cycle work?" → search "self-funding cycle treasury swap" → return answer with context from milestones and contract docs
  • "What is the Teoria Protópica?" → search "teoria protopica verificabilidade" → return synthesis from concept maps and permanent notes
  • Creating a new note → search for similar concepts first to avoid duplicates and establish cross-references

This protocol ensures that agent responses are grounded in the actual knowledge base rather than hallucinated from training data.
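The search-before-responding protocol can be expressed as a wrapper that refuses to answer without retrieved context. This is a sketch of the control flow only: `answer_with_context` and both stand-in functions are hypothetical, the search function would in practice wrap a leann_search call, and the fallback message for an empty result is an assumption rather than documented behavior.

```python
from typing import Callable, List

SearchFn = Callable[[str], List[str]]
AnswerFn = Callable[[str, List[str]], str]


def answer_with_context(question: str, search: SearchFn, answer: AnswerFn) -> str:
    """Retrieve context first, then pass both question and context to the responder."""
    context = search(question)
    if not context:
        # Assumed behavior: report the gap instead of answering from memory.
        return "No matching notes found in the knowledge base."
    return answer(question, context)


def fake_search(query: str) -> List[str]:
    # Placeholder for a real leann_search call; returns canned text.
    return ["milestone: treasury swap funds operations"] if "self-funding" in query else []


def fake_answer(question: str, context: List[str]) -> str:
    return f"{question} -> grounded in {len(context)} result(s)"


reply = answer_with_context("How does the self-funding cycle work?", fake_search, fake_answer)
```

Because the responder is only reachable through the wrapper, every answer is produced with search results in hand.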

What's next

  • The Brain — the 6 interconnected networks that LEANN powers
  • Agents and contracts — how agents use LEANN for compliance verification
  • About DPO2U — the protocol architecture where LEANN fits as the knowledge backbone