LEANN Framework
The institutional memory of the DPO2U ecosystem.
To manage long-term contexts across 2,000+ knowledge documents, DPO2U relies on the LEANN Framework — a semantic search and vector indexing system. LEANN integrates deeply with the 0_Zettelkasten filesystem to enable natural language queries, autonomous map-of-content generation, and real-time context retrieval for all agents.
Index statistics
| Metric | Value |
|---|---|
| Indexed documents | 3,628 |
| Embedding model | all-MiniLM-L6-v2 (sentence-transformers) |
| Vector dimensions | 384 |
| Backend | HNSW (Hierarchical Navigable Small World) — compact + recompute |
| Chunk size | 200 tokens, 50 token overlap |
| Index name | dpo2u-knowledge |
| Scope | All .md files in /root/DPO2U/ (excluding obsidian-vault/, leann-env/) |
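The scope and chunking rows above can be illustrated with a minimal sketch. It assumes whitespace-separated words as a stand-in for tokens, since the tokenizer LEANN actually uses is not stated here. The helper names `iter_markdown_files` and `chunk_tokens` are hypothetical and not part of LEANN.

```python
from pathlib import Path
from typing import Iterator, List, Sequence

# Directory names excluded from the index scope, per the table above.
EXCLUDED_DIRS = {"obsidian-vault", "leann-env"}


def iter_markdown_files(root: Path) -> Iterator[Path]:
    """Yield .md files under root, skipping excluded directories (hypothetical helper)."""
    for path in sorted(root.rglob("*.md")):
        # Skip anything that lives below an excluded directory.
        if path.is_file() and EXCLUDED_DIRS.isdisjoint(path.relative_to(root).parts):
            yield path


def chunk_tokens(tokens: Sequence[str], size: int = 200, overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens with the previous window."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # 150 with the values from the table
    chunks: List[List[str]] = []
    start = 0
    while start < len(tokens):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the document
        start += step
    return chunks
```

With these defaults each window starts 150 tokens after the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.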
A cron job reindexes LEANN automatically every 15 minutes. When any agent edits MEMORY.md, a PostToolUse hook creates a flag file (06-Memory/.leann-reindex-needed) that signals the next scheduled run to reindex. To reindex manually, run python3 build_memory_index.py --force.
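One plausible reading of the flag mechanism is sketched below. The internals of build_memory_index.py are not shown in this document, so the function names `request_reindex` and `consume_reindex_flag` are hypothetical. The sketch also assumes that the scheduled run deletes the flag once it has seen it.

```python
from pathlib import Path

# Flag file name taken from the text above; it lives in the 06-Memory directory.
FLAG_NAME = ".leann-reindex-needed"


def request_reindex(memory_dir: Path) -> Path:
    """What a PostToolUse hook might do after MEMORY.md is edited: create the flag."""
    flag = memory_dir / FLAG_NAME
    flag.touch()
    return flag


def consume_reindex_flag(memory_dir: Path, force: bool = False) -> bool:
    """Return True if a reindex should run, clearing the flag if it was set.

    `force` mirrors the --force option of the manual command (assumed semantics).
    """
    flag = memory_dir / FLAG_NAME
    flagged = flag.exists()
    if flagged:
        flag.unlink()  # clear the flag so the next run does not repeat the work
    return flagged or force
```

Using a file as the signal keeps the hook and the scheduled job decoupled: the hook only needs write access to the directory and never has to wait for indexing to finish.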
What LEANN indexes
The dpo2u-knowledge index covers the full breadth of the DPO2U knowledge base:
| Content type | Location | Count | Purpose |
|---|---|---|---|
| Zettelkasten notes | 0_Zettelkasten/Permanent_Notes/ | 2,055+ | Atomic concepts, permanent notes, literature notes |
| Maps of Content | 0_Zettelkasten/Index/ | 136+ | Curated navigation hubs for topic clusters |
| Bridges | 0_Zettelkasten/Index/Bridges/ | 42 | Interdisciplinary connections (3+ domains) |
| Concept maps | 0_Zettelkasten/Concept_Maps/ | 50+ | Visual and textual concept relationships |
| Strategic docs | 00-META/ | 5+ | Whitepaper, PRDs, design docs |
| Milestones | 06-Memory/milestones/ | 30+ | Project progress markers |
| Agent/skill definitions | .claude/agents/, .claude/skills/ | 50+ | Agent configs and skill instructions |
Search capabilities
LEANN exposes search functionality through an MCP Server registered as leann-server in the Claude Code environment. Any agent can query it using natural language:
leann_search("blockchain governance compliance", index="dpo2u-knowledge", top_k=10)
MCP tools
| Tool | Purpose |
|---|---|
| leann_list | List all available indexes and their metadata |
| leann_search | Semantic search with configurable top_k and complexity |
Tuning search results
- top_k: number of results (default 5; use 10-20 for comprehensive exploration)
- complexity: search depth (16-32 for fast searches, 64+ for high precision)
- show_metadata: include file paths and metadata in results
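As an illustration of how these parameters combine, the sketch below builds the JSON-RPC payload for an MCP `tools/call` request. The envelope follows the Model Context Protocol. The argument keys `query` and `index`, the preset names, and the function `build_search_request` are assumptions chosen for this example, with preset values picked from the ranges listed above.

```python
import json
from typing import Any, Dict

# Hypothetical presets using values inside the documented ranges.
PRESETS: Dict[str, Dict[str, int]] = {
    "fast": {"top_k": 5, "complexity": 32},
    "thorough": {"top_k": 15, "complexity": 64},
}


def build_search_request(
    query: str,
    mode: str = "fast",
    index: str = "dpo2u-knowledge",
    show_metadata: bool = False,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build an MCP tools/call request for leann_search (argument names are assumed)."""
    if mode not in PRESETS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(PRESETS)}")
    arguments: Dict[str, Any] = {
        "query": query,  # assumed key name for the search text
        "index": index,
        "show_metadata": show_metadata,
        **PRESETS[mode],
    }
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "leann_search", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_search_request("blockchain governance compliance", mode="thorough")
    print(json.dumps(request, indent=2))
```

Keeping the presets in one place makes the speed and precision trade-off explicit, so agents do not need to pick raw complexity values on every call.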
The processing pipeline
When a new document enters the system (e.g., a philosophy book, a technical paper, a compliance regulation), LEANN orchestrates a processing pipeline.
Deployed skills
The pipeline repurposes core global agents — no custom agents needed:
- extract-philosophy: maps text into structured JSON schemas (arguments, quotes, connections), using LEANN for context-aware extraction
- content-creator: transforms the JSON into interconnected Markdown files based on Zettelkasten templates (atomic notes, literature notes, MOCs)
- file-organizer: maintains system hygiene, creates automated indexes, and updates MOC links
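The handoff between the first two skills can be sketched as a JSON record rendered into a Markdown note. The field names follow the schema elements named above (arguments, quotes, connections). The `title` field, the `Extraction` class, the note layout, and the use of `[[wikilinks]]` for connections are assumptions for illustration, not the actual Zettelkasten templates.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Extraction:
    """Hypothetical shape of one record produced by the extraction step."""
    title: str
    arguments: List[str] = field(default_factory=list)
    quotes: List[str] = field(default_factory=list)
    connections: List[str] = field(default_factory=list)

    @classmethod
    def from_json(cls, text: str) -> "Extraction":
        # Unknown keys raise TypeError, which surfaces schema drift early.
        return cls(**json.loads(text))


def render_note(extraction: Extraction) -> str:
    """Render the record as a Markdown note with one section per schema field."""
    lines = [f"# {extraction.title}", "", "## Arguments"]
    lines += [f"- {argument}" for argument in extraction.arguments]
    lines += ["", "## Quotes"]
    lines += [f"> {quote}" for quote in extraction.quotes]
    lines += ["", "## Connections"]
    # Connections become wikilinks so that related notes stay cross-referenced.
    lines += [f"- [[{target}]]" for target in extraction.connections]
    return "\n".join(lines) + "\n"
```

Separating the structured record from its rendering means the same extraction can feed atomic notes, literature notes, or MOCs by swapping only the rendering step.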
Configuration profile
The model configuration uses local JSON profiles:
{
  "profile_name": "philosophy_and_compliance",
  "llm_config": {
    "provider": "openai-compat",
    "model": "flash-model",
    "temperature": 0.3
  }
}
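A profile like the one above could be loaded and checked before use, as in the following sketch. The set of required keys is inferred from this single example, and the function name `load_profile` is hypothetical. The real loader may accept more fields or apply different rules.

```python
import json
from typing import Any, Dict

# Keys assumed to be required, inferred from the example profile above.
REQUIRED_LLM_KEYS = {"provider", "model", "temperature"}


def load_profile(text: str) -> Dict[str, Any]:
    """Parse a JSON profile and verify that the expected keys are present."""
    profile = json.loads(text)
    if "profile_name" not in profile:
        raise ValueError("profile is missing 'profile_name'")
    llm_config = profile.get("llm_config")
    if not isinstance(llm_config, dict):
        raise ValueError("profile is missing an 'llm_config' object")
    missing = REQUIRED_LLM_KEYS - set(llm_config)
    if missing:
        raise ValueError(f"llm_config is missing keys: {sorted(missing)}")
    if not isinstance(llm_config["temperature"], (int, float)):
        raise ValueError("temperature must be a number")
    return profile
```

Failing at load time with a specific message is cheaper than discovering a malformed profile halfway through a long extraction run.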
How agents use LEANN
Every agent follows a mandatory search-before-responding protocol: before answering any question about the project, infrastructure, past decisions, or vault knowledge, the agent must execute leann_search and use the results as context.
Examples:
- "How does the self-funding cycle work?" → search "self-funding cycle treasury swap" → return an answer with context from milestones and contract docs
- "What is the Teoria Protópica?" → search "teoria protopica verificabilidade" → return a synthesis from concept maps and permanent notes
- Creating a new note → search for similar concepts first to avoid duplicates and establish cross-references
This protocol ensures that agent responses are grounded in the actual knowledge base rather than hallucinated from training data.
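The search-before-responding protocol can be sketched as a small wrapper that always retrieves context before producing an answer. Because leann_search is an MCP tool rather than a Python function, the sketch takes the search and answer steps as injected callables. The names `answer_with_context`, `SearchFn`, and `AnswerFn` are hypothetical.

```python
from typing import Callable, List

# Injected stand-ins: `SearchFn` wraps the leann_search tool call,
# `AnswerFn` wraps whatever generates the final response.
SearchFn = Callable[[str], List[str]]
AnswerFn = Callable[[str, List[str]], str]


def answer_with_context(question: str, search: SearchFn, answer: AnswerFn) -> str:
    """Run the search first, then answer using only the retrieved passages as context."""
    passages = search(question)
    # The answer step never runs without a preceding search call,
    # which is the ordering the protocol requires.
    return answer(question, passages)


if __name__ == "__main__":
    def fake_search(query: str) -> List[str]:
        return [f"passage about {query}"]

    def fake_answer(question: str, passages: List[str]) -> str:
        return f"{question} (grounded in {len(passages)} passage(s))"

    print(answer_with_context("How does the self-funding cycle work?", fake_search, fake_answer))
```

Routing every answer through one function gives a single place to log queries or to handle an empty result set, a case this document does not specify.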
What's next
- The Brain — the 6 interconnected networks that LEANN powers
- Agents and contracts — how agents use LEANN for compliance verification
- About DPO2U — the protocol architecture where LEANN fits as the knowledge backbone