# Memory System
## Memory

OpenClaw memory is plain Markdown in the agent workspace. The files are the source of truth; the model only “remembers” what gets written to disk.
Memory search tools are provided by the active memory plugin (default: memory-core). Disable memory plugins with `plugins.slots.memory = "none"`.
## Memory files (Markdown)

The default workspace layout uses two memory layers:
- `memory/YYYY-MM-DD.md`: Daily log (append-only).
  - Read today + yesterday at session start.
- `MEMORY.md` (optional): Curated long-term memory.
  - Only load in the main, private session (never in group contexts).
These files live under the workspace (`agents.defaults.workspace`, default `~/.openclaw/workspace`). See Agent workspace for the full layout.
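A minimal sketch of that layout, assuming the default workspace path (the dated filenames are illustrative):

```
~/.openclaw/workspace/
├── MEMORY.md            # optional curated long-term memory
└── memory/
    ├── 2026-02-13.md    # yesterday's daily log
    └── 2026-02-14.md    # today's daily log
```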
## When to write memory

- Decisions, preferences, and durable facts go to `MEMORY.md` (example below).
- Day-to-day notes and running context go to `memory/YYYY-MM-DD.md`.
- If someone says “remember this,” write it down (do not keep it in RAM).
- This area is still evolving. It helps to remind the model to store memories; it will know what to do.
- If you want something to stick, ask the bot to write it into memory.
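For instance, a durable preference recorded in `MEMORY.md` might look like this (illustrative content, not a required format):

```markdown
## Preferences
- Release announcements go to the #releases channel, not #general (decided 2026-02-14).
- Alex prefers summaries of five bullet points or fewer.
```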
## Automatic memory flush (pre-compaction ping)

When a session is close to auto-compaction, OpenClaw triggers a silent, agentic turn that reminds the model to write durable memory before the context is compacted. The default prompts explicitly say the model may reply, but usually `NO_REPLY` is the correct response, so the user never sees this turn.
This is controlled by `agents.defaults.compaction.memoryFlush`:
```json5
{
  agents: {
    defaults: {
      compaction: {
        reserveTokensFloor: 20000,
        memoryFlush: {
          enabled: true,
          softThresholdTokens: 4000,
          systemPrompt: "Session nearing compaction. Store durable memories now.",
          prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store."
        }
      }
    }
  }
}
```

Details:
- Soft threshold: flush triggers when the session token estimate crosses `contextWindow - reserveTokensFloor - softThresholdTokens` (see the worked example below).
- Silent by default: prompts include `NO_REPLY` so nothing is delivered.
- Two prompts: a user prompt plus a system prompt append the reminder.
- One flush per compaction cycle (tracked in `sessions.json`).
- Workspace must be writable: if the session runs sandboxed with `workspaceAccess: "ro"` or `"none"`, the flush is skipped.
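As a worked example (the 200k-token context window here is an assumption for illustration): with `reserveTokensFloor: 20000` and `softThresholdTokens: 4000`, the flush fires once the session estimate crosses `200000 - 20000 - 4000 = 176000` tokens, shortly before the compaction reserve is reached.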
For the full compaction lifecycle, see Session management + compaction.
## Vector memory search

OpenClaw can build a small vector index over `MEMORY.md` and `memory/*.md` (plus any extra directories or files you opt in) so semantic queries can find related notes even when wording differs.
Defaults:
- Enabled by default.
- Watches memory files for changes (debounced).
- Uses remote embeddings by default. If `memorySearch.provider` is not set, OpenClaw auto-selects:
  - `local` if a `memorySearch.local.modelPath` is configured and the file exists.
  - `openai` if an OpenAI key can be resolved.
  - `gemini` if a Gemini key can be resolved.
  - Otherwise memory search stays disabled until configured.
- Local mode uses node-llama-cpp and may require `pnpm approve-builds`.
- Uses sqlite-vec (when available) to accelerate vector search inside SQLite.
Remote embeddings require an API key for the embedding provider. OpenClaw resolves keys from auth profiles, `models.providers.*.apiKey`, or environment variables. Codex OAuth only covers chat/completions and does not satisfy embeddings for memory search. For Gemini, use `GEMINI_API_KEY` or `models.providers.google.apiKey`. When using a custom OpenAI-compatible endpoint, set `memorySearch.remote.apiKey` (and optional `memorySearch.remote.headers`).
## Additional memory paths

If you want to index Markdown files outside the default workspace layout, add explicit paths:
```json5
agents: {
  defaults: {
    memorySearch: {
      extraPaths: ["../team-docs", "/srv/shared-notes/overview.md"]
    }
  }
}
```

Notes:
- Paths can be absolute or workspace-relative.
- Directories are scanned recursively for `.md` files.
- Only Markdown files are indexed.
- Symlinks are ignored (files or directories).
## Gemini embeddings (native)

Set the provider to `gemini` to use the Gemini embeddings API directly:
```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "gemini",
      model: "gemini-embedding-001",
      remote: { apiKey: "YOUR_GEMINI_API_KEY" }
    }
  }
}
```

Notes:
- `remote.baseUrl` is optional (defaults to the Gemini API base URL).
- `remote.headers` lets you add extra headers if needed.
- Default model: `gemini-embedding-001`.
If you want to use a custom OpenAI-compatible endpoint (OpenRouter, vLLM, or a proxy), you can use the remote configuration with the OpenAI provider:
```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://api.example.com/v1/",
        apiKey: "YOUR_OPENAI_COMPAT_API_KEY",
        headers: { "X-Custom-Header": "value" }
      }
    }
  }
}
```

If you don’t want to set an API key, use `memorySearch.provider = "local"` or set `memorySearch.fallback = "none"`.
Fallbacks:
- `memorySearch.fallback` can be `openai`, `gemini`, `local`, or `none`.
- The fallback provider is only used when the primary embedding provider fails.
Batch indexing (OpenAI + Gemini):
- Enabled by default for OpenAI and Gemini embeddings. Set `agents.defaults.memorySearch.remote.batch.enabled = false` to disable.
- Default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed.
- Set `remote.batch.concurrency` to control how many batch jobs we submit in parallel (default: 2).
- Batch mode applies when `memorySearch.provider = "openai"` or `"gemini"` and uses the corresponding API key.
- Gemini batch jobs use the async embeddings batch endpoint and require Gemini Batch API availability.
Why OpenAI batch is fast + cheap:
- For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously.
- OpenAI offers discounted pricing for Batch API workloads, so large indexing runs are usually cheaper than sending the same requests synchronously.
- See the OpenAI Batch API docs and pricing for details.
Config example:
```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      fallback: "openai",
      remote: { batch: { enabled: true, concurrency: 2 } },
      sync: { watch: true }
    }
  }
}
```

Tools:
- `memory_search` — returns snippets with file + line ranges.
- `memory_get` — read memory file content by path.
Local mode:
- Set `agents.defaults.memorySearch.provider = "local"`.
- Provide `agents.defaults.memorySearch.local.modelPath` (GGUF or `hf:` URI).
- Optional: set `agents.defaults.memorySearch.fallback = "none"` to avoid remote fallback.
## How the memory tools work

- `memory_search` semantically searches Markdown chunks (~400-token target, 80-token overlap) from `MEMORY.md` + `memory/**/*.md`. It returns snippet text (capped at ~700 chars), file path, line range, score, provider/model, and whether we fell back from local → remote embeddings. No full file payload is returned (see the illustrative result shape after this list).
- `memory_get` reads a specific memory Markdown file (workspace-relative), optionally from a starting line and for N lines. Paths outside `MEMORY.md`/`memory/` are allowed only when explicitly listed in `memorySearch.extraPaths`.
- Both tools are enabled only when `memorySearch.enabled` resolves true for the agent.
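To make that shape concrete, a single `memory_search` result might look roughly like this (field names and values are illustrative, not the exact wire format):

```json5
{
  snippet: "Decided to ship announcements to #releases...",
  path: "memory/2026-02-14.md",
  lines: "12-24",
  score: 0.83,
  provider: "openai",
  model: "text-embedding-3-small",
  fallback: false
}
```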
## What gets indexed (and when)

- File type: Markdown only (`MEMORY.md`, `memory/**/*.md`, plus any `.md` files under `memorySearch.extraPaths`).
- Index storage: per-agent SQLite at `~/.openclaw/memory/<agentId>.sqlite` (configurable via `agents.defaults.memorySearch.store.path`, supports the `{agentId}` token, as in the example below).
- Freshness: a watcher on `MEMORY.md`, `memory/`, and `memorySearch.extraPaths` marks the index dirty (debounce 1.5 s). Sync is scheduled on session start, on search, or on an interval, and runs asynchronously. Session transcripts use delta thresholds to trigger background sync.
- Reindex triggers: the index stores the embedding provider/model + endpoint fingerprint + chunking params. If any of those change, OpenClaw automatically resets and reindexes the entire store.
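A minimal sketch of overriding the store location; the path is an example, and `{agentId}` expands to the agent's ID:

```json5
agents: {
  defaults: {
    memorySearch: {
      store: { path: "~/.openclaw/memory/{agentId}.sqlite" }
    }
  }
}
```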
## Hybrid search (BM25 + vector)

When enabled, OpenClaw combines:
- Vector similarity (semantic match, wording can differ)
- BM25 keyword relevance (exact tokens like IDs, env vars, code symbols)
If full-text search is unavailable on your platform, OpenClaw falls back to vector-only search.
### Why hybrid?

Vector search is great at “this means the same thing”:
- “Mac Studio gateway host” vs “the machine running the gateway”
- “debounce file updates” vs “avoid indexing on every write”
But it can be weak at exact, high-signal tokens:
- IDs (`a828e60`, `b3b9895a`…)
- code symbols (`memorySearch.query.hybrid`)
- error strings (“sqlite-vec unavailable”)
BM25 (full-text) is the opposite: strong at exact tokens, weaker at paraphrases. Hybrid search is the pragmatic middle ground: use both retrieval signals so you get good results for both “natural language” queries and “needle in a haystack” queries.
### How we merge results (the current design)

Implementation sketch:
- Retrieve a candidate pool from both sides:
  - Vector: top `maxResults * candidateMultiplier` by cosine similarity.
  - BM25: top `maxResults * candidateMultiplier` by FTS5 BM25 rank (lower is better).
- Convert BM25 rank into a 0..1-ish score:
  `textScore = 1 / (1 + max(0, bm25Rank))`
- Union candidates by chunk id and compute a weighted score:
  `finalScore = vectorWeight * vectorScore + textWeight * textScore`
Notes:
- `vectorWeight` + `textWeight` is normalized to 1.0 in config resolution, so weights behave as percentages.
- If embeddings are unavailable (or the provider returns a zero-vector), we still run BM25 and return keyword matches.
- If FTS5 can’t be created, we keep vector-only search (no hard failure).
This isn’t “IR-theory perfect”, but it’s simple, fast, and tends to improve recall/precision on real notes. If we want to get fancier later, common next steps are Reciprocal Rank Fusion (RRF) or score normalization (min/max or z-score) before mixing.
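A minimal TypeScript sketch of that merge step, assuming the two candidate pools have already been retrieved (names like `Candidate` and `mergeCandidates` are illustrative, not the actual implementation):

```ts
interface Candidate {
  chunkId: string;
  vectorScore?: number; // cosine similarity from the vector pool
  bm25Rank?: number;    // FTS5 BM25 rank from the keyword pool (lower is better)
}

// Union vector and BM25 candidates by chunk id, then mix the two scores.
function mergeCandidates(
  vectorHits: Candidate[],
  bm25Hits: Candidate[],
  vectorWeight = 0.7,
  textWeight = 0.3,
): Array<{ chunkId: string; finalScore: number }> {
  // Normalize weights so they behave as percentages.
  const total = vectorWeight + textWeight;
  const vw = vectorWeight / total;
  const tw = textWeight / total;

  const byId = new Map<string, Candidate>();
  for (const hit of [...vectorHits, ...bm25Hits]) {
    const existing = byId.get(hit.chunkId) ?? { chunkId: hit.chunkId };
    byId.set(hit.chunkId, { ...existing, ...hit });
  }

  return Array.from(byId.values())
    .map((c) => {
      // BM25 rank -> 0..1-ish score; a chunk missing from a pool scores 0 there.
      const textScore =
        c.bm25Rank === undefined ? 0 : 1 / (1 + Math.max(0, c.bm25Rank));
      const finalScore = vw * (c.vectorScore ?? 0) + tw * textScore;
      return { chunkId: c.chunkId, finalScore };
    })
    .sort((a, b) => b.finalScore - a.finalScore);
}
```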
Config:
```json5
agents: {
  defaults: {
    memorySearch: {
      query: {
        hybrid: {
          enabled: true,
          vectorWeight: 0.7,
          textWeight: 0.3,
          candidateMultiplier: 4
        }
      }
    }
  }
}
```

## Embedding cache
OpenClaw can cache chunk embeddings in SQLite so reindexing and frequent updates (especially session transcripts) don’t re-embed unchanged text.
Config:
```json5
agents: {
  defaults: {
    memorySearch: {
      cache: { enabled: true, maxEntries: 50000 }
    }
  }
}
```

## Session memory search (experimental)
You can optionally index session transcripts and surface them via `memory_search`. This is gated behind an experimental flag.
```json5
agents: {
  defaults: {
    memorySearch: {
      experimental: { sessionMemory: true },
      sources: ["memory", "sessions"]
    }
  }
}
```

Notes:
- Session indexing is opt-in (off by default).
- Session updates are debounced and indexed asynchronously once they cross delta thresholds (best-effort).
- `memory_search` never blocks on indexing; results can be slightly stale until background sync finishes.
- Results still include snippets only; `memory_get` remains limited to memory files.
- Session indexing is isolated per agent (only that agent’s session logs are indexed).
- Session logs live on disk (`~/.openclaw/agents/<agentId>/sessions/*.jsonl`). Any process/user with filesystem access can read them, so treat disk access as the trust boundary. For stricter isolation, run agents under separate OS users or hosts.
Delta thresholds (defaults shown):
```json5
agents: {
  defaults: {
    memorySearch: {
      sync: {
        sessions: {
          deltaBytes: 100000,   // ~100 KB
          deltaMessages: 50     // JSONL lines
        }
      }
    }
  }
}
```

## SQLite vector acceleration (sqlite-vec)
When the sqlite-vec extension is available, OpenClaw stores embeddings in a SQLite virtual table (`vec0`) and performs vector distance queries in the database. This keeps search fast without loading every embedding into JS.
Configuration (optional):
```json5
agents: {
  defaults: {
    memorySearch: {
      store: {
        vector: { enabled: true, extensionPath: "/path/to/sqlite-vec" }
      }
    }
  }
}
```

Notes:
- `enabled` defaults to true; when disabled, search falls back to in-process cosine similarity over stored embeddings.
- If the sqlite-vec extension is missing or fails to load, OpenClaw logs the error and continues with the JS fallback (no vector table).
- `extensionPath` overrides the bundled sqlite-vec path (useful for custom builds or non-standard install locations).
## Local embedding auto-download

- Default local embedding model: `hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf` (~0.6 GB).
- When `memorySearch.provider = "local"`, node-llama-cpp resolves `modelPath`; if the GGUF is missing it auto-downloads to the cache (or `local.modelCacheDir` if set), then loads it. Downloads resume on retry (see the config sketch after this list).
- Native build requirement: run `pnpm approve-builds`, pick `node-llama-cpp`, then `pnpm rebuild node-llama-cpp`.
- Fallback: if local setup fails and `memorySearch.fallback = "openai"`, we automatically switch to remote embeddings (`openai/text-embedding-3-small` unless overridden) and record the reason.
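Putting those pieces together, a local-mode config might look like this (a sketch; the cache directory path is illustrative):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "local",
      local: {
        modelPath: "hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf",
        modelCacheDir: "~/.openclaw/models"  // optional; illustrative path
      },
      fallback: "none"  // avoid silently switching to remote embeddings
    }
  }
}
```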
## Custom OpenAI-compatible endpoint example

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://api.example.com/v1/",
        apiKey: "YOUR_REMOTE_API_KEY",
        headers: { "X-Organization": "org-id", "X-Project": "project-id" }
      }
    }
  }
}
```

Notes:

- `remote.*` takes precedence over `models.providers.openai.*`.
- `remote.headers` merge with OpenAI headers; remote wins on key conflicts. Omit `remote.headers` to use the OpenAI defaults.