AGENTS/.sisyphus/notepads/opencode-memory/learnings.md

Opencode Memory Plugin — Learnings

Session: ses_3a5a47a05ffeoNYfz2RARYsHX9

Started: 2026-02-14

Architecture Decisions

  • SQLite + FTS5 + vec0 replaces mem0+qdrant entirely
  • Markdown at ~/CODEX/80-memory/ is source of truth
  • SQLite DB at ~/.local/share/opencode-memory/index.db is derived index
  • OpenAI text-embedding-3-small for embeddings (1536 dimensions)
  • Hybrid search: 0.7 vector weight + 0.3 BM25 weight
  • Chunking: 400 tokens, 80 overlap (tiktoken cl100k_base)
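
The chunking parameters above can be illustrated with a sliding window over token IDs. This is a minimal sketch, not the plugin's actual chunker: chunkTokens and TokenChunk are hypothetical names, and the input is assumed to be already tokenized (for example with cl100k_base), so no tokenizer is called here.

```typescript
// Hypothetical chunk shape: token offsets into the original token array.
interface TokenChunk {
  start: number; // index of the first token (inclusive)
  end: number; // index one past the last token (exclusive)
  tokens: number[];
}

// Sliding window: each chunk holds up to `size` tokens and shares
// `overlap` tokens with the previous chunk.
function chunkTokens(tokens: number[], size = 400, overlap = 80): TokenChunk[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new RangeError("require size > 0 and 0 <= overlap < size");
  }
  const step = size - overlap; // 320 with the defaults
  const chunks: TokenChunk[] = [];
  for (let start = 0; start < tokens.length; start += step) {
    const end = Math.min(start + size, tokens.length);
    chunks.push({ start, end, tokens: tokens.slice(start, end) });
    if (end === tokens.length) break; // the last window reached the end
  }
  return chunks;
}
```

With the defaults, consecutive chunks start 320 tokens apart, so a 1000-token file yields windows starting at tokens 0, 320, and 640.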

Key Patterns from Openclaw

  • MemoryIndexManager pattern (1590 lines) — file watching, chunking, indexing
  • Hybrid scoring with weighted combination
  • Embedding cache by content_hash + model
  • Two sources: "memory" (markdown files) + "sessions" (transcripts)
  • Two tools: memory_search (hybrid query) + memory_get (read lines)
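
The "embedding cache by content_hash + model" pattern can be sketched with an in-memory map. This is an illustration only: the notes do not say which hash function is used, so SHA-256 is an assumption, and EmbeddingCache and contentHash are hypothetical names rather than Openclaw's API.

```typescript
import { createHash } from "node:crypto";

type Embedding = number[];

// Assumption: SHA-256 of the UTF-8 text serves as the content hash.
function contentHash(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Hypothetical in-memory cache keyed by (model, content hash).
class EmbeddingCache {
  private readonly entries = new Map<string, Embedding>();

  private static key(hash: string, model: string): string {
    // A NUL separator keeps model and hash from running together.
    return `${model}\u0000${hash}`;
  }

  get(text: string, model: string): Embedding | undefined {
    return this.entries.get(EmbeddingCache.key(contentHash(text), model));
  }

  set(text: string, model: string, embedding: Embedding): void {
    this.entries.set(EmbeddingCache.key(contentHash(text), model), embedding);
  }
}
```

Including the model in the key means that switching embedding models produces cache misses instead of returning vectors from a different embedding space.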

Technical Stack

  • Runtime: bun
  • Test framework: bun test (TDD)
  • SQLite: better-sqlite3 (synchronous API; must run under Node.js rather than Bun, see Bun Runtime Limitations)
  • Embeddings: openai npm package
  • Chunking: tiktoken (cl100k_base encoding)
  • File watching: chokidar
  • Validation: zod (for tool schemas)

Vec0 Extension Findings (Task 1)

  • vec0 extension: NOT AVAILABLE as a standalone extension - it requires a vec0.so shared library that is not present
  • Alternative solution: sqlite-vec package (v0.1.7-alpha.2) successfully tested
  • Loading mechanism: sqliteVec.load(db) loads vector extension into database
  • Test result: Works with Node.js (better-sqlite3 native module compatible)
  • Note: better-sqlite3 does NOT work with Bun runtime (native module incompatibility)
  • Testing command: node -e "const Database = require('better-sqlite3'); const sqliteVec = require('sqlite-vec'); const db = new Database(':memory:'); sqliteVec.load(db); console.log('OK')"

Bun Runtime Limitations

  • better-sqlite3 native module NOT compatible with Bun (ERR_DLOPEN_FAILED)
  • Use Node.js for any code requiring better-sqlite3
  • Alternative: bun:sqlite (similar API, but a different library)

Wave Progress

  • Wave 1: IN PROGRESS (Task 1)
  • Wave 2-6: PENDING

Configuration Module Implementation (Task: Config Module)

  • TDD approach: RED-GREEN-REFACTOR cycle successfully applied
  • Pattern: Default config object + resolveConfig() function for merging
  • Path expansion: expandPath() helper function expands a leading ~ to $HOME
  • Test coverage: 10 tests covering defaults, overrides, path expansion, and config merging
  • TypeScript best practices: Proper type exports from types.ts, type imports in config.ts
  • Defaults match openclaw: chunking (400/80), search weights (0.7/0.3), minScore (0.35), maxResults (6)
  • Bun test framework: Fast execution (~20ms for 10 tests), clean output
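
The "default config object + resolveConfig()" pattern can be sketched as follows. The function names expandPath() and resolveConfig() and the default values come from these notes; the field names (memoryDir, dbPath, chunking, search) and the exact merge behaviour are assumptions made for illustration.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical config shape; only the default values come from the notes.
interface MemoryConfig {
  memoryDir: string;
  dbPath: string;
  chunking: { tokens: number; overlap: number };
  search: { vectorWeight: number; textWeight: number; minScore: number; maxResults: number };
}

interface ConfigOverrides {
  memoryDir?: string;
  dbPath?: string;
  chunking?: Partial<MemoryConfig["chunking"]>;
  search?: Partial<MemoryConfig["search"]>;
}

const DEFAULT_CONFIG: MemoryConfig = {
  memoryDir: "~/CODEX/80-memory",
  dbPath: "~/.local/share/opencode-memory/index.db",
  chunking: { tokens: 400, overlap: 80 },
  search: { vectorWeight: 0.7, textWeight: 0.3, minScore: 0.35, maxResults: 6 },
};

// Expand a leading "~" to the current user's home directory.
function expandPath(p: string): string {
  if (p === "~") return homedir();
  if (p.startsWith("~/")) return join(homedir(), p.slice(2));
  return p;
}

// Merge overrides over the defaults, one nested section at a time.
function resolveConfig(overrides: ConfigOverrides = {}): MemoryConfig {
  return {
    memoryDir: expandPath(overrides.memoryDir ?? DEFAULT_CONFIG.memoryDir),
    dbPath: expandPath(overrides.dbPath ?? DEFAULT_CONFIG.dbPath),
    chunking: { ...DEFAULT_CONFIG.chunking, ...overrides.chunking },
    search: { ...DEFAULT_CONFIG.search, ...overrides.search },
  };
}
```

For example, resolveConfig({ search: { maxResults: 3 } }) overrides only maxResults and keeps the 0.7/0.3 weights and the 0.35 minScore.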

Database Schema Implementation (Task 2)

  • TDD approach: RED-GREEN-REFACTOR cycle successfully applied for db module
  • Schema tables: meta, files, chunks, embedding_cache, chunks_fts (FTS5), chunks_vec (vec0)
  • WAL mode: Enabled via db.pragma('journal_mode = WAL') for better concurrency
  • Foreign keys: Enabled via db.pragma('foreign_keys = ON')
  • sqlite-vec integration: Loaded via sqliteVec.load(db) for vector search capabilities
  • FTS5 virtual table: External content table referencing chunks for full-text search
  • vec0 virtual table: 1536-dimension float array for OpenAI text-embedding-3-small embeddings
  • Test execution: Use Node.js with tsx for TypeScript execution (not Bun runtime)
  • Buffer handling: Float32Array must be converted to Buffer via Buffer.from(array.buffer) for SQLite binding
  • In-memory databases: the journal_mode pragma reports 'memory' for :memory: DBs (WAL does not apply there) and 'wal' for file-based DBs
  • Test coverage: 9 tests covering table creation, data insertion, FTS5, vec0, WAL mode, and clean closure
  • Error handling: better-sqlite3 throws "The database connection is not open" for operations on closed DBs
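
The Float32Array to Buffer conversion noted above can be sketched like this. The helper names match the ones mentioned in the embeddings notes, but the bodies are an assumption about how such helpers might look, not the project's code. One detail worth noting: Buffer.from(array.buffer) alone covers the whole underlying ArrayBuffer, so passing byteOffset and byteLength is safer when the Float32Array is a view into a larger buffer.

```typescript
import { Buffer } from "node:buffer";

// Wrap the embedding's bytes in a Buffer (4 bytes per float32).
// The Buffer shares memory with the Float32Array; nothing is copied.
function embeddingToBuffer(embedding: number[] | Float32Array): Buffer {
  const f32 = embedding instanceof Float32Array ? embedding : new Float32Array(embedding);
  return Buffer.from(f32.buffer, f32.byteOffset, f32.byteLength);
}

// Copy the bytes into a fresh, aligned ArrayBuffer before viewing them as
// float32, because a Buffer's byteOffset is not guaranteed to be 4-byte aligned.
function bufferToEmbedding(buf: Buffer): Float32Array {
  if (buf.byteLength % 4 !== 0) {
    throw new RangeError("buffer length must be a multiple of 4 bytes");
  }
  const copy = new Uint8Array(buf.byteLength);
  copy.set(buf);
  return new Float32Array(copy.buffer);
}
```

A 1536-dimension embedding therefore occupies 1536 * 4 = 6144 bytes, and values such as 0.1 come back only approximately equal after the float32 round trip.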

Node.js Test Execution

  • Issue: better-sqlite3 not compatible with Bun runtime (native module)
  • Solution: Use Node.js with tsx (TypeScript executor) for running tests
  • Command: npx tsx --test src/__tests__/db.test.ts
  • node:test API: Uses describe, it, before, after from the 'node:test' module
  • Assertions: Use assert from the 'node:assert' module
  • Cleanup: Use after() hooks for database cleanup rather than afterEach() (node:test also exports afterEach(), so this is a per-suite choice, not a limitation)

Embedding Provider Implementation (Task: Embeddings Module)

  • TDD approach: RED-GREEN-REFACTOR cycle successfully applied for embeddings module
  • Mock database: Created an in-memory mock for testing since better-sqlite3 is incompatible with Bun
  • Float32 precision: embeddings stored/retrieved via Float32Array have limited precision (use toBeCloseTo in tests)
  • Cache implementation: content_hash + model composite key in embedding_cache table
  • Retry logic: Exponential backoff (1s, 2s, 4s) for 429/500 errors, max 3 retries
  • Test coverage: 11 tests covering embed(), embedBatch(), cache hits/misses, API failures, retries, buffer conversion
  • Helper functions: embeddingToBuffer() and bufferToEmbedding() for Float32Array ↔ Buffer conversion
  • Bun spyOn: Use mockClear() to reset call count without replacing mock implementation
  • Buffer size: Float32 embedding stored as Buffer with size = dimensions * 4 bytes
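
The retry policy above (exponential backoff of 1s, 2s, 4s, at most 3 retries, on 429/500) can be sketched independently of any HTTP client. HttpError, backoffDelays, isRetryable, and withRetry are hypothetical names; how the real module detects status codes is not stated in the notes, and "429/500" is read literally here as exactly those two codes.

```typescript
// Hypothetical error type carrying an HTTP status code.
class HttpError extends Error {
  readonly status: number;
  constructor(status: number, message = `HTTP ${status}`) {
    super(message);
    this.status = status;
  }
}

interface RetryOptions {
  maxRetries?: number;
  baseDelayMs?: number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Delay before each retry: base, 2*base, 4*base, ...
function backoffDelays(maxRetries = 3, baseDelayMs = 1000): number[] {
  return Array.from({ length: maxRetries }, (_, i) => baseDelayMs * 2 ** i);
}

// Assumption: only 429 and 500 are treated as retryable.
function isRetryable(err: unknown): boolean {
  return err instanceof HttpError && (err.status === 429 || err.status === 500);
}

const realSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Run fn, retrying retryable failures until the delays are used up.
async function withRetry<T>(fn: () => Promise<T>, opts: RetryOptions = {}): Promise<T> {
  const { maxRetries = 3, baseDelayMs = 1000, sleep = realSleep } = opts;
  const delays = backoffDelays(maxRetries, baseDelayMs);
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isRetryable(err) || attempt >= delays.length) throw err;
      await sleep(delays[attempt]);
    }
  }
}
```

With the defaults, a call that keeps failing with 429 is attempted four times in total (the initial call plus three retries) before the error is rethrown.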

FTS5 BM25 Search Implementation (Task: FTS5 Search Module)

  • TDD approach: RED-GREEN-REFACTOR cycle successfully applied for search module
  • buildFtsQuery(): Extracts alphanumeric tokens via regex /[A-Za-z0-9_]+/g, quotes them, joins with AND
  • FTS5 escaping: Tokens are quoted to handle special characters (e.g., "term")
  • BM25 score normalization: bm25RankToScore(rank) converts BM25 rank to 0-1 score using 1 / (1 + normalized)
  • FTS5 external content tables: The schema uses content='chunks', content_rowid='rowid' but requires manual insertion into chunks_fts
  • Test data setup: Must manually insert into chunks_fts after inserting into chunks (external content doesn't auto-populate)
  • BM25 ranking: Results are ordered by rank column (lower rank = better match for FTS5)
  • Error handling: searchFTS catches SQL errors and returns empty array (graceful degradation)
  • MaxResults parameter: Respects LIMIT clause in SQL query
  • SearchResult interface: Includes id, filePath, startLine, endLine, text, contentHash, source, score (all required)
  • Prefix matching: FTS5 matches whole tokens by default; "test" matches "testing" only when the query uses a prefix token (e.g., "test"*) or the table uses a stemming tokenizer such as porter
  • No matches: Returns empty array when query has no valid tokens or no matches found
  • Test coverage: 7 tests covering basic search, exact keywords, partial words, no matches, ranking, maxResults, and metadata
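
The query-building step described above (extract tokens with /[A-Za-z0-9_]+/g, quote each one, join with AND) can be sketched as a pure function. Returning null when no tokens are found is an assumption made for this sketch; the notes only say that the search returns an empty array in that case.

```typescript
// Build an FTS5 MATCH expression from free-form user input.
// Returns null when the input contains no usable tokens (assumed behaviour).
function buildFtsQuery(raw: string): string | null {
  const tokens = raw.match(/[A-Za-z0-9_]+/g) ?? [];
  if (tokens.length === 0) return null;
  // Tokens contain only letters, digits, and underscores, so wrapping them
  // in double quotes needs no further escaping.
  return tokens.map((token) => `"${token}"`).join(" AND ");
}
```

For instance, the input `hello, world!` becomes `"hello" AND "world"`, and punctuation-only input yields null so the caller can skip the SQL query entirely.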

Hybrid Search Implementation (Task: Hybrid Search Combiner)

  • TDD approach: RED-GREEN-REFACTOR cycle successfully applied for hybrid search
  • Weighted scoring: Combined score = vectorWeight * vectorScore + textWeight * textScore (default: 0.7/0.3)
  • Result merging: Uses Map<string, HybridSearchResult> to merge results by chunk ID, preventing duplicates
  • Dual-score tracking: Each result tracks both vectorScore and textScore separately, allowing for degraded modes
  • Graceful degradation: Works with FTS5-only (vector search fails) or vector-only (FTS5 fails)
  • minScore filtering: Results below minScore threshold are filtered out after score calculation
  • Score sorting: Results sorted by combined score in descending order before applying maxResults limit
  • Vector search fallback: searchVector catches errors and returns empty array, allowing FTS5-only operation
  • FTS5 query fallback: searchFTS catches SQL errors and returns empty array, allowing vector-only operation
  • Database cleanup: beforeEach must delete from chunks_fts, chunks_vec, chunks, and files to avoid state bleed
  • Virtual table corruption: Deleting rows from the FTS5/vec0 virtual tables can leave them corrupted - wrap the cleanup in try/catch and recreate the tables on failure
  • SearchResult type conflict: SearchResult is imported from types.ts, don't re-export in search.ts
  • Test isolation: Virtual tables (chunks_fts, chunks_vec) must be cleared and potentially recreated between tests
  • Buffer conversion: queryEmbedding converted to Buffer via Buffer.from(new Float32Array(array).buffer)
  • Debug logging: process.env.DEBUG_SEARCH flag enables detailed logging of FTS5 and vector search results
  • Test coverage: 9 tests covering combination, weighting, minScore filtering, deduplication, sorting, maxResults, degraded modes (FTS5-only, vector-only), and custom weights
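
The merge step described above can be sketched as a pure function over two already-scored result lists. This is a simplified illustration: ScoredHit and mergeHybridResults are hypothetical names, the result type carries only the fields needed for scoring, and both input scores are assumed to be normalized to the 0-1 range.

```typescript
interface ScoredHit {
  id: string; // chunk ID
  score: number; // assumed to be normalized to 0-1
}

interface HybridSearchResult {
  id: string;
  vectorScore: number;
  textScore: number;
  score: number; // weighted combination
}

interface HybridOptions {
  vectorWeight?: number;
  textWeight?: number;
  minScore?: number;
  maxResults?: number;
}

function mergeHybridResults(
  vectorHits: ScoredHit[],
  textHits: ScoredHit[],
  opts: HybridOptions = {},
): HybridSearchResult[] {
  const { vectorWeight = 0.7, textWeight = 0.3, minScore = 0.35, maxResults = 6 } = opts;
  const byId = new Map<string, HybridSearchResult>();

  // Get or create the merged entry for a chunk ID; a missing side scores 0.
  const entry = (id: string): HybridSearchResult => {
    let result = byId.get(id);
    if (!result) {
      result = { id, vectorScore: 0, textScore: 0, score: 0 };
      byId.set(id, result);
    }
    return result;
  };

  for (const hit of vectorHits) entry(hit.id).vectorScore = hit.score;
  for (const hit of textHits) entry(hit.id).textScore = hit.score;

  return Array.from(byId.values())
    .map((r) => ({ ...r, score: vectorWeight * r.vectorScore + textWeight * r.textScore }))
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}
```

Because a missing side contributes a score of 0, passing an empty vectorHits array gives the FTS5-only degraded mode, although with the default weights a text-only hit can reach at most 0.3 and so falls below the default minScore of 0.35 unless that threshold is lowered.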