# LLMemory Implementation Plan

## Current Status: Phase 0 - Planning Complete

This document tracks implementation progress and provides step-by-step guidance for building LLMemory.

## Phase 1: MVP (Simple LIKE Search)

**Goal:** Working CLI tool with basic search in 2-3 days
**Status:** Not Started
**Trigger to Complete:** All checkpoints passed, can store/search memories

### Step 1.1: Project Setup

**Effort:** 30 minutes
**Status:** Not Started

```bash
cd llmemory
npm init -y
npm install better-sqlite3 commander chalk date-fns
npm install -D vitest typescript @types/node @types/better-sqlite3
```

**Deliverables:**

- [ ] package.json configured with dependencies
- [ ] TypeScript configured (optional but recommended)
- [ ] Git initialized with .gitignore
- [ ] bin/memory executable created

**Checkpoint:** Run `npm list` - all dependencies installed

---

### Step 1.2: Database Layer - Schema & Connection

**Effort:** 2 hours
**Status:** Not Started

**Files to create:**

- `src/db/connection.js` - Database connection and initialization
- `src/db/schema.js` - Phase 1 schema (memories, tags, memory_tags)
- `src/db/queries.js` - Prepared statements

**Schema (Phase 1):**

```sql
CREATE TABLE memories (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  content TEXT NOT NULL CHECK(length(content) <= 10000),
  created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now')),
  entered_by TEXT,
  expires_at INTEGER,
  CHECK(expires_at IS NULL OR expires_at > created_at)
);

CREATE TABLE tags (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT NOT NULL UNIQUE COLLATE NOCASE,
  created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now'))
);

CREATE TABLE memory_tags (
  memory_id INTEGER NOT NULL,
  tag_id INTEGER NOT NULL,
  PRIMARY KEY (memory_id, tag_id),
  FOREIGN KEY (memory_id) REFERENCES memories(id) ON DELETE CASCADE,
  FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);

CREATE TABLE metadata (
  key TEXT PRIMARY KEY,
  value TEXT NOT NULL
);

CREATE INDEX idx_memories_created ON memories(created_at DESC);
CREATE INDEX idx_memories_expires ON memories(expires_at) WHERE expires_at IS NOT NULL;
CREATE INDEX idx_tags_name ON tags(name);
CREATE INDEX idx_memory_tags_tag ON memory_tags(tag_id);
```

**Implementation checklist:**

- [ ] Database connection with WAL mode enabled
- [ ] Schema creation on first run
- [ ] Metadata table initialized (schema_version: 1)
- [ ] Prepared statements for common operations
- [ ] Transaction helpers

**Checkpoint:** Run test insertion and query - works without errors

---

### Step 1.3: Core Command - Store

**Effort:** 2 hours
**Status:** Not Started

**TDD Workflow:**

1. **Write test first** (see test structure below)
2. **Run test** - watch it fail
3. **Implement feature** - make test pass
4. **Refine** - improve based on test output

**Files to create:**

- `test/integration.test.js` (TEST FIRST)
- `src/commands/store.js`
- `src/utils/validation.js`
- `src/utils/tags.js`

**Test First (write this before implementation):**

```javascript
// test/integration.test.js
import { describe, test, expect, beforeEach } from 'vitest';
import Database from 'better-sqlite3';
import { storeMemory } from '../src/commands/store.js';

describe('Store Command', () => {
  let db;

  beforeEach(() => {
    db = new Database(':memory:');
    initSchema(db);
  });

  test('stores memory with tags', () => {
    const result = storeMemory(db, {
      content: 'Docker uses bridge networks by default',
      tags: 'docker,networking',
      entered_by: 'test'
    });

    expect(result.id).toBeDefined();

    // Verify in database
    const memory = db.prepare('SELECT * FROM memories WHERE id = ?').get(result.id);
    expect(memory.content).toBe('Docker uses bridge networks by default');

    // Verify tags
    const tags = db.prepare(`
      SELECT t.name FROM tags t
      JOIN memory_tags mt ON t.id = mt.tag_id
      WHERE mt.memory_id = ?
    `).all(result.id);
    expect(tags.map(t => t.name)).toEqual(['docker', 'networking']);
  });

  test('rejects content over 10KB', () => {
    const longContent = 'x'.repeat(10001);
    expect(() => {
      storeMemory(db, { content: longContent });
    }).toThrow('Content exceeds 10KB limit');
  });

  test('normalizes tags to lowercase', () => {
    storeMemory(db, { content: 'test', tags: 'Docker,NETWORKING' });

    const tags = db.prepare('SELECT name FROM tags').all();
    expect(tags).toEqual([
      { name: 'docker' },
      { name: 'networking' }
    ]);
  });
});
```

**Then implement** (after the test fails):

```javascript
// src/commands/store.js
export function storeMemory(db, { content, tags, expires, entered_by }) {
  // Implementation goes here
  // Make the test pass!
}
```

**Features checklist:**

- [ ] Test written first and failing
- [ ] Content validation (length, non-empty)
- [ ] Tag parsing and normalization (lowercase)
- [ ] Expiration date parsing (ISO 8601)
- [ ] Atomic transaction (memory + tags)
- [ ] Test passes

**Checkpoint:** `npm test` passes for store command

---

### Step 1.4: Core Command - Search (LIKE)

**Effort:** 3 hours
**Status:** Not Started

**TDD Workflow:**

1. **Write integration test first** with realistic data
2. **Run and watch it fail**
3. **Implement search** - make test pass
4. **Verify manually** with CLI

**Files to create:**

- Add tests to `test/integration.test.js` (TEST FIRST)
- `src/commands/search.js`
- `src/search/like.js`
- `src/utils/formatting.js`

**Test First:**

```javascript
// test/integration.test.js (add to existing file)
describe('Search Command', () => {
  let db;

  beforeEach(() => {
    db = new Database(':memory:');
    initSchema(db);

    // Seed with realistic data
    storeMemory(db, { content: 'Docker uses bridge networks by default', tags: 'docker,networking' });
    storeMemory(db, { content: 'Kubernetes pods share network namespace', tags: 'kubernetes,networking' });
    storeMemory(db, { content: 'PostgreSQL requires explicit vacuum', tags: 'postgresql,database' });
  });

  test('finds memories by content', () => {
    const results = searchMemories(db, 'docker');
    expect(results).toHaveLength(1);
    expect(results[0].content).toContain('Docker');
  });

  test('filters by tags (AND logic)', () => {
    const results = searchMemories(db, 'network', { tags: ['networking'] });
    expect(results).toHaveLength(2);
    expect(results.map(r => r.content)).toContain('Docker uses bridge networks by default');
    expect(results.map(r => r.content)).toContain('Kubernetes pods share network namespace');
  });

  test('excludes expired memories automatically', () => {
    // The schema CHECK requires expires_at > created_at, so an
    // already-expired memory can't be inserted directly; backdate it
    // (timestamps are epoch seconds, per the schema's strftime('%s')).
    const { id } = storeMemory(db, { content: 'Expired memory', tags: 'test' });
    db.prepare(
      'UPDATE memories SET created_at = created_at - 172800, expires_at = created_at - 86400 WHERE id = ?'
    ).run(id);

    const results = searchMemories(db, 'expired');
    expect(results).toHaveLength(0);
  });

  test('respects limit option', () => {
    // Add 20 memories
    for (let i = 0; i < 20; i++) {
      storeMemory(db, { content: `Memory ${i}`, tags: 'test' });
    }

    const results = searchMemories(db, 'Memory', { limit: 5 });
    expect(results).toHaveLength(5);
  });
});
```

**Then implement** to make tests pass.
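One way to make these tests pass is to build the SQL and bound parameters in a pure function, which keeps the query logic testable without a database. A minimal sketch, with the caveat that `buildLikeSearch` and its module layout are assumptions, not the final API:

```javascript
// Hypothetical query builder (sketch of src/search/like.js). Returns SQL
// plus bound parameters; the caller executes them with better-sqlite3.
// LIKE is already case-insensitive for ASCII in SQLite, so no COLLATE needed.
function buildLikeSearch(query, { tags = [], limit = 10 } = {}) {
  const params = [`%${query}%`];
  let sql =
    'SELECT m.* FROM memories m ' +
    'WHERE m.content LIKE ? ' +
    "AND (m.expires_at IS NULL OR m.expires_at > strftime('%s', 'now'))";
  if (tags.length > 0) {
    // AND logic: the memory must carry every requested tag.
    const placeholders = tags.map(() => '?').join(', ');
    sql +=
      ' AND m.id IN (' +
      'SELECT mt.memory_id FROM memory_tags mt ' +
      'JOIN tags t ON t.id = mt.tag_id ' +
      `WHERE t.name IN (${placeholders}) ` +
      'GROUP BY mt.memory_id HAVING COUNT(DISTINCT t.id) = ?)';
    params.push(...tags.map(t => t.trim().toLowerCase()), tags.length);
  }
  sql += ' ORDER BY m.created_at DESC LIMIT ?';
  params.push(limit);
  return { sql, params };
}
```

The `GROUP BY ... HAVING COUNT(DISTINCT t.id) = ?` subquery is one common way to express tag AND-filtering over a many-to-many join; date and agent filters would extend the same pattern.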
**Features checklist:**

- [ ] Tests written and failing
- [ ] Case-insensitive LIKE search
- [ ] Tag filtering (AND logic)
- [ ] Date filtering (after/before)
- [ ] Agent filtering (entered_by)
- [ ] Automatic expiration filtering
- [ ] Result limit
- [ ] Tests pass

**Checkpoint:** `npm test` passes for search, manual CLI test works

---

### Step 1.5: Core Command - List

**Effort:** 1 hour
**Status:** Not Started

**Files to create:**

- `src/commands/list.js`

**Implementation:**

```javascript
// Pseudo-code
export async function listCommand(options) {
  // 1. Query memories with filters
  // 2. Order by created_at DESC (or custom sort)
  // 3. Apply limit/offset
  // 4. Format and display
}
```

**Features:**

- [ ] Sort options (created, expires, content)
- [ ] Order direction (asc/desc)
- [ ] Tag filtering
- [ ] Pagination (limit/offset)
- [ ] Display with tags

**Checkpoint:**

```bash
memory list --limit 5
# Should show 5 most recent memories
```

---

### Step 1.6: Core Command - Prune

**Effort:** 1.5 hours
**Status:** Not Started

**Files to create:**

- `src/commands/prune.js`

**Implementation:**

```javascript
// Pseudo-code
export async function pruneCommand(options) {
  // 1. Find expired memories
  // 2. If --dry-run, show what would be deleted
  // 3. Else, prompt for confirmation (unless --force)
  // 4. Delete expired memories
  // 5. Show count of deleted memories
}
```

**Features:**

- [ ] Find expired memories (expires_at <= now)
- [ ] --dry-run flag (show without deleting)
- [ ] --force flag (skip confirmation)
- [ ] Confirmation prompt
- [ ] Report deleted count

**Checkpoint:**

```bash
memory store "Temp" --expires "2020-01-01"
memory prune --dry-run   # Should show the expired memory
memory prune --force     # Should delete it
```

---

### Step 1.7: CLI Integration

**Effort:** 2 hours
**Status:** Not Started

**Files to create:**

- `src/cli.js`
- `bin/memory`

**Implementation:**

```javascript
// src/cli.js
import { Command } from 'commander';

const program = new Command();

program
  .name('memory')
  .description('AI Agent Memory System')
  .version('1.0.0');

program
  .command('store <content>')
  .description('Store a new memory')
  .option('-t, --tags <tags>', 'Comma-separated tags')
  .option('-e, --expires <date>', 'Expiration date')
  .option('--by <agent>', 'Agent name')
  .action(storeCommand);

program
  .command('search <query>')
  .description('Search memories')
  .option('-t, --tags <tags>', 'Filter by tags')
  .option('--after <date>', 'Created after')
  .option('--before <date>', 'Created before')
  .option('--entered-by <agent>', 'Filter by agent')
  .option('-l, --limit <n>', 'Max results', '10')
  .action(searchCommand);

// ... other commands

program.parse();
```

**Features:**

- [ ] All commands registered
- [ ] Global options (--db, --verbose, --json)
- [ ] Help text for all commands
- [ ] Error handling
- [ ] Exit codes (0=success, 1=error)

**Checkpoint:**

```bash
memory --help         # Should show all commands
memory store --help   # Should show store options
```

---

### Step 1.8: Testing & Polish

**Effort:** 2 hours
**Status:** Not Started

**Note:** Integration tests are written first for each feature (TDD approach). This step is for final polish and comprehensive scenarios.
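The tag-normalization behavior exercised in Step 1.3's tests (lowercase, trimmed, deduplicated) comes down to one small pure helper. A minimal sketch, assuming it lives in `src/utils/tags.js` under the hypothetical name `parseTags`:

```javascript
// Hypothetical tag parser matching the normalization rules in the tests:
// split on commas, lowercase, trim, drop empties, deduplicate in order.
function parseTags(raw) {
  if (!raw) return [];
  const seen = new Set();
  for (const part of raw.split(',')) {
    const tag = part.trim().toLowerCase();
    if (tag.length > 0) seen.add(tag);
  }
  return [...seen];
}
```

Keeping this a pure function means the "malformed tags" and "duplicates" edge cases from the polish checklist can be covered without touching the database.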
**Files to enhance:**

- `test/integration.test.js` (should already have tests from Steps 1.3-1.6)
- `test/helpers/seed.js` - Realistic data generation
- `test/fixtures/realistic-memories.js` - Memory templates

**Comprehensive test scenarios:**

- [ ] Full workflow: store → search → list → prune
- [ ] Performance: 100 memories, search <50ms
- [ ] Edge cases: empty query, no results, expired memories
- [ ] Data validation: content length, invalid dates, malformed tags
- [ ] Tag normalization: uppercase → lowercase, duplicates
- [ ] Expiration: auto-filter in search, prune removes correctly

**Checkpoint:** All tests pass with `npm test`, >80% coverage (mostly integration)

---

## Phase 1 Completion Criteria

- [ ] All checkpoints passed
- [ ] Can store memories with tags and expiration
- [ ] Can search with basic LIKE matching
- [ ] Can list recent memories
- [ ] Can prune expired memories
- [ ] Help text comprehensive
- [ ] Tests passing (>80% coverage)
- [ ] Database file created at ~/.config/opencode/memories.db

**Validation test:**

```bash
# Full workflow test
memory store "Docker Compose uses bridge networks by default" --tags docker,networking
memory store "Kubernetes pods share network namespace" --tags kubernetes,networking
memory search "networking" --tags docker   # Should return only the Docker memory
memory list --limit 10                     # Should show both memories
memory stats                               # Should show 2 memories, 3 unique tags
```

---

## Phase 2: FTS5 Migration

**Goal:** Production-grade search with FTS5
**Status:** Not Started
**Trigger to Start:** Dataset > 500 memories OR query latency > 500ms OR manual request

### Step 2.1: Migration Script

**Effort:** 2 hours
**Status:** Not Started

**Files to create:**

- `src/db/migrations.js`
- `src/db/migrations/002_fts5.js`

**Implementation:**

```javascript
export async function migrateToFTS5(db) {
  console.log('Migrating to FTS5...');

  // 1. Check if already migrated (metadata values are stored as TEXT)
  const version = db.prepare('SELECT value FROM metadata WHERE key = ?').get('schema_version');
  if (Number(version.value) >= 2) {
    console.log('Already on FTS5');
    return;
  }

  // 2. Create FTS5 table
  db.exec(`CREATE VIRTUAL TABLE memories_fts USING fts5(...)`);

  // 3. Populate from existing memories
  db.exec(`INSERT INTO memories_fts(rowid, content) SELECT id, content FROM memories`);

  // 4. Create triggers to keep the index in sync
  db.exec(`CREATE TRIGGER memories_ai AFTER INSERT...`);
  db.exec(`CREATE TRIGGER memories_ad AFTER DELETE...`);
  db.exec(`CREATE TRIGGER memories_au AFTER UPDATE...`);

  // 5. Update schema version
  db.prepare('UPDATE metadata SET value = ? WHERE key = ?').run('2', 'schema_version');

  console.log('Migration complete!');
}
```

**Checkpoint:** Run migration on test DB, verify FTS5 table exists and is populated

---

### Step 2.2: FTS5 Search Implementation

**Effort:** 3 hours
**Status:** Not Started

**Files to create:**

- `src/search/fts.js`

**Features:**

- [ ] FTS5 MATCH query builder
- [ ] Support boolean operators (AND/OR/NOT)
- [ ] Phrase queries ("exact phrase")
- [ ] Prefix matching (docker*)
- [ ] BM25 relevance ranking
- [ ] Combined with metadata filters

**Checkpoint:** FTS5 search returns results ranked by relevance

---

### Step 2.3: CLI Command - Migrate

**Effort:** 1 hour
**Status:** Not Started

**Files to create:**

- `src/commands/migrate.js`

**Implementation:**

```bash
memory migrate fts5
# Prompts for confirmation, runs migration
```

**Checkpoint:** Command successfully migrates Phase 1 DB to Phase 2

---

## Phase 3: Fuzzy Layer

**Goal:** Handle typos and inexact matches
**Status:** Not Started
**Trigger to Start:** Manual request or need for fuzzy matching

### Step 3.1: Trigram Infrastructure

**Effort:** 3 hours
**Status:** Not Started

**Files to create:**

- `src/db/migrations/003_trigrams.js`
- `src/search/fuzzy.js`

**Features:**

- [ ] Trigram table creation
- [ ] Trigram extraction function
- [ ] Populate trigrams from existing memories
- [ ] Trigger to maintain trigrams on insert/update

---

### Step 3.2: Fuzzy Search Implementation

**Effort:** 4 hours
**Status:** Not Started

**Features:**

- [ ] Trigram similarity calculation
- [ ] Levenshtein distance implementation
- [ ] Combined relevance scoring
- [ ] Cascade logic (exact → fuzzy)
- [ ] Configurable threshold

---

### Step 3.3: CLI Integration

**Effort:** 2 hours
**Status:** Not Started

**Features:**

- [ ] --fuzzy flag for search command
- [ ] --threshold option
- [ ] Auto-fuzzy when <5 results

---

## Additional Features (Post-MVP)

### Stats Command

**Effort:** 2 hours
**Status:** Not Started

```bash
memory stats
# Total memories: 1,234
# Total tags: 56
# Database size: 2.3 MB
# Most used tags: docker (123), kubernetes (89), nodejs (67)

memory stats --tags
# docker: 123
# kubernetes: 89
# nodejs: 67
# ...

memory stats --agents
# investigate-agent: 456
# optimize-agent: 234
# manual: 544
```

---

### Export/Import Commands

**Effort:** 3 hours
**Status:** Not Started

```bash
memory export memories.json
# Exported 1,234 memories to memories.json

memory import memories.json
# Imported 1,234 memories
```

---

### Agent Context Documentation

**Effort:** 3 hours
**Status:** Not Started

**Files to create:**

- `docs/AGENT_GUIDE.md`
- `src/commands/agent-context.js`

```bash
memory --agent-context
# Displays comprehensive guide for AI agents
```

---

### Auto-Extraction (*Remember* Pattern)

**Effort:** 4 hours
**Status:** Not Started

**Files to create:**

- `src/extractors/remember.js`

**Features:**

- [ ] Regex pattern to detect `*Remember*: [fact]`
- [ ] Auto-extract tags from content
- [ ] Auto-detect expiration dates
- [ ] Store extracted memories
- [ ] Report extraction results

---

### OpenCode Plugin Integration

**Effort:** 3 hours
**Status:** Not Started

**Files to create:**

- `plugin.js` (root level for OpenCode)

**Features:**

- [ ] Plugin registration
- [ ] API exposure (store, search, extractRemember)
- [ ] Lifecycle hooks (onInstall, onUninstall)
- [ ] Command registration

---

## Testing Strategy

### TDD Philosophy: Integration-First Approach

**Core Principles:**

1. **Integration tests are primary** - Test real workflows end-to-end
2. **Unit tests are rare** - Only for complex algorithms (fuzzy matching, trigrams, Levenshtein)
3. **Test with real data** - Use SQLite :memory: or temp files with realistic scenarios
4. **Watch-driven development** - Run tests in watch mode, see failures, implement, see success

**Testing Workflow:**

```bash
# 1. Write integration test first (it will fail)
npm run test:watch

# 2. Run program manually to see behavior
node src/cli.js store "test"

# 3. Implement feature
# 4. Watch tests pass
# 5. Refine based on output
```

---

### Integration Tests (Primary)

**Coverage target:** All major workflows

**Test approach:**

- Use real SQLite database (`:memory:` for speed, temp file for persistence tests)
- Simulate realistic data (10-100 memories per test)
- Test actual CLI commands via Node API
- Verify end-to-end behavior, not internal implementation

**Test scenarios:**

```javascript
// test/integration.test.js
describe('Memory System Integration', () => {
  test('store and retrieve workflow', async () => {
    // Store memory
    await cli(['store', 'Docker uses bridge networks', '--tags', 'docker,networking']);

    // Search for it
    const results = await cli(['search', 'docker']);

    // Verify output
    expect(results).toContain('Docker uses bridge networks');
    expect(results).toContain('docker');
    expect(results).toContain('networking');
  });

  test('realistic dataset search performance', async () => {
    // Insert 100 realistic memories
    for (let i = 0; i < 100; i++) {
      await storeMemory(generateRealisticMemory());
    }

    // Search should be fast
    const start = Date.now();
    await cli(['search', 'docker']);
    const duration = Date.now() - start;
    expect(duration).toBeLessThan(50); // Phase 1 target
  });
});
```

**Test data generation:**

```javascript
// test/fixtures/realistic-memories.js
export function generateRealisticMemory() {
  const templates = [
    { content: 'Docker Compose requires explicit subnet config when using multiple networks', tags: ['docker', 'networking'] },
    { content: 'PostgreSQL VACUUM FULL locks tables, use ANALYZE instead', tags: ['postgresql', 'performance'] },
    { content: 'Git worktree allows parallel branches without stashing', tags: ['git', 'workflow'] },
    // ... 50+ realistic templates
  ];
  return randomChoice(templates);
}
```

---

### Unit Tests (Rare - Only When Necessary)

**When to write unit tests:**

- Complex algorithms with edge cases (Levenshtein distance, trigram extraction)
- Pure functions with clear inputs/outputs
- Critical validation logic

**When NOT to write unit tests:**

- Database queries (covered by integration tests)
- CLI parsing (covered by integration tests)
- Simple utilities (tag parsing, date formatting)

**Example unit test (justified):**

```javascript
// test/unit/fuzzy.test.js - Complex algorithm worth unit testing
describe('Levenshtein distance', () => {
  test('calculates edit distance correctly', () => {
    expect(levenshtein('docker', 'dcoker')).toBe(2);   // transposition = 2 substitutions
    expect(levenshtein('kubernetes', 'kuberntes')).toBe(1); // one deleted 'e'
    expect(levenshtein('same', 'same')).toBe(0);
  });

  test('handles edge cases', () => {
    expect(levenshtein('', 'hello')).toBe(5);
    expect(levenshtein('a', '')).toBe(1);
  });
});
```

---

### Test Data Management

**For integration tests:**

```javascript
// Use :memory: database for fast, isolated tests
beforeEach(() => {
  db = new Database(':memory:');
  initSchema(db);
});

// Or use a temp file for persistence testing
import { mkdtempSync } from 'fs';
import { join } from 'path';
import { tmpdir } from 'os';

beforeEach(() => {
  const tempDir = mkdtempSync(join(tmpdir(), 'llmemory-test-'));
  dbPath = join(tempDir, 'test.db');
  db = new Database(dbPath);
  initSchema(db);
});

afterEach(() => {
  db.close();
  // Cleanup temp files
});
```

**Realistic data seeding:**

```javascript
// test/helpers/seed.js
export async function seedDatabase(db, count = 50) {
  const memories = [];
  for (let i = 0; i < count; i++) {
    memories.push({
      ...generateRealisticMemory(), // provides content + tags
      entered_by: randomChoice(['investigate-agent', 'optimize-agent', 'manual']),
      created_at: Math.floor(Date.now() / 1000) - randomInt(0, 90 * 86400) // epoch seconds, within last 90 days
    });
  }

  // Bulk insert
  const insert = db.transaction((memories) => {
    for (const memory of memories) {
      storeMemory(db, memory);
    }
  });
  insert(memories);

  return memories;
}
```

---

### Performance Tests

**Run after each phase:**

```javascript
// Benchmark search latency
test('Phase 1 search <50ms for 500 memories', async () => {
  // Insert 500 test memories
  const start = Date.now();
  const results = await search('test query');
  const duration = Date.now() - start;
  expect(duration).toBeLessThan(50);
});

test('Phase 2 search <100ms for 10K memories', async () => {
  // Insert 10K test memories
  const start = Date.now();
  const results = await search('test query');
  const duration = Date.now() - start;
  expect(duration).toBeLessThan(100);
});
```

---

## Documentation Roadmap

### Phase 1 Docs

- [ ] README.md - Quick start, installation, basic usage
- [ ] CLI_REFERENCE.md - All commands and options
- [ ] ARCHITECTURE.md - System design, schema, algorithms

### Phase 2 Docs

- [ ] AGENT_GUIDE.md - Comprehensive guide for AI agents
- [ ] MIGRATION_GUIDE.md - Phase 1 → 2 → 3 instructions
- [ ] QUERY_SYNTAX.md - FTS5 query patterns

### Phase 3 Docs

- [ ] API.md - Programmatic API for plugins
- [ ] CONTRIBUTING.md - Development setup, testing
- [ ] TROUBLESHOOTING.md - Common issues and solutions

---

## Success Metrics

### Phase 1 (MVP)

- [ ] Can store/retrieve memories
- [ ] Search works for exact matches
- [ ] Performance: <50ms for 500 memories
- [ ] Test coverage: >80%
- [ ] No critical bugs

### Phase 2 (FTS5)

- [ ] Migration completes without data loss
- [ ] Search quality improved (relevance ranking)
- [ ] Performance: <100ms for 10K memories
- [ ] Boolean operators work correctly

### Phase 3 (Fuzzy)

- [ ] Typos correctly matched (edit distance ≤2)
- [ ] Fuzzy cascade improves result count
- [ ] Performance: <200ms for 10K memories
- [ ] No false positives (threshold tuned)

### Overall

- [ ] Agents use system regularly in workflows
- [ ] Search results are high-quality (relevant)
- [ ] Token-efficient (limited, ranked results)
- [ ] No performance complaints
- [ ] Documentation comprehensive

---

## Development Workflow

### Daily Checklist

1. Pull latest changes
2. Run tests: `npm test`
3. Work on current step
4. Write/update tests
5. Update this document (mark checkboxes)
6. Commit with clear message
7. Update CHANGELOG.md

### Before Phase Completion

1. All checkpoints passed
2. Tests passing (>80% coverage)
3. Documentation updated
4. Performance benchmarks run
5. Manual testing completed
6. Changelog updated

### Commit Message Format

```
<type>(<scope>): <subject>

Examples:
feat(search): implement FTS5 search with BM25 ranking
fix(store): validate content length before insertion
docs(readme): add installation instructions
test(search): add integration tests for filters
refactor(db): extract connection logic to separate file
```

---

## Troubleshooting

### Common Issues

**Issue:** SQLite FTS5 not available
**Solution:** `better-sqlite3` bundles a modern SQLite with FTS5 compiled in; if linking against a system SQLite instead, ensure it was built with FTS5 support (available since SQLite 3.9)

**Issue:** Database locked errors
**Solution:** Enable WAL mode: `PRAGMA journal_mode = WAL`

**Issue:** Slow searches with large dataset
**Solution:** Check indexes exist, run `ANALYZE`, consider migration to next phase

**Issue:** Tag filtering not working
**Solution:** Verify tag normalization (lowercase), check many-to-many joins

---

## Next Session Continuation

**For the next developer/AI agent:**

1. **Check Current Phase:** Review checkboxes in this file to see progress
2. **Run Tests:** `npm test` to verify current state
3. **Check Database:** `sqlite3 ~/.config/opencode/memories.db .schema` to see current schema version
4. **Review SPECIFICATION.md:** Understand overall architecture
5. **Pick Next Step:** Find first unchecked item in current phase
6. **Update This File:** Mark completed checkboxes as you go

**Quick Start Commands:**

```bash
cd llmemory
npm install              # Install dependencies
npm test                 # Run test suite
npm run start -- --help  # Test CLI
```

**Current Status:** Phase 0 complete (planning/documentation), ready to begin Phase 1 implementation.

**Estimated Time to MVP:** 12-15 hours of focused development.

---

## Resources

- **SQLite FTS5:** https://www.sqlite.org/fts5.html
- **better-sqlite3:** https://github.com/WiseLibs/better-sqlite3
- **Commander.js:** https://github.com/tj/commander.js
- **Vitest:** https://vitest.dev/

---

## Changelog

### 2025-10-29 - Phase 0 Complete

- Project structure defined
- Comprehensive specification written
- Implementation plan created
- Agent investigation reports integrated
- Ready for Phase 1 development
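---

## Appendix: Trigram Sketch (Phase 3)

Step 3.1's trigram infrastructure needs an extraction function and a similarity score. A minimal sketch, assuming padded 3-character windows and Jaccard similarity — the function names and padding scheme here are illustrative assumptions, not a committed design:

```javascript
// Hypothetical trigram extraction for Step 3.1. Padding the normalized
// string means prefixes and suffixes also produce distinctive trigrams.
function extractTrigrams(text) {
  const normalized = `  ${text.toLowerCase().trim()} `; // two leading spaces, one trailing
  const trigrams = new Set();
  for (let i = 0; i <= normalized.length - 3; i++) {
    trigrams.add(normalized.slice(i, i + 3));
  }
  return [...trigrams];
}

// Jaccard similarity over trigram sets: |A ∩ B| / |A ∪ B|.
function trigramSimilarity(a, b) {
  const ta = new Set(extractTrigrams(a));
  const tb = new Set(extractTrigrams(b));
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  return shared / (ta.size + tb.size - shared);
}
```

A typo like `dcoker` still shares several trigrams with `docker`, so it scores well above zero while unrelated words score near zero — which is the property the Phase 3 exact-then-fuzzy cascade and its configurable threshold rely on.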