	LLMemory Implementation Plan
Current Status: Phase 0 - Planning Complete
This document tracks implementation progress and provides step-by-step guidance for building LLMemory.
Phase 1: MVP (Simple LIKE Search)
Goal: Working CLI tool with basic search in 2-3 days
Status: Not Started
Trigger to Complete: All checkpoints passed, can store/search memories
Step 1.1: Project Setup
Effort: 30 minutes
Status: Not Started
cd llmemory
npm init -y
npm install better-sqlite3 commander chalk date-fns
npm install -D vitest typescript @types/node @types/better-sqlite3
Deliverables:
- [ ] package.json configured with dependencies
- [ ] TypeScript configured (optional but recommended)
- [ ] Git initialized with .gitignore
- [ ] bin/memory executable created (see sketch below)
Checkpoint: Run npm list - all dependencies installed
Step 1.2: Database Layer - Schema & Connection
Effort: 2 hours
Status: Not Started
Files to create:
- src/db/connection.js - Database connection and initialization
- src/db/schema.js - Phase 1 schema (memories, tags, memory_tags)
- src/db/queries.js - Prepared statements
Schema (Phase 1):
CREATE TABLE memories (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  content TEXT NOT NULL CHECK(length(content) <= 10000),
  created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now')),
  entered_by TEXT,
  expires_at INTEGER,
  CHECK(expires_at IS NULL OR expires_at > created_at)
);
CREATE TABLE tags (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT NOT NULL UNIQUE COLLATE NOCASE,
  created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now'))
);
CREATE TABLE memory_tags (
  memory_id INTEGER NOT NULL,
  tag_id INTEGER NOT NULL,
  PRIMARY KEY (memory_id, tag_id),
  FOREIGN KEY (memory_id) REFERENCES memories(id) ON DELETE CASCADE,
  FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);
CREATE TABLE metadata (
  key TEXT PRIMARY KEY,
  value TEXT NOT NULL
);
CREATE INDEX idx_memories_created ON memories(created_at DESC);
CREATE INDEX idx_memories_expires ON memories(expires_at) WHERE expires_at IS NOT NULL;
CREATE INDEX idx_tags_name ON tags(name);
CREATE INDEX idx_memory_tags_tag ON memory_tags(tag_id);
Implementation checklist:
- [ ] Database connection with WAL mode enabled (see sketch below)
- [ ] Schema creation on first run
- [ ] Metadata table initialized (schema_version: 1)
- [ ] Prepared statements for common operations
- [ ] Transaction helpers
Checkpoint: Run test insertion and query - works without errors
Step 1.3: Core Command - Store
Effort: 2 hours
Status: Not Started
TDD Workflow:
1. Write test first (see test structure below)
2. Run test - watch it fail
3. Implement feature - make test pass
4. Refine - improve based on test output
Files to create:
- test/integration.test.js (TEST FIRST)
- src/commands/store.js
- src/utils/validation.js
- src/utils/tags.js
Test First (write this before implementation):
// test/integration.test.js
import { describe, test, expect, beforeEach } from 'vitest';
import Database from 'better-sqlite3';
import { storeMemory } from '../src/commands/store.js';
describe('Store Command', () => {
  let db;
  
  beforeEach(() => {
    db = new Database(':memory:');
    // Init schema
    initSchema(db);
  });
  
  test('stores memory with tags', () => {
    const result = storeMemory(db, {
      content: 'Docker uses bridge networks by default',
      tags: 'docker,networking',
      entered_by: 'test'
    });
    
    expect(result.id).toBeDefined();
    
    // Verify in database
    const memory = db.prepare('SELECT * FROM memories WHERE id = ?').get(result.id);
    expect(memory.content).toBe('Docker uses bridge networks by default');
    
    // Verify tags
    const tags = db.prepare(`
      SELECT t.name FROM tags t
      JOIN memory_tags mt ON t.id = mt.tag_id
      WHERE mt.memory_id = ?
    `).all(result.id);
    
    expect(tags.map(t => t.name)).toEqual(['docker', 'networking']);
  });
  
  test('rejects content over 10KB', () => {
    const longContent = 'x'.repeat(10001);
    
    expect(() => {
      storeMemory(db, { content: longContent });
    }).toThrow('Content exceeds 10KB limit');
  });
  
  test('normalizes tags to lowercase', () => {
    storeMemory(db, { content: 'test', tags: 'Docker,NETWORKING' });
    
    const tags = db.prepare('SELECT name FROM tags').all();
    expect(tags).toEqual([
      { name: 'docker' },
      { name: 'networking' }
    ]);
  });
});
Then implement (after test fails):
// src/commands/store.js
export function storeMemory(db, { content, tags, expires, entered_by }) {
  // Implementation goes here
  // Make the test pass!
}
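One possible implementation, sketched rather than prescribed; the empty-content error message and the inline tag handling (instead of the src/utils/tags.js helper) are assumptions:
// src/commands/store.js
export function storeMemory(db, { content, tags, expires, entered_by }) {
  if (!content || content.trim() === '') throw new Error('Content cannot be empty');
  if (content.length > 10000) throw new Error('Content exceeds 10KB limit');

  // ISO 8601 expiration -> Unix seconds, matching the schema's strftime('%s') default
  const expiresAt = expires ? Math.floor(new Date(expires).getTime() / 1000) : null;

  const tx = db.transaction(() => {
    const { lastInsertRowid: id } = db
      .prepare('INSERT INTO memories (content, expires_at, entered_by) VALUES (?, ?, ?)')
      .run(content, expiresAt, entered_by ?? null);

    const names = (tags ?? '').split(',').map(t => t.trim().toLowerCase()).filter(Boolean);
    for (const name of names) {
      db.prepare('INSERT OR IGNORE INTO tags (name) VALUES (?)').run(name);
      const { id: tagId } = db.prepare('SELECT id FROM tags WHERE name = ?').get(name);
      db.prepare('INSERT INTO memory_tags (memory_id, tag_id) VALUES (?, ?)').run(id, tagId);
    }
    return { id };
  });
  return tx();
}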
Features checklist:
- [ ] Test written first and failing
- [ ] Content validation (length, non-empty)
- [ ] Tag parsing and normalization (lowercase)
- [ ] Expiration date parsing (ISO 8601)
- [ ] Atomic transaction (memory + tags)
- [ ] Test passes
Checkpoint: npm test passes for store command
Step 1.4: Core Command - Search (LIKE)
Effort: 3 hours
Status: Not Started
TDD Workflow:
1. Write integration test first with realistic data
2. Run and watch it fail
3. Implement search - make test pass
4. Verify manually with CLI
Files to create:
- Add tests to test/integration.test.js (TEST FIRST)
- src/commands/search.js
- src/search/like.js
- src/utils/formatting.js
Test First:
// test/integration.test.js (add to existing file)
describe('Search Command', () => {
  let db;
  
  beforeEach(() => {
    db = new Database(':memory:');
    initSchema(db);
    
    // Seed with realistic data
    storeMemory(db, { content: 'Docker uses bridge networks by default', tags: 'docker,networking' });
    storeMemory(db, { content: 'Kubernetes pods share network namespace', tags: 'kubernetes,networking' });
    storeMemory(db, { content: 'PostgreSQL requires explicit vacuum', tags: 'postgresql,database' });
  });
  
  test('finds memories by content', () => {
    const results = searchMemories(db, 'docker');
    
    expect(results).toHaveLength(1);
    expect(results[0].content).toContain('Docker');
  });
  
  test('filters by tags (AND logic)', () => {
    const results = searchMemories(db, 'network', { tags: ['networking'] });
    
    expect(results).toHaveLength(2);
    expect(results.map(r => r.content)).toContain('Docker uses bridge networks by default');
    expect(results.map(r => r.content)).toContain('Kubernetes pods share network namespace');
  });
  
  test('excludes expired memories automatically', () => {
    // Insert directly with backdated timestamps (Unix seconds): the schema
    // CHECK(expires_at > created_at) rejects already-expired rows coming
    // through storeMemory, and Date.now() is milliseconds, not seconds.
    const now = Math.floor(Date.now() / 1000);
    db.prepare(
      'INSERT INTO memories (content, created_at, expires_at) VALUES (?, ?, ?)'
    ).run('Expired memory', now - 2 * 86400, now - 86400);
    
    const results = searchMemories(db, 'expired');
    
    expect(results).toHaveLength(0);
  });
  
  test('respects limit option', () => {
    // Add 20 memories
    for (let i = 0; i < 20; i++) {
      storeMemory(db, { content: `Memory ${i}`, tags: 'test' });
    }
    
    const results = searchMemories(db, 'Memory', { limit: 5 });
    
    expect(results).toHaveLength(5);
  });
});
Then implement to make tests pass.
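A sketch that satisfies the tests above, assuming Unix-second timestamps; the after/before and entered_by filters from the checklist would extend the WHERE clause the same way:
// src/search/like.js
export function searchMemories(db, query, { tags = [], limit = 10 } = {}) {
  const params = [`%${query}%`];
  let sql = `
    SELECT m.* FROM memories m
    WHERE m.content LIKE ?
      AND (m.expires_at IS NULL OR m.expires_at > strftime('%s', 'now'))`;

  // AND logic: every requested tag must be attached to the memory
  for (const tag of tags) {
    sql += `
      AND m.id IN (SELECT mt.memory_id FROM memory_tags mt
                   JOIN tags t ON t.id = mt.tag_id WHERE t.name = ?)`;
    params.push(tag.toLowerCase());
  }

  sql += ' ORDER BY m.created_at DESC LIMIT ?';
  params.push(limit);
  return db.prepare(sql).all(...params);
}
LIKE is case-insensitive for ASCII by default in SQLite, which is what the "finds memories by content" test relies on.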
Features checklist:
- [ ] Tests written and failing
- [ ] Case-insensitive LIKE search
- [ ] Tag filtering (AND logic)
- [ ] Date filtering (after/before)
- [ ] Agent filtering (entered_by)
- [ ] Automatic expiration filtering
- [ ] Result limit
- [ ] Tests pass
Checkpoint: npm test passes for search, manual CLI test works
Step 1.5: Core Command - List
Effort: 1 hour
Status: Not Started
Files to create:
- src/commands/list.js
Implementation:
// Pseudo-code
export async function listCommand(options) {
  // 1. Query memories with filters
  // 2. Order by created_at DESC (or custom sort)
  // 3. Apply limit/offset
  // 4. Format and display
}
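A sketch of the underlying query; the sort column is mapped through a whitelist rather than interpolated directly, since SQL identifiers cannot be bound as parameters:
// src/commands/list.js
const SORTABLE = { created: 'created_at', expires: 'expires_at', content: 'content' };

export function listMemories(db, { sort = 'created', order = 'desc', limit = 10, offset = 0 } = {}) {
  const column = SORTABLE[sort] ?? 'created_at';
  const direction = order.toLowerCase() === 'asc' ? 'ASC' : 'DESC';
  return db
    .prepare(`SELECT * FROM memories ORDER BY ${column} ${direction} LIMIT ? OFFSET ?`)
    .all(limit, offset);
}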
Features:
- Sort options (created, expires, content)
- Order direction (asc/desc)
- Tag filtering
- Pagination (limit/offset)
- Display with tags
Checkpoint:
memory list --limit 5
# Should show 5 most recent memories
Step 1.6: Core Command - Prune
Effort: 1.5 hours
Status: Not Started
Files to create:
- src/commands/prune.js
Implementation:
// Pseudo-code
export async function pruneCommand(options) {
  // 1. Find expired memories
  // 2. If --dry-run, show what would be deleted
  // 3. Else, prompt for confirmation (unless --force)
  // 4. Delete expired memories
  // 5. Show count of deleted memories
}
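The core deletion logic might look like the following; the confirmation prompt and flag parsing are assumed to live in the CLI layer:
// src/commands/prune.js
export function pruneExpired(db, { dryRun = false } = {}) {
  const now = Math.floor(Date.now() / 1000); // Unix seconds, matching the schema
  const expired = db
    .prepare('SELECT id, content FROM memories WHERE expires_at IS NOT NULL AND expires_at <= ?')
    .all(now);

  if (!dryRun && expired.length > 0) {
    db.prepare('DELETE FROM memories WHERE expires_at IS NOT NULL AND expires_at <= ?').run(now);
  }
  return expired; // caller prints the listing (--dry-run) or the deleted count
}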
Features:
- Find expired memories (expires_at <= now)
- --dry-run flag (show without deleting)
- --force flag (skip confirmation)
- Confirmation prompt
- Report deleted count
Checkpoint:
memory store "Temp" --expires "2020-01-01"
memory prune --dry-run
# Should show the expired memory
memory prune --force
# Should delete it
Step 1.7: CLI Integration
Effort: 2 hours
Status: Not Started
Files to create:
- src/cli.js
- bin/memory
Implementation:
// src/cli.js
import { Command } from 'commander';
const program = new Command();
program
  .name('memory')
  .description('AI Agent Memory System')
  .version('1.0.0');
program
  .command('store <content>')
  .description('Store a new memory')
  .option('-t, --tags <tags>', 'Comma-separated tags')
  .option('-e, --expires <date>', 'Expiration date')
  .option('--by <agent>', 'Agent name')
  .action(storeCommand);
program
  .command('search <query>')
  .description('Search memories')
  .option('-t, --tags <tags>', 'Filter by tags')
  .option('--after <date>', 'Created after')
  .option('--before <date>', 'Created before')
  .option('--entered-by <agent>', 'Filter by agent')
  .option('-l, --limit <n>', 'Max results', '10')
  .action(searchCommand);
// ... other commands
program.parse();
Features:
- All commands registered
- Global options (--db, --verbose, --json)
- Help text for all commands
- Error handling (see wrapper sketch below)
- Exit codes (0=success, 1=error)
Checkpoint:
memory --help
# Should show all commands
memory store --help
# Should show store options
Step 1.8: Testing & Polish
Effort: 2 hours
Status: Not Started
Note: Integration tests written first for each feature (TDD approach).
This step is for final polish and comprehensive scenarios.
Files to enhance:
- test/integration.test.js (should already have tests from Steps 1.3-1.6)
- test/helpers/seed.js - Realistic data generation
- test/fixtures/realistic-memories.js - Memory templates
Comprehensive test scenarios:
- [ ] Full workflow: store → search → list → prune
- [ ] Performance: 100 memories, search <50ms
- [ ] Edge cases: empty query, no results, expired memories
- [ ] Data validation: content length, invalid dates, malformed tags
- [ ] Tag normalization: uppercase → lowercase, duplicates
- [ ] Expiration: auto-filter in search, prune removes correctly
Checkpoint: All tests pass with npm test, >80% coverage (mostly integration)
Phase 1 Completion Criteria
- [ ] All checkpoints passed
- [ ] Can store memories with tags and expiration
- [ ] Can search with basic LIKE matching
- [ ] Can list recent memories
- [ ] Can prune expired memories
- [ ] Help text comprehensive
- [ ] Tests passing (>80% coverage)
- [ ] Database file created at ~/.config/opencode/memories.db
Validation test:
# Full workflow test
memory store "Docker Compose uses bridge networks by default" --tags docker,networking
memory store "Kubernetes pods share network namespace" --tags kubernetes,networking
memory search "networking" --tags docker
# Should return only Docker memory
memory list --limit 10
# Should show both memories
memory stats
# Should show 2 memories, 2 unique tags
Phase 2: FTS5 Migration
Goal: Production-grade search with FTS5
Status: Not Started
Trigger to Start: Dataset > 500 memories OR query latency > 500ms OR manual request
Step 2.1: Migration Script
Effort: 2 hours
Status: Not Started
Files to create:
- src/db/migrations.js
- src/db/migrations/002_fts5.js
Implementation:
export async function migrateToFTS5(db) {
  console.log('Migrating to FTS5...');
  
  // 1. Check if already migrated
  const version = db.prepare('SELECT value FROM metadata WHERE key = ?').get('schema_version');
  if (version.value >= 2) {
    console.log('Already on FTS5');
    return;
  }
  
  // 2. Create FTS5 table
  db.exec(`CREATE VIRTUAL TABLE memories_fts USING fts5(...)`);
  
  // 3. Populate from existing memories
  db.exec(`INSERT INTO memories_fts(rowid, content) SELECT id, content FROM memories`);
  
  // 4. Create triggers
  db.exec(`CREATE TRIGGER memories_ai AFTER INSERT...`);
  db.exec(`CREATE TRIGGER memories_ad AFTER DELETE...`);
  db.exec(`CREATE TRIGGER memories_au AFTER UPDATE...`);
  
  // 5. Update schema version
  db.prepare('UPDATE metadata SET value = ? WHERE key = ?').run('2', 'schema_version');
  
  console.log('Migration complete!');
}
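The elided DDL in steps 2 and 4 would follow SQLite's documented external-content pattern, roughly:
// Expanded steps 2 and 4 - external-content FTS5 table plus sync triggers
db.exec(`
  CREATE VIRTUAL TABLE memories_fts USING fts5(
    content,
    content='memories',
    content_rowid='id'
  );

  CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memories_fts(rowid, content) VALUES (new.id, new.content);
  END;

  CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, content) VALUES ('delete', old.id, old.content);
  END;

  CREATE TRIGGER memories_au AFTER UPDATE OF content ON memories BEGIN
    INSERT INTO memories_fts(memories_fts, rowid, content) VALUES ('delete', old.id, old.content);
    INSERT INTO memories_fts(rowid, content) VALUES (new.id, new.content);
  END;
`);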
Checkpoint: Run migration on test DB, verify FTS5 table exists and is populated
Step 2.2: FTS5 Search Implementation
Effort: 3 hours
Status: Not Started
Files to create:
- src/search/fts.js
Features:
- FTS5 MATCH query builder (see query sketch below)
- Support boolean operators (AND/OR/NOT)
- Phrase queries ("exact phrase")
- Prefix matching (docker*)
- BM25 relevance ranking
- Combined with metadata filters
Checkpoint: FTS5 search returns results ranked by relevance
Step 2.3: CLI Command - Migrate
Effort: 1 hour
Status: Not Started
Files to create:
- src/commands/migrate.js
Implementation:
memory migrate fts5
# Prompts for confirmation, runs migration
Checkpoint: Command successfully migrates Phase 1 DB to Phase 2
Phase 3: Fuzzy Layer
Goal: Handle typos and inexact matches
Status: Not Started
Trigger to Start: Manual request or need for fuzzy matching
Step 3.1: Trigram Infrastructure
Effort: 3 hours
Status: Not Started
Files to create:
- src/db/migrations/003_trigrams.js
- src/search/fuzzy.js
Features:
- Trigram table creation
- Trigram extraction function (see sketch below)
- Populate trigrams from existing memories
- Trigger to maintain trigrams on insert/update
Step 3.2: Fuzzy Search Implementation
Effort: 4 hours
Status: Not Started
Features:
- Trigram similarity calculation (see sketch below)
- Levenshtein distance implementation
- Combined relevance scoring
- Cascade logic (exact → fuzzy)
- Configurable threshold
Step 3.3: CLI Integration
Effort: 2 hours
Status: Not Started
Features:
- --fuzzy flag for search command
- --threshold option
- Auto-fuzzy when <5 results
Additional Features (Post-MVP)
Stats Command
Effort: 2 hours
Status: Not Started
memory stats
# Total memories: 1,234
# Total tags: 56
# Database size: 2.3 MB
# Most used tags: docker (123), kubernetes (89), nodejs (67)
memory stats --tags
# docker: 123
# kubernetes: 89
# nodejs: 67
# ...
memory stats --agents
# investigate-agent: 456
# optimize-agent: 234
# manual: 544
Export/Import Commands
Effort: 3 hours
Status: Not Started
memory export memories.json
# Exported 1,234 memories to memories.json
memory import memories.json
# Imported 1,234 memories
Agent Context Documentation
Effort: 3 hours
Status: Not Started
Files to create:
- docs/AGENT_GUIDE.md
- src/commands/agent-context.js
memory --agent-context
# Displays comprehensive guide for AI agents
Auto-Extraction (Remember Pattern)
Effort: 4 hours
Status: Not Started
Files to create:
- src/extractors/remember.js
Features:
- Regex pattern to detect *Remember*: [fact] (see sketch below)
- Auto-extract tags from content
- Auto-detect expiration dates
- Store extracted memories
- Report extraction results
OpenCode Plugin Integration
Effort: 3 hours
Status: Not Started
Files to create:
- plugin.js (root level for OpenCode)
Features:
- Plugin registration
- API exposure (store, search, extractRemember)
- Lifecycle hooks (onInstall, onUninstall)
- Command registration
Testing Strategy
TDD Philosophy: Integration-First Approach
Core Principles:
- Integration tests are primary - test real workflows end-to-end
- Unit tests are rare - only for complex algorithms (fuzzy matching, trigrams, Levenshtein)
- Test with real data - use SQLite :memory: or temp files with realistic scenarios
- Watch-driven development - run tests in watch mode, see failures, implement, see success
Testing Workflow:
# 1. Write integration test first (it will fail)
npm run test:watch
# 2. Run program manually to see behavior
node src/cli.js store "test"
# 3. Implement feature
# 4. Watch tests pass
# 5. Refine based on output
Integration Tests (Primary)
Coverage target: All major workflows
Test approach:
- Use real SQLite database (:memory: for speed, temp file for persistence tests)
- Simulate realistic data (10-100 memories per test)
- Test actual CLI commands via Node API (see the cli() helper sketch below)
- Verify end-to-end behavior, not internal implementation
Test scenarios:
// test/integration.test.js
describe('Memory System Integration', () => {
  test('store and retrieve workflow', async () => {
    // Store memory
    await cli(['store', 'Docker uses bridge networks', '--tags', 'docker,networking']);
    
    // Search for it
    const results = await cli(['search', 'docker']);
    
    // Verify output
    expect(results).toContain('Docker uses bridge networks');
    expect(results).toContain('docker');
    expect(results).toContain('networking');
  });
  
  test('realistic dataset search performance', async () => {
    // Insert 100 realistic memories
    for (let i = 0; i < 100; i++) {
      await storeMemory(generateRealisticMemory());
    }
    
    // Search should be fast
    const start = Date.now();
    await cli(['search', 'docker']);
    const duration = Date.now() - start;
    
    expect(duration).toBeLessThan(50); // Phase 1 target
  });
});
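These scenarios assume a small cli() helper that runs the real entry point and captures stdout; one way to write it (the LLMEMORY_DB variable is a hypothetical mechanism for pointing the CLI at a test database):
// test/helpers/cli.js
import { execFile } from 'child_process';
import { promisify } from 'util';

const execFileAsync = promisify(execFile);

export async function cli(args, { dbPath } = {}) {
  const env = dbPath ? { ...process.env, LLMEMORY_DB: dbPath } : process.env;
  const { stdout } = await execFileAsync('node', ['src/cli.js', ...args], { env });
  return stdout;
}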
Test data generation:
// test/fixtures/realistic-memories.js
export function generateRealisticMemory() {
  const templates = [
    { content: 'Docker Compose requires explicit subnet config when using multiple networks', tags: ['docker', 'networking'] },
    { content: 'PostgreSQL VACUUM FULL locks tables, use ANALYZE instead', tags: ['postgresql', 'performance'] },
    { content: 'Git worktree allows parallel branches without stashing', tags: ['git', 'workflow'] },
    // ... 50+ realistic templates
  ];
  return randomChoice(templates);
}
Unit Tests (Rare - Only When Necessary)
When to write unit tests:
- Complex algorithms with edge cases (Levenshtein distance, trigram extraction)
- Pure functions with clear inputs/outputs
- Critical validation logic
When NOT to write unit tests:
- Database queries (covered by integration tests)
- CLI parsing (covered by integration tests)
- Simple utilities (tag parsing, date formatting)
Example unit test (justified):
// test/unit/fuzzy.test.js - Complex algorithm worth unit testing
describe('Levenshtein distance', () => {
  test('calculates edit distance correctly', () => {
    expect(levenshtein('docker', 'dcoker')).toBe(2);
    expect(levenshtein('kubernetes', 'kuberntes')).toBe(1); // one deleted 'e'
    expect(levenshtein('same', 'same')).toBe(0);
  });
  
  test('handles edge cases', () => {
    expect(levenshtein('', 'hello')).toBe(5);
    expect(levenshtein('a', '')).toBe(1);
  });
});
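The implementation under test would be the classic dynamic-programming algorithm; a rolling-row sketch:
// src/search/levenshtein.js
export function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // deletion
        curr[j - 1] + 1,   // insertion
        prev[j - 1] + cost // substitution
      );
    }
    prev = curr;
  }
  return prev[b.length];
}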
Test Data Management
For integration tests:
// Use :memory: database for fast, isolated tests
beforeEach(() => {
  db = new Database(':memory:');
  initSchema(db);
});
// Or use temp file for persistence testing
import { mkdtempSync } from 'fs';
import { join } from 'path';
import { tmpdir } from 'os';
beforeEach(() => {
  const tempDir = mkdtempSync(join(tmpdir(), 'llmemory-test-'));
  dbPath = join(tempDir, 'test.db');
  db = new Database(dbPath);
  initSchema(db);
});
afterEach(() => {
  db.close();
  // Cleanup temp files
});
Realistic data seeding:
// test/helpers/seed.js
export async function seedDatabase(db, count = 50) {
  const memories = [];
  
  for (let i = 0; i < count; i++) {
    memories.push({
      content: generateRealisticMemory().content, // templates are { content, tags } objects
      tags: generateRealisticTags(),
      entered_by: randomChoice(['investigate-agent', 'optimize-agent', 'manual']),
      created_at: Math.floor(Date.now() / 1000) - randomInt(0, 90 * 86400) // Unix seconds, within 90 days
    });
  }
  
  // Bulk insert
  const insert = db.transaction((memories) => {
    for (const memory of memories) {
      storeMemory(db, memory);
    }
  });
  
  insert(memories);
  return memories;
}
Performance Tests
Run after each phase:
// Benchmark search latency
test('Phase 1 search <50ms for 500 memories', async () => {
  await seedDatabase(db, 500); // helper from test/helpers/seed.js
  const start = Date.now();
  const results = await search('test query');
  const duration = Date.now() - start;
  expect(duration).toBeLessThan(50);
});
test('Phase 2 search <100ms for 10K memories', async () => {
  await seedDatabase(db, 10000);
  const start = Date.now();
  const results = await search('test query');
  const duration = Date.now() - start;
  expect(duration).toBeLessThan(100);
});
Documentation Roadmap
Phase 1 Docs
- README.md - Quick start, installation, basic usage
- CLI_REFERENCE.md - All commands and options
- ARCHITECTURE.md - System design, schema, algorithms
Phase 2 Docs
- AGENT_GUIDE.md - Comprehensive guide for AI agents
- MIGRATION_GUIDE.md - Phase 1 → 2 → 3 instructions
- QUERY_SYNTAX.md - FTS5 query patterns
Phase 3 Docs
- API.md - Programmatic API for plugins
- CONTRIBUTING.md - Development setup, testing
- TROUBLESHOOTING.md - Common issues and solutions
Success Metrics
Phase 1 (MVP)
- [ ] Can store/retrieve memories
- [ ] Search works for exact matches
- [ ] Performance: <50ms for 500 memories
- [ ] Test coverage: >80%
- [ ] No critical bugs
Phase 2 (FTS5)
- [ ] Migration completes without data loss
- [ ] Search quality improved (relevance ranking)
- [ ] Performance: <100ms for 10K memories
- [ ] Boolean operators work correctly
Phase 3 (Fuzzy)
- [ ] Typos correctly matched (edit distance ≤2)
- [ ] Fuzzy cascade improves result count
- [ ] Performance: <200ms for 10K memories
- [ ] No false positives (threshold tuned)
Overall
- [ ] Agents use system regularly in workflows
- [ ] Search results are high-quality (relevant)
- [ ] Token-efficient (limited, ranked results)
- [ ] No performance complaints
- [ ] Documentation comprehensive
Development Workflow
Daily Checklist
- [ ] Pull latest changes
- [ ] Run tests: npm test
- [ ] Work on current step
- [ ] Write/update tests
- [ ] Update this document (mark checkboxes)
- [ ] Commit with clear message
- [ ] Update CHANGELOG.md
Before Phase Completion
- [ ] All checkpoints passed
- [ ] Tests passing (>80% coverage)
- [ ] Documentation updated
- [ ] Performance benchmarks run
- [ ] Manual testing completed
- [ ] Changelog updated
Commit Message Format
<type>(<scope>): <subject>
Examples:
feat(search): implement FTS5 search with BM25 ranking
fix(store): validate content length before insertion
docs(readme): add installation instructions
test(search): add integration tests for filters
refactor(db): extract connection logic to separate file
Troubleshooting
Common Issues
Issue: SQLite FTS5 not available
Solution: Ensure SQLite version ≥3.35, check better-sqlite3 includes FTS5
Issue: Database locked errors
Solution: Enable WAL mode: PRAGMA journal_mode = WAL
Issue: Slow searches with large dataset
Solution: Check indexes exist, run ANALYZE, consider migration to next phase
Issue: Tag filtering not working
Solution: Verify tag normalization (lowercase), check many-to-many joins
Next Session Continuation
For the next developer/AI agent:
1. Check Current Phase: Review checkboxes in this file to see progress
2. Run Tests: npm test to verify current state
3. Check Database: sqlite3 ~/.config/opencode/memories.db .schema to see current schema version
4. Review SPECIFICATION.md: Understand overall architecture
5. Pick Next Step: Find first unchecked item in current phase
6. Update This File: Mark completed checkboxes as you go
Quick Start Commands:
cd llmemory
npm install              # Install dependencies
npm test                 # Run test suite
npm run start -- --help  # Test CLI
Current Status: Phase 0 complete (planning/documentation), ready to begin Phase 1 implementation.
Estimated Time to MVP: 12-15 hours of focused development.
Resources
- SQLite FTS5: https://www.sqlite.org/fts5.html
- better-sqlite3: https://github.com/WiseLibs/better-sqlite3
- Commander.js: https://github.com/tj/commander.js
- Vitest: https://vitest.dev/
Changelog
2025-10-29 - Phase 0 Complete
- Project structure defined
- Comprehensive specification written
- Implementation plan created
- Agent investigation reports integrated
- Ready for Phase 1 development