LLMemory Implementation Plan
Current Status: Phase 0 - Planning Complete
This document tracks implementation progress and provides step-by-step guidance for building LLMemory.
Phase 1: MVP (Simple LIKE Search)
Goal: Working CLI tool with basic search in 2-3 days
Status: Not Started
Trigger to Complete: All checkpoints passed, can store/search memories
Step 1.1: Project Setup
Effort: 30 minutes
Status: Not Started
cd llmemory
npm init -y
npm install better-sqlite3 commander chalk date-fns
npm install -D vitest typescript @types/node @types/better-sqlite3
Deliverables:
- package.json configured with dependencies
- TypeScript configured (optional but recommended)
- Git initialized with .gitignore
- bin/memory executable created
Checkpoint: Run npm list - all dependencies installed
Step 1.2: Database Layer - Schema & Connection
Effort: 2 hours
Status: Not Started
Files to create:
- src/db/connection.js - Database connection and initialization
- src/db/schema.js - Phase 1 schema (memories, tags, memory_tags)
- src/db/queries.js - Prepared statements
Schema (Phase 1):
CREATE TABLE memories (
id INTEGER PRIMARY KEY AUTOINCREMENT,
content TEXT NOT NULL CHECK(length(content) <= 10000),
created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now')),
entered_by TEXT,
expires_at INTEGER,
CHECK(expires_at IS NULL OR expires_at > created_at)
);
CREATE TABLE tags (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL UNIQUE COLLATE NOCASE,
created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now'))
);
CREATE TABLE memory_tags (
memory_id INTEGER NOT NULL,
tag_id INTEGER NOT NULL,
PRIMARY KEY (memory_id, tag_id),
FOREIGN KEY (memory_id) REFERENCES memories(id) ON DELETE CASCADE,
FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);
CREATE TABLE metadata (
key TEXT PRIMARY KEY,
value TEXT NOT NULL
);
CREATE INDEX idx_memories_created ON memories(created_at DESC);
CREATE INDEX idx_memories_expires ON memories(expires_at) WHERE expires_at IS NOT NULL;
CREATE INDEX idx_tags_name ON tags(name);
CREATE INDEX idx_memory_tags_tag ON memory_tags(tag_id);
Implementation checklist:
- Database connection with WAL mode enabled
- Schema creation on first run
- Metadata table initialized (schema_version: 1)
- Prepared statements for common operations
- Transaction helpers
Checkpoint: Run test insertion and query - works without errors
Step 1.3: Core Command - Store
Effort: 2 hours
Status: Not Started
TDD Workflow:
- Write test first (see test structure below)
- Run test - watch it fail
- Implement feature - make test pass
- Refine - improve based on test output
Files to create:
- test/integration.test.js (TEST FIRST)
- src/commands/store.js
- src/utils/validation.js
- src/utils/tags.js
Test First (write this before implementation):
// test/integration.test.js
import { describe, test, expect, beforeEach } from 'vitest';
import Database from 'better-sqlite3';
import { storeMemory } from '../src/commands/store.js';
describe('Store Command', () => {
let db;
beforeEach(() => {
db = new Database(':memory:');
// Init schema
initSchema(db);
});
test('stores memory with tags', () => {
const result = storeMemory(db, {
content: 'Docker uses bridge networks by default',
tags: 'docker,networking',
entered_by: 'test'
});
expect(result.id).toBeDefined();
// Verify in database
const memory = db.prepare('SELECT * FROM memories WHERE id = ?').get(result.id);
expect(memory.content).toBe('Docker uses bridge networks by default');
// Verify tags
const tags = db.prepare(`
SELECT t.name FROM tags t
JOIN memory_tags mt ON t.id = mt.tag_id
WHERE mt.memory_id = ? ORDER BY t.name
`).all(result.id);
expect(tags.map(t => t.name)).toEqual(['docker', 'networking']);
});
test('rejects content over 10KB', () => {
const longContent = 'x'.repeat(10001);
expect(() => {
storeMemory(db, { content: longContent });
}).toThrow('Content exceeds 10KB limit');
});
test('normalizes tags to lowercase', () => {
storeMemory(db, { content: 'test', tags: 'Docker,NETWORKING' });
const tags = db.prepare('SELECT name FROM tags ORDER BY name').all();
expect(tags).toEqual([
{ name: 'docker' },
{ name: 'networking' }
]);
});
});
Then implement (after test fails):
// src/commands/store.js
export function storeMemory(db, { content, tags, expires, entered_by }) {
// Implementation goes here
// Make the test pass!
}
Features checklist:
- Test written first and failing
- Content validation (length, non-empty)
- Tag parsing and normalization (lowercase)
- Expiration date parsing (ISO 8601)
- Atomic transaction (memory + tags)
- Test passes
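The tag and expiration parsing items in the checklist could be sketched as below. The file lives at src/utils/tags.js per the file list above; the exact function names and behavior are assumptions, not a finalized API.

```javascript
// src/utils/tags.js (sketch) - helpers assumed by the store command

// Split a comma-separated tag string, trim, lowercase, and de-duplicate,
// preserving first-seen order.
export function parseTags(raw) {
  if (!raw) return [];
  const seen = new Set();
  for (const part of raw.split(',')) {
    const tag = part.trim().toLowerCase();
    if (tag) seen.add(tag);
  }
  return [...seen];
}

// Parse an ISO 8601 date string into unix seconds, since the schema
// stores created_at/expires_at as integer epoch seconds.
export function parseExpires(raw) {
  const ms = Date.parse(raw);
  if (Number.isNaN(ms)) throw new Error(`Invalid expiration date: ${raw}`);
  return Math.floor(ms / 1000);
}
```

Note the division by 1000: mixing Date.now() milliseconds with the schema's epoch seconds is an easy bug to introduce here.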
Checkpoint: npm test passes for store command
Step 1.4: Core Command - Search (LIKE)
Effort: 3 hours
Status: Not Started
TDD Workflow:
- Write integration test first with realistic data
- Run and watch it fail
- Implement search - make test pass
- Verify manually with CLI
Files to create:
- Add tests to test/integration.test.js (TEST FIRST)
- src/commands/search.js
- src/search/like.js
- src/utils/formatting.js
Test First:
// test/integration.test.js (add to existing file)
describe('Search Command', () => {
let db;
beforeEach(() => {
db = new Database(':memory:');
initSchema(db);
// Seed with realistic data
storeMemory(db, { content: 'Docker uses bridge networks by default', tags: 'docker,networking' });
storeMemory(db, { content: 'Kubernetes pods share network namespace', tags: 'kubernetes,networking' });
storeMemory(db, { content: 'PostgreSQL requires explicit vacuum', tags: 'postgresql,database' });
});
test('finds memories by content', () => {
const results = searchMemories(db, 'docker');
expect(results).toHaveLength(1);
expect(results[0].content).toContain('Docker');
});
test('filters by tags (AND logic)', () => {
const results = searchMemories(db, 'network', { tags: ['networking'] });
expect(results).toHaveLength(2);
expect(results.map(r => r.content)).toContain('Docker uses bridge networks by default');
expect(results.map(r => r.content)).toContain('Kubernetes pods share network namespace');
});
test('excludes expired memories automatically', () => {
storeMemory(db, {
content: 'Expired memory',
tags: 'test',
expires_at: Math.floor(Date.now() / 1000) - 86400 // Yesterday, in unix seconds to match the schema
});
const results = searchMemories(db, 'expired');
expect(results).toHaveLength(0);
});
test('respects limit option', () => {
// Add 20 memories
for (let i = 0; i < 20; i++) {
storeMemory(db, { content: `Memory ${i}`, tags: 'test' });
}
const results = searchMemories(db, 'Memory', { limit: 5 });
expect(results).toHaveLength(5);
});
});
Then implement to make tests pass.
Features checklist:
- Tests written and failing
- Case-insensitive LIKE search
- Tag filtering (AND logic)
- Date filtering (after/before)
- Agent filtering (entered_by)
- Automatic expiration filtering
- Result limit
- Tests pass
Checkpoint: npm test passes for search, manual CLI test works
Step 1.5: Core Command - List
Effort: 1 hour
Status: Not Started
Files to create:
src/commands/list.js
Implementation:
// Pseudo-code
export async function listCommand(options) {
// 1. Query memories with filters
// 2. Order by created_at DESC (or custom sort)
// 3. Apply limit/offset
// 4. Format and display
}
Features:
- Sort options (created, expires, content)
- Order direction (asc/desc)
- Tag filtering
- Pagination (limit/offset)
- Display with tags
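A hypothetical sketch of the query builder behind the pseudo-code above; the option names mirror the features list, and the column names come from the Phase 1 schema, but buildListQuery itself is an assumed helper.

```javascript
// Map user-facing sort keys to schema columns; unknown keys fall back to created_at.
const SORT_COLUMNS = { created: 'created_at', expires: 'expires_at', content: 'content' };

export function buildListQuery({ sort = 'created', order = 'desc', tag, limit = 10, offset = 0 } = {}) {
  const column = SORT_COLUMNS[sort] ?? 'created_at';
  const direction = String(order).toLowerCase() === 'asc' ? 'ASC' : 'DESC';
  const params = [];
  let sql = 'SELECT m.* FROM memories m';
  if (tag) {
    sql += ' JOIN memory_tags mt ON mt.memory_id = m.id' +
           ' JOIN tags t ON t.id = mt.tag_id AND t.name = ?';
    params.push(tag.toLowerCase());
  }
  // Mirror search behavior: expired memories never appear in listings.
  sql += " WHERE (m.expires_at IS NULL OR m.expires_at > strftime('%s', 'now'))";
  sql += ` ORDER BY m.${column} ${direction} LIMIT ? OFFSET ?`;
  params.push(limit, offset);
  return { sql, params };
}
```

The command would then run `db.prepare(sql).all(...params)` and hand the rows to the formatter.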
Checkpoint:
memory list --limit 5
# Should show 5 most recent memories
Step 1.6: Core Command - Prune
Effort: 1.5 hours
Status: Not Started
Files to create:
src/commands/prune.js
Implementation:
// Pseudo-code
export async function pruneCommand(options) {
// 1. Find expired memories
// 2. If --dry-run, show what would be deleted
// 3. Else, prompt for confirmation (unless --force)
// 4. Delete expired memories
// 5. Show count of deleted memories
}
Features:
- Find expired memories (expires_at <= now)
- --dry-run flag (show without deleting)
- --force flag (skip confirmation)
- Confirmation prompt
- Report deleted count
Checkpoint:
memory store "Temp" --expires "2020-01-01"
memory prune --dry-run
# Should show the expired memory
memory prune --force
# Should delete it
Step 1.7: CLI Integration
Effort: 2 hours
Status: Not Started
Files to create:
- src/cli.js
- bin/memory
Implementation:
// src/cli.js
import { Command } from 'commander';
const program = new Command();
program
.name('memory')
.description('AI Agent Memory System')
.version('1.0.0');
program
.command('store <content>')
.description('Store a new memory')
.option('-t, --tags <tags>', 'Comma-separated tags')
.option('-e, --expires <date>', 'Expiration date')
.option('--by <agent>', 'Agent name')
.action(storeCommand);
program
.command('search <query>')
.description('Search memories')
.option('-t, --tags <tags>', 'Filter by tags')
.option('--after <date>', 'Created after')
.option('--before <date>', 'Created before')
.option('--entered-by <agent>', 'Filter by agent')
.option('-l, --limit <n>', 'Max results', '10')
.action(searchCommand);
// ... other commands
program.parse();
Features:
- All commands registered
- Global options (--db, --verbose, --json)
- Help text for all commands
- Error handling
- Exit codes (0=success, 1=error)
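The error-handling and exit-code items above could be centralized with a small wrapper around each commander action. This is a minimal sketch; wrapAction is an assumed name, not part of commander's API.

```javascript
// Wrap a command handler so any thrown error prints to stderr and the
// process exits 1, keeping the 0=success / 1=error convention in one place.
export function wrapAction(fn) {
  return async (...args) => {
    try {
      await fn(...args);
    } catch (err) {
      console.error(`Error: ${err.message}`);
      process.exitCode = 1; // set exitCode instead of process.exit() so streams flush
    }
  };
}
```

Each registration then becomes `.action(wrapAction(storeCommand))`, so individual commands can simply throw.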
Checkpoint:
memory --help
# Should show all commands
memory store --help
# Should show store options
Step 1.8: Testing & Polish
Effort: 2 hours
Status: Not Started
Note: Integration tests written first for each feature (TDD approach).
This step is for final polish and comprehensive scenarios.
Files to enhance:
- test/integration.test.js (should already have tests from Steps 1.3-1.6)
- test/helpers/seed.js - Realistic data generation
- test/fixtures/realistic-memories.js - Memory templates
Comprehensive test scenarios:
- Full workflow: store → search → list → prune
- Performance: 100 memories, search <50ms
- Edge cases: empty query, no results, expired memories
- Data validation: content length, invalid dates, malformed tags
- Tag normalization: uppercase → lowercase, duplicates
- Expiration: auto-filter in search, prune removes correctly
Checkpoint: All tests pass with npm test, >80% coverage (mostly integration)
Phase 1 Completion Criteria
- All checkpoints passed
- Can store memories with tags and expiration
- Can search with basic LIKE matching
- Can list recent memories
- Can prune expired memories
- Help text comprehensive
- Tests passing (>80% coverage)
- Database file created at ~/.config/opencode/memories.db
Validation test:
# Full workflow test
memory store "Docker Compose uses bridge networks by default" --tags docker,networking
memory store "Kubernetes pods share network namespace" --tags kubernetes,networking
memory search "networking" --tags docker
# Should return only Docker memory
memory list --limit 10
# Should show both memories
memory stats
# Should show 2 memories, 2 unique tags
Phase 2: FTS5 Migration
Goal: Production-grade search with FTS5
Status: Not Started
Trigger to Start: Dataset > 500 memories OR query latency > 500ms OR manual request
Step 2.1: Migration Script
Effort: 2 hours
Status: Not Started
Files to create:
- src/db/migrations.js
- src/db/migrations/002_fts5.js
Implementation:
export async function migrateToFTS5(db) {
console.log('Migrating to FTS5...');
// 1. Check if already migrated
const version = db.prepare('SELECT value FROM metadata WHERE key = ?').get('schema_version');
if (Number(version.value) >= 2) { // metadata.value is TEXT, so compare numerically
console.log('Already on FTS5');
return;
}
// 2. Create FTS5 table
db.exec(`CREATE VIRTUAL TABLE memories_fts USING fts5(...)`);
// 3. Populate from existing memories
db.exec(`INSERT INTO memories_fts(rowid, content) SELECT id, content FROM memories`);
// 4. Create triggers
db.exec(`CREATE TRIGGER memories_ai AFTER INSERT...`);
db.exec(`CREATE TRIGGER memories_ad AFTER DELETE...`);
db.exec(`CREATE TRIGGER memories_au AFTER UPDATE...`);
// 5. Update schema version
db.prepare('UPDATE metadata SET value = ? WHERE key = ?').run('2', 'schema_version');
console.log('Migration complete!');
}
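The CREATE VIRTUAL TABLE and trigger statements are elided in the sketch above. One plausible concrete form, following SQLite's standard external-content pattern so the FTS index references memories.content rather than duplicating it (the details here are an assumption, not the finalized DDL):

```sql
-- External-content FTS5 table indexing memories.content
CREATE VIRTUAL TABLE memories_fts USING fts5(
  content,
  content='memories',
  content_rowid='id'
);

-- Keep the index in sync with the base table
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
  INSERT INTO memories_fts(rowid, content) VALUES (new.id, new.content);
END;

CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
  INSERT INTO memories_fts(memories_fts, rowid, content)
  VALUES ('delete', old.id, old.content);
END;

CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
  INSERT INTO memories_fts(memories_fts, rowid, content)
  VALUES ('delete', old.id, old.content);
  INSERT INTO memories_fts(rowid, content) VALUES (new.id, new.content);
END;
```

With external content, deletes and updates must insert the special 'delete' command rows shown above; a plain DELETE against the FTS table would corrupt the index.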
Checkpoint: Run migration on test DB, verify FTS5 table exists and is populated
Step 2.2: FTS5 Search Implementation
Effort: 3 hours
Status: Not Started
Files to create:
src/search/fts.js
Features:
- FTS5 MATCH query builder
- Support boolean operators (AND/OR/NOT)
- Phrase queries ("exact phrase")
- Prefix matching (docker*)
- BM25 relevance ranking
- Combined with metadata filters
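Raw user input cannot be passed straight to MATCH, since FTS5 treats punctuation and bare keywords as syntax. A hedged sketch of the query-builder item above; the function name and exact quoting policy are assumptions.

```javascript
// Convert raw user input into a safe FTS5 MATCH expression: bare terms are
// double-quoted so punctuation can't break the query, a trailing * is kept
// for prefix matching, and explicit AND/OR/NOT pass through unchanged.
export function toMatchExpression(query) {
  return query
    .trim()
    .split(/\s+/)
    .map(term => {
      const prefix = term.endsWith('*');
      const bare = prefix ? term.slice(0, -1) : term;
      if (/^(AND|OR|NOT)$/.test(bare)) return bare;
      const quoted = `"${bare.replaceAll('"', '""')}"`; // FTS5 escapes " by doubling
      return prefix ? `${quoted}*` : quoted;
    })
    .join(' ');
}
```

A query such as `SELECT m.*, bm25(memories_fts) AS rank FROM memories_fts JOIN memories m ON m.id = memories_fts.rowid WHERE memories_fts MATCH ? ORDER BY rank LIMIT ?` can then accept `toMatchExpression(query)` safely; bm25() returns lower values for better matches, so ascending order is correct.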
Checkpoint: FTS5 search returns results ranked by relevance
Step 2.3: CLI Command - Migrate
Effort: 1 hour
Status: Not Started
Files to create:
src/commands/migrate.js
Implementation:
memory migrate fts5
# Prompts for confirmation, runs migration
Checkpoint: Command successfully migrates Phase 1 DB to Phase 2
Phase 3: Fuzzy Layer
Goal: Handle typos and inexact matches
Status: Not Started
Trigger to Start: Manual request or need for fuzzy matching
Step 3.1: Trigram Infrastructure
Effort: 3 hours
Status: Not Started
Files to create:
- src/db/migrations/003_trigrams.js
- src/search/fuzzy.js
Features:
- Trigram table creation
- Trigram extraction function
- Populate trigrams from existing memories
- Trigger to maintain trigrams on insert/update
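The extraction function in the list above might look like the following sketch, assuming src/search/fuzzy.js pads the text so word boundaries also produce trigrams; the padding scheme and similarity metric are assumptions.

```javascript
// Extract the set of 3-character substrings from normalized text.
// Leading/trailing spaces make word starts and ends distinguishable.
export function extractTrigrams(text) {
  const normalized = `  ${text.toLowerCase().trim()} `;
  const trigrams = new Set();
  for (let i = 0; i + 3 <= normalized.length; i++) {
    trigrams.add(normalized.slice(i, i + 3));
  }
  return trigrams;
}

// Jaccard similarity between two strings' trigram sets (1 = identical).
export function trigramSimilarity(a, b) {
  const ta = extractTrigrams(a);
  const tb = extractTrigrams(b);
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  return shared / (ta.size + tb.size - shared);
}
```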
Step 3.2: Fuzzy Search Implementation
Effort: 4 hours
Status: Not Started
Features:
- Trigram similarity calculation
- Levenshtein distance implementation
- Combined relevance scoring
- Cascade logic (exact → fuzzy)
- Configurable threshold
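The Levenshtein item above is the textbook two-row dynamic-programming algorithm, and (as noted in the testing strategy below) one of the few pieces worth a dedicated unit test. A self-contained sketch:

```javascript
// Edit distance between two strings using O(min-row) dynamic programming:
// prev[j] holds distances for the previous character of a.
export function levenshtein(a, b) {
  if (a === b) return 0;
  let prev = Array.from({ length: b.length + 1 }, (_, i) => i);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,      // deletion
        curr[j - 1] + 1,  // insertion
        prev[j - 1] + cost // substitution
      );
    }
    prev = curr;
  }
  return prev[b.length];
}
```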
Step 3.3: CLI Integration
Effort: 2 hours
Status: Not Started
Features:
- --fuzzy flag for search command
- --threshold option
- Auto-fuzzy when <5 results
Additional Features (Post-MVP)
Stats Command
Effort: 2 hours
Status: Not Started
memory stats
# Total memories: 1,234
# Total tags: 56
# Database size: 2.3 MB
# Most used tags: docker (123), kubernetes (89), nodejs (67)
memory stats --tags
# docker: 123
# kubernetes: 89
# nodejs: 67
# ...
memory stats --agents
# investigate-agent: 456
# optimize-agent: 234
# manual: 544
Export/Import Commands
Effort: 3 hours
Status: Not Started
memory export memories.json
# Exported 1,234 memories to memories.json
memory import memories.json
# Imported 1,234 memories
Agent Context Documentation
Effort: 3 hours
Status: Not Started
Files to create:
- docs/AGENT_GUIDE.md
- src/commands/agent-context.js
memory --agent-context
# Displays comprehensive guide for AI agents
Auto-Extraction (Remember Pattern)
Effort: 4 hours
Status: Not Started
Files to create:
src/extractors/remember.js
Features:
- Regex pattern to detect
*Remember*: [fact] - Auto-extract tags from content
- Auto-detect expiration dates
- Store extracted memories
- Report extraction results
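The detection step in the list above could start from a sketch like this; the exact pattern and return shape are assumptions to be refined against real agent output.

```javascript
// Match lines of the form "*Remember*: some fact" and capture the fact text.
// The g flag lets matchAll find every occurrence in a transcript.
const REMEMBER_RE = /\*Remember\*:\s*(.+)/g;

export function extractRemembered(text) {
  const facts = [];
  for (const match of text.matchAll(REMEMBER_RE)) {
    facts.push(match[1].trim());
  }
  return facts;
}
```

Each extracted fact would then flow through the same validation and tag-inference path as a manual `memory store`.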
OpenCode Plugin Integration
Effort: 3 hours
Status: Not Started
Files to create:
- plugin.js (root level for OpenCode)
Features:
- Plugin registration
- API exposure (store, search, extractRemember)
- Lifecycle hooks (onInstall, onUninstall)
- Command registration
Testing Strategy
TDD Philosophy: Integration-First Approach
Core Principles:
- Integration tests are primary - Test real workflows end-to-end
- Unit tests are rare - Only for complex algorithms (fuzzy matching, trigrams, Levenshtein)
- Test with real data - Use SQLite :memory: or temp files with realistic scenarios
- Watch-driven development - Run tests in watch mode, see failures, implement, see success
Testing Workflow:
# 1. Write integration test first (it will fail)
npm run test:watch
# 2. Run program manually to see behavior
node src/cli.js store "test"
# 3. Implement feature
# 4. Watch tests pass
# 5. Refine based on output
Integration Tests (Primary)
Coverage target: All major workflows
Test approach:
- Use real SQLite database (:memory: for speed, temp file for persistence tests)
- Simulate realistic data (10-100 memories per test)
- Test actual CLI commands via Node API
- Verify end-to-end behavior, not internal implementation
Test scenarios:
// test/integration.test.js
describe('Memory System Integration', () => {
test('store and retrieve workflow', async () => {
// Store memory
await cli(['store', 'Docker uses bridge networks', '--tags', 'docker,networking']);
// Search for it
const results = await cli(['search', 'docker']);
// Verify output
expect(results).toContain('Docker uses bridge networks');
expect(results).toContain('docker');
expect(results).toContain('networking');
});
test('realistic dataset search performance', async () => {
// Insert 100 realistic memories
for (let i = 0; i < 100; i++) {
await storeMemory(generateRealisticMemory());
}
// Search should be fast
const start = Date.now();
await cli(['search', 'docker']);
const duration = Date.now() - start;
expect(duration).toBeLessThan(50); // Phase 1 target
});
});
Test data generation:
// test/fixtures/realistic-memories.js
export function generateRealisticMemory() {
const templates = [
{ content: 'Docker Compose requires explicit subnet config when using multiple networks', tags: ['docker', 'networking'] },
{ content: 'PostgreSQL VACUUM FULL locks tables, use ANALYZE instead', tags: ['postgresql', 'performance'] },
{ content: 'Git worktree allows parallel branches without stashing', tags: ['git', 'workflow'] },
// ... 50+ realistic templates
];
return randomChoice(templates);
}
Unit Tests (Rare - Only When Necessary)
When to write unit tests:
- Complex algorithms with edge cases (Levenshtein distance, trigram extraction)
- Pure functions with clear inputs/outputs
- Critical validation logic
When NOT to write unit tests:
- Database queries (covered by integration tests)
- CLI parsing (covered by integration tests)
- Simple utilities (tag parsing, date formatting)
Example unit test (justified):
// test/unit/fuzzy.test.js - Complex algorithm worth unit testing
describe('Levenshtein distance', () => {
test('calculates edit distance correctly', () => {
expect(levenshtein('docker', 'dcoker')).toBe(2);
expect(levenshtein('kubernetes', 'kuberntes')).toBe(1); // single deleted 'e'
expect(levenshtein('same', 'same')).toBe(0);
});
test('handles edge cases', () => {
expect(levenshtein('', 'hello')).toBe(5);
expect(levenshtein('a', '')).toBe(1);
});
});
Test Data Management
For integration tests:
// Use :memory: database for fast, isolated tests
beforeEach(() => {
db = new Database(':memory:');
initSchema(db);
});
// Or use temp file for persistence testing
import { mkdtempSync } from 'fs';
import { join } from 'path';
import { tmpdir } from 'os';
beforeEach(() => {
const tempDir = mkdtempSync(join(tmpdir(), 'llmemory-test-'));
dbPath = join(tempDir, 'test.db');
db = new Database(dbPath);
initSchema(db);
});
afterEach(() => {
db.close();
// Cleanup temp files
});
Realistic data seeding:
// test/helpers/seed.js
export async function seedDatabase(db, count = 50) {
const memories = [];
for (let i = 0; i < count; i++) {
memories.push({
content: generateRealisticMemory(),
tags: generateRealisticTags(),
entered_by: randomChoice(['investigate-agent', 'optimize-agent', 'manual']),
created_at: Math.floor(Date.now() / 1000) - randomInt(0, 90 * 86400) // Random within 90 days, in unix seconds
});
}
// Bulk insert
const insert = db.transaction((memories) => {
for (const memory of memories) {
storeMemory(db, memory);
}
});
insert(memories);
return memories;
}
Performance Tests
Run after each phase:
// Benchmark search latency
test('Phase 1 search <50ms for 500 memories', async () => {
// Insert 500 test memories
const start = Date.now();
const results = await search('test query');
const duration = Date.now() - start;
expect(duration).toBeLessThan(50);
});
test('Phase 2 search <100ms for 10K memories', async () => {
// Insert 10K test memories
const start = Date.now();
const results = await search('test query');
const duration = Date.now() - start;
expect(duration).toBeLessThan(100);
});
Documentation Roadmap
Phase 1 Docs
- README.md - Quick start, installation, basic usage
- CLI_REFERENCE.md - All commands and options
- ARCHITECTURE.md - System design, schema, algorithms
Phase 2 Docs
- AGENT_GUIDE.md - Comprehensive guide for AI agents
- MIGRATION_GUIDE.md - Phase 1 → 2 → 3 instructions
- QUERY_SYNTAX.md - FTS5 query patterns
Phase 3 Docs
- API.md - Programmatic API for plugins
- CONTRIBUTING.md - Development setup, testing
- TROUBLESHOOTING.md - Common issues and solutions
Success Metrics
Phase 1 (MVP)
- Can store/retrieve memories
- Search works for exact matches
- Performance: <50ms for 500 memories
- Test coverage: >80%
- No critical bugs
Phase 2 (FTS5)
- Migration completes without data loss
- Search quality improved (relevance ranking)
- Performance: <100ms for 10K memories
- Boolean operators work correctly
Phase 3 (Fuzzy)
- Typos correctly matched (edit distance ≤2)
- Fuzzy cascade improves result count
- Performance: <200ms for 10K memories
- No false positives (threshold tuned)
Overall
- Agents use system regularly in workflows
- Search results are high-quality (relevant)
- Token-efficient (limited, ranked results)
- No performance complaints
- Documentation comprehensive
Development Workflow
Daily Checklist
- Pull latest changes
- Run tests: npm test
- Work on current step
- Write/update tests
- Update this document (mark checkboxes)
- Commit with clear message
- Update CHANGELOG.md
Before Phase Completion
- All checkpoints passed
- Tests passing (>80% coverage)
- Documentation updated
- Performance benchmarks run
- Manual testing completed
- Changelog updated
Commit Message Format
<type>(<scope>): <subject>
Examples:
feat(search): implement FTS5 search with BM25 ranking
fix(store): validate content length before insertion
docs(readme): add installation instructions
test(search): add integration tests for filters
refactor(db): extract connection logic to separate file
Troubleshooting
Common Issues
Issue: SQLite FTS5 not available
Solution: better-sqlite3 bundles its own SQLite with FTS5 enabled by default; if using a system SQLite, ensure it was compiled with FTS5 (available since 3.9)
Issue: Database locked errors
Solution: Enable WAL mode: PRAGMA journal_mode = WAL
Issue: Slow searches with large dataset
Solution: Check indexes exist, run ANALYZE, consider migration to next phase
Issue: Tag filtering not working
Solution: Verify tag normalization (lowercase), check many-to-many joins
Next Session Continuation
For the next developer/AI agent:
- Check Current Phase: Review checkboxes in this file to see progress
- Run Tests: npm test to verify current state
- Check Database: sqlite3 ~/.config/opencode/memories.db .schema to see current schema version
- Review SPECIFICATION.md: Understand overall architecture
- Pick Next Step: Find first unchecked item in current phase
- Update This File: Mark completed checkboxes as you go
Quick Start Commands:
cd llmemory
npm install # Install dependencies
npm test # Run test suite
npm run start -- --help # Test CLI
Current Status: Phase 0 complete (planning/documentation), ready to begin Phase 1 implementation.
Estimated Time to MVP: 12-15 hours of focused development.
Resources
- SQLite FTS5: https://www.sqlite.org/fts5.html
- better-sqlite3: https://github.com/WiseLibs/better-sqlite3
- Commander.js: https://github.com/tj/commander.js
- Vitest: https://vitest.dev/
Changelog
2025-10-29 - Phase 0 Complete
- Project structure defined
- Comprehensive specification written
- Implementation plan created
- Agent investigation reports integrated
- Ready for Phase 1 development