# LLMemory - AI Agent Memory System
## Overview
LLMemory is a persistent memory/journal system for AI agents, providing grep-like search with fuzzy matching for efficient knowledge retrieval across sessions.
## Core Requirements
### Storage
- Store memories with metadata: `created_at`, `entered_by`, `expires_at`, `tags`
- Local SQLite database (no cloud dependencies)
- Content limit: 10KB per memory
- Tag-based organization with normalized schema
### Retrieval
- Grep/ripgrep-like query syntax (familiar to AI agents)
- Fuzzy matching with configurable threshold
- Relevance ranking (BM25 + edit distance + recency)
- Metadata filtering (tags, dates, agent)
- Token-efficient: limit results, prioritize quality over quantity
### Interface
- Global CLI tool: `memory [command]`
- Commands: `store`, `search`, `list`, `prune`, `stats`, `export`, `import`
- `--agent-context` flag for comprehensive agent documentation
- Output formats: plain text, JSON, markdown
### Integration
- OpenCode plugin architecture
- Expose API for programmatic access
- Auto-extraction of `*Remember*` patterns from agent output
## Implementation Strategy
### Phase 1: MVP (Simple LIKE Search)
**Goal:** Ship in 2-3 days, validate concept with real usage
**Features:**
- Basic schema (memories, tags tables)
- Core commands (store, search, list, prune)
- Simple LIKE-based search with wildcards
- Plain text output
- Tag filtering
- Expiration handling
**Success Criteria:**
- Can store and retrieve memories
- Search works for exact/prefix matches
- Tags functional
- Performance acceptable for <500 memories
**Database:**
```sql
CREATE TABLE memories (
  id INTEGER PRIMARY KEY,
  content TEXT NOT NULL CHECK(length(content) <= 10000),
  created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now')),
  entered_by TEXT,
  expires_at INTEGER
);

CREATE TABLE tags (
  id INTEGER PRIMARY KEY,
  name TEXT UNIQUE COLLATE NOCASE
);

CREATE TABLE memory_tags (
  memory_id INTEGER,
  tag_id INTEGER,
  PRIMARY KEY (memory_id, tag_id),
  FOREIGN KEY (memory_id) REFERENCES memories(id) ON DELETE CASCADE,
  FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);
```
**Search Logic:**
```sql
-- Simple case-insensitive LIKE with wildcards
WHERE LOWER(content) LIKE LOWER('%' || ? || '%')
  AND (expires_at IS NULL OR expires_at > strftime('%s', 'now'))
ORDER BY created_at DESC
```
### Phase 2: FTS5 Migration
**Trigger:** Dataset > 500 memories OR query latency > 500ms
**Features:**
- Add FTS5 virtual table
- Migrate existing data
- Implement BM25 ranking
- Support boolean operators (AND/OR/NOT)
- Phrase queries with quotes
- Prefix matching with `*`
**Database Addition:**
```sql
CREATE VIRTUAL TABLE memories_fts USING fts5(
  content,
  content='memories',
  content_rowid='id',
  tokenize='porter unicode61 remove_diacritics 2'
);

-- Triggers to keep the index in sync
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
  INSERT INTO memories_fts(rowid, content) VALUES (new.id, new.content);
END;
-- ... (update/delete triggers)
```
**Search Logic:**
```sql
-- FTS5 match with BM25 ranking
SELECT m.*, mf.rank
FROM memories_fts mf
JOIN memories m ON m.id = mf.rowid
WHERE memories_fts MATCH ?
ORDER BY mf.rank;
```
### Phase 3: Fuzzy Layer
**Goal:** Handle typos and inexact matches
**Features:**
- Trigram indexing
- Levenshtein distance calculation
- Intelligent cascade: exact (FTS5) → fuzzy (trigram)
- Combined relevance scoring
- Configurable threshold (default: 0.7)
**Database Addition:**
```sql
CREATE TABLE trigrams (
  trigram TEXT NOT NULL,
  memory_id INTEGER NOT NULL,
  position INTEGER NOT NULL,
  FOREIGN KEY (memory_id) REFERENCES memories(id) ON DELETE CASCADE
);

CREATE INDEX idx_trigrams_trigram ON trigrams(trigram);
```
**Search Logic:**
```javascript
// 1. Try FTS5 exact match
let results = ftsSearch(query);

// 2. If <5 results, try fuzzy
if (results.length < 5) {
  const fuzzyResults = trigramSearch(query, threshold);
  results = mergeAndDedupe(results, fuzzyResults);
}

// 3. Re-rank by combined score
results.forEach(r => {
  r.score = 0.4 * bm25Score
          + 0.3 * trigramSimilarity
          + 0.2 * editDistanceScore
          + 0.1 * recencyScore;
});
```
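The recency component above is not pinned down by this spec. A minimal sketch, assuming an exponential half-life decay (the 30-day half-life is an illustrative choice, not a requirement):
```javascript
// Hypothetical recency score in [0, 1]: 1.0 for a brand-new memory,
// halving every `halfLifeDays` days. Assumes created_at is a unix timestamp.
function recencyScore(createdAt, halfLifeDays = 30) {
  const ageDays = (Date.now() / 1000 - createdAt) / 86400;
  return Math.pow(0.5, ageDays / halfLifeDays);
}
```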
## Architecture
### Technology Stack
- **Language:** Node.js (JavaScript/TypeScript)
- **Database:** SQLite with better-sqlite3
- **CLI Framework:** Commander.js
- **Output Formatting:** chalk (colors), marked-terminal (markdown)
- **Date Parsing:** date-fns
- **Testing:** Vitest
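As a concrete illustration of the stack, `src/db/connection.js` (listed in the directory tree below) might open the database roughly like this; the default path and pragma choices are assumptions, not requirements of this spec:
```javascript
// db/connection.js - minimal sketch using better-sqlite3
import Database from 'better-sqlite3';
import fs from 'node:fs';
import { homedir } from 'node:os';
import { join, dirname } from 'node:path';

export function initDatabase(dbPath = join(homedir(), '.local/share/llmemory/memory.db')) {
  fs.mkdirSync(dirname(dbPath), { recursive: true });  // ensure the data directory exists
  const db = new Database(dbPath);
  db.pragma('journal_mode = WAL');  // better read concurrency for a local CLI tool
  db.pragma('foreign_keys = ON');   // enforce ON DELETE CASCADE on memory_tags/trigrams
  return db;
}
```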
### Directory Structure
```
llmemory/
├── src/
│   ├── cli.js                # CLI entry point
│   ├── commands/
│   │   ├── store.js
│   │   ├── search.js
│   │   ├── list.js
│   │   ├── prune.js
│   │   ├── stats.js
│   │   └── export.js
│   ├── db/
│   │   ├── connection.js     # Database setup
│   │   ├── schema.js         # Schema definitions
│   │   ├── migrations.js     # Migration runner
│   │   └── queries.js        # Prepared statements
│   ├── search/
│   │   ├── like.js           # Phase 1: LIKE search
│   │   ├── fts.js            # Phase 2: FTS5 search
│   │   ├── fuzzy.js          # Phase 3: Fuzzy matching
│   │   └── ranking.js        # Relevance scoring
│   ├── utils/
│   │   ├── dates.js
│   │   ├── tags.js
│   │   ├── formatting.js
│   │   └── validation.js
│   └── extractors/
│       └── remember.js       # Auto-extract *Remember* patterns
├── test/
│   ├── search.test.js
│   ├── fuzzy.test.js
│   ├── integration.test.js
│   └── fixtures/
├── docs/
│   ├── ARCHITECTURE.md
│   ├── AGENT_GUIDE.md        # For --agent-context
│   ├── CLI_REFERENCE.md
│   └── API.md
├── bin/
│   └── memory                # Executable
├── package.json
├── SPECIFICATION.md          # This file
├── IMPLEMENTATION_PLAN.md
└── README.md
```
### CLI Interface
#### Commands
```bash
# Store a memory
memory store <content> [options]
  --tags <tag1,tag2>      Comma-separated tags
  --expires <date>        Expiration date (ISO 8601 or natural language)
  --entered-by <agent>    Agent/user identifier
  --file <path>           Read content from file

# Search memories
memory search <query> [options]
  --tags <tag1,tag2>      Filter by tags (AND)
  --any-tag <tag1,tag2>   Filter by tags (OR)
  --after <date>          Created after date
  --before <date>         Created before date
  --entered-by <agent>    Filter by creator
  --limit <n>             Max results (default: 10)
  --offset <n>            Pagination offset
  --fuzzy                 Enable fuzzy matching (default: auto)
  --no-fuzzy              Disable fuzzy matching
  --threshold <0-1>       Fuzzy match threshold (default: 0.7)
  --json                  JSON output
  --markdown              Markdown output

# List recent memories
memory list [options]
  --limit <n>             Max results (default: 20)
  --offset <n>            Pagination offset
  --tags <tags>           Filter by tags
  --sort <field>          Sort by: created, expires, content
  --order <asc|desc>      Sort order (default: desc)

# Prune expired memories
memory prune [options]
  --dry-run               Show what would be deleted
  --force                 Skip confirmation
  --before <date>         Delete before date (even if not expired)

# Show statistics
memory stats [options]
  --tags                  Show tag frequency
  --agents                Show memories per agent

# Export/import
memory export <file>      Export to JSON
memory import <file>      Import from JSON

# Global options
  --agent-context         Display agent documentation
  --db <path>             Custom database location
  --verbose               Detailed logging
  --quiet                 Suppress non-error output
```
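A sketch of how `src/cli.js` could wire a couple of these commands with Commander.js; the handler modules and the subset of options shown are illustrative assumptions, not a complete implementation:
```javascript
// cli.js - minimal Commander.js wiring (sketch)
import { program } from 'commander';
import { storeCommand } from './commands/store.js';    // hypothetical handler module
import { searchCommand } from './commands/search.js';  // hypothetical handler module

program
  .name('memory')
  .description('Persistent memory system for AI agents')
  .option('--db <path>', 'Custom database location')
  .option('--agent-context', 'Display agent documentation');

program
  .command('store <content>')
  .option('--tags <tags>', 'Comma-separated tags')
  .option('--expires <date>', 'Expiration date')
  .option('--entered-by <agent>', 'Agent/user identifier')
  .action((content, opts) => storeCommand(content, opts, program.opts()));

program
  .command('search <query>')
  .option('--tags <tags>', 'Filter by tags (AND)')
  .option('--limit <n>', 'Max results', '10')
  .option('--json', 'JSON output')
  .action((query, opts) => searchCommand(query, opts, program.opts()));

program.parse();
```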
#### Query Syntax
```bash
# Basic
memory search "docker compose"        # Both terms (implicit AND)
memory search "docker AND compose"    # Explicit AND
memory search "docker OR podman"      # Either term
memory search "docker NOT swarm"      # Exclude term
memory search '"exact phrase"'        # Phrase search
memory search "docker*"               # Prefix matching

# With filters
memory search "docker" --tags devops,networking
memory search "error" --after "2025-10-01"
memory search "config" --entered-by investigate-agent

# Fuzzy (automatic typo tolerance)
memory search "dokcer"                # Finds "docker"
memory search "kuberntes"             # Finds "kubernetes"
```
### Data Schema
#### Complete Schema (All Phases)
```sql
-- Core tables
CREATE TABLE memories (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  content TEXT NOT NULL CHECK(length(content) <= 10000),
  created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now')),
  entered_by TEXT,
  expires_at INTEGER,
  CHECK(expires_at IS NULL OR expires_at > created_at)
);

CREATE INDEX idx_memories_created ON memories(created_at DESC);
CREATE INDEX idx_memories_expires ON memories(expires_at) WHERE expires_at IS NOT NULL;
CREATE INDEX idx_memories_entered_by ON memories(entered_by);

CREATE TABLE tags (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT NOT NULL UNIQUE COLLATE NOCASE,
  created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now'))
);

CREATE INDEX idx_tags_name ON tags(name);

CREATE TABLE memory_tags (
  memory_id INTEGER NOT NULL,
  tag_id INTEGER NOT NULL,
  PRIMARY KEY (memory_id, tag_id),
  FOREIGN KEY (memory_id) REFERENCES memories(id) ON DELETE CASCADE,
  FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);

CREATE INDEX idx_memory_tags_tag ON memory_tags(tag_id);

-- Phase 2: FTS5
CREATE VIRTUAL TABLE memories_fts USING fts5(
  content,
  content='memories',
  content_rowid='id',
  tokenize='porter unicode61 remove_diacritics 2'
);

CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
  INSERT INTO memories_fts(rowid, content) VALUES (new.id, new.content);
END;

CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
  DELETE FROM memories_fts WHERE rowid = old.id;
END;

CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
  DELETE FROM memories_fts WHERE rowid = old.id;
  INSERT INTO memories_fts(rowid, content) VALUES (new.id, new.content);
END;

-- Phase 3: Trigrams
CREATE TABLE trigrams (
  trigram TEXT NOT NULL,
  memory_id INTEGER NOT NULL,
  position INTEGER NOT NULL,
  FOREIGN KEY (memory_id) REFERENCES memories(id) ON DELETE CASCADE
);

CREATE INDEX idx_trigrams_trigram ON trigrams(trigram);
CREATE INDEX idx_trigrams_memory ON trigrams(memory_id);

-- Metadata
CREATE TABLE metadata (
  key TEXT PRIMARY KEY,
  value TEXT NOT NULL
);

INSERT INTO metadata (key, value) VALUES ('schema_version', '1');
INSERT INTO metadata (key, value) VALUES ('created_at', strftime('%s', 'now'));

-- Useful view
CREATE VIEW memories_with_tags AS
SELECT
  m.id,
  m.content,
  m.created_at,
  m.entered_by,
  m.expires_at,
  GROUP_CONCAT(t.name, ',') AS tags
FROM memories m
LEFT JOIN memory_tags mt ON m.id = mt.memory_id
LEFT JOIN tags t ON mt.tag_id = t.id
GROUP BY m.id;
```
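The `memories_with_tags` view is a natural backing for `memory list`. A minimal sketch of how `commands/list.js` might query it (sort and pagination handling here are assumptions):
```javascript
// commands/list.js - list recent, non-expired memories via the view (sketch)
export function listMemories(db, { limit = 20, offset = 0 } = {}) {
  return db.prepare(`
    SELECT id, content, created_at, entered_by, expires_at, tags
    FROM memories_with_tags
    WHERE expires_at IS NULL OR expires_at > strftime('%s', 'now')
    ORDER BY created_at DESC
    LIMIT ? OFFSET ?
  `).all(limit, offset);
}
```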
## Search Algorithm Details
### Phase 1: LIKE Search
```javascript
function searchWithLike(query, filters = {}) {
  const { tags = [], after, before, enteredBy, limit = 10 } = filters;

  let sql = `
    SELECT DISTINCT m.id, m.content, m.created_at, m.entered_by, m.expires_at,
           GROUP_CONCAT(t.name, ',') as tags
    FROM memories m
    LEFT JOIN memory_tags mt ON m.id = mt.memory_id
    LEFT JOIN tags t ON mt.tag_id = t.id
    WHERE LOWER(m.content) LIKE LOWER(?)
      AND (m.expires_at IS NULL OR m.expires_at > strftime('%s', 'now'))
  `;
  const params = [`%${query}%`];

  // Tag filtering
  if (tags.length > 0) {
    sql += ` AND m.id IN (
      SELECT memory_id FROM memory_tags
      WHERE tag_id IN (SELECT id FROM tags WHERE name IN (${tags.map(() => '?').join(',')}))
      GROUP BY memory_id
      HAVING COUNT(*) = ?
    )`;
    params.push(...tags, tags.length);
  }

  // Date filtering
  if (after) {
    sql += ' AND m.created_at >= ?';
    params.push(after);
  }
  if (before) {
    sql += ' AND m.created_at <= ?';
    params.push(before);
  }

  // Agent filtering
  if (enteredBy) {
    sql += ' AND m.entered_by = ?';
    params.push(enteredBy);
  }

  sql += ' GROUP BY m.id ORDER BY m.created_at DESC LIMIT ?';
  params.push(limit);

  return db.prepare(sql).all(...params);
}
```
### Phase 2: FTS5 Search
```javascript
function searchWithFTS5(query, filters = {}) {
  const { limit = 10 } = filters;
  const ftsQuery = buildFTS5Query(query);

  let sql = `
    SELECT m.id, m.content, m.created_at, m.entered_by, m.expires_at,
           GROUP_CONCAT(t.name, ',') as tags,
           mf.rank as relevance
    FROM memories_fts mf
    JOIN memories m ON m.id = mf.rowid
    LEFT JOIN memory_tags mt ON m.id = mt.memory_id
    LEFT JOIN tags t ON mt.tag_id = t.id
    WHERE memories_fts MATCH ?
      AND (m.expires_at IS NULL OR m.expires_at > strftime('%s', 'now'))
  `;
  const params = [ftsQuery];

  // Apply tag/date/agent filters (same as Phase 1)
  // ...

  sql += ' GROUP BY m.id ORDER BY mf.rank LIMIT ?';
  params.push(limit);

  return db.prepare(sql).all(...params);
}

function buildFTS5Query(query) {
  // Handle quoted phrases
  if (query.includes('"')) {
    return query; // Already FTS5 compatible
  }
  // Handle explicit operators
  if (/\b(AND|OR|NOT)\b/i.test(query)) {
    return query.toUpperCase();
  }
  // Implicit AND between terms
  const terms = query.split(/\s+/).filter(t => t.length > 0);
  return terms.join(' AND ');
}
```
### Phase 3: Fuzzy Search
```javascript
function searchWithFuzzy(query, threshold = 0.7, limit = 10) {
  const queryTrigrams = extractTrigrams(query);
  if (queryTrigrams.length === 0) return [];

  // Find candidates by trigram overlap
  const sql = `
    SELECT
      m.id,
      m.content,
      m.created_at,
      m.entered_by,
      m.expires_at,
      COUNT(DISTINCT tr.trigram) as trigram_matches
    FROM memories m
    JOIN trigrams tr ON tr.memory_id = m.id
    WHERE tr.trigram IN (${queryTrigrams.map(() => '?').join(',')})
      AND (m.expires_at IS NULL OR m.expires_at > strftime('%s', 'now'))
    GROUP BY m.id
    HAVING trigram_matches >= ?
    ORDER BY trigram_matches DESC
    LIMIT ?
  `;
  const minMatches = Math.ceil(queryTrigrams.length * threshold);
  const candidates = db.prepare(sql).all(...queryTrigrams, minMatches, limit * 2);

  // Calculate edit distance and combined score
  const scored = candidates.map(c => {
    const editDist = levenshtein(query.toLowerCase(), c.content.toLowerCase().substring(0, query.length * 3));
    const trigramSim = c.trigram_matches / queryTrigrams.length;
    const normalizedEditDist = 1 - (editDist / Math.max(query.length, c.content.length));
    return {
      ...c,
      relevance: 0.6 * trigramSim + 0.4 * normalizedEditDist
    };
  });

  return scored
    .filter(r => r.relevance >= threshold)
    .sort((a, b) => b.relevance - a.relevance)
    .slice(0, limit);
}

function extractTrigrams(text) {
  const normalized = text
    .toLowerCase()
    .replace(/[^\w\s]/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
  if (normalized.length < 3) return [];

  const padded = ` ${normalized} `;
  const trigrams = [];
  for (let i = 0; i < padded.length - 2; i++) {
    const trigram = padded.substring(i, i + 3);
    if (trigram.trim().length === 3) {
      trigrams.push(trigram);
    }
  }
  return [...new Set(trigrams)]; // Deduplicate
}

function levenshtein(a, b) {
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;

  let prevRow = Array(b.length + 1).fill(0).map((_, i) => i);
  for (let i = 0; i < a.length; i++) {
    let curRow = [i + 1];
    for (let j = 0; j < b.length; j++) {
      const cost = a[i] === b[j] ? 0 : 1;
      curRow.push(Math.min(
        curRow[j] + 1,      // deletion
        prevRow[j + 1] + 1, // insertion
        prevRow[j] + cost   // substitution
      ));
    }
    prevRow = curRow;
  }
  return prevRow[b.length];
}
```
### Intelligent Cascade
```javascript
function search(query, filters = {}) {
  const { fuzzy = 'auto', threshold = 0.7 } = filters;

  // Phase 2 or Phase 3 installed?
  const hasFTS5 = checkTableExists('memories_fts');
  const hasTrigrams = checkTableExists('trigrams');

  let results;

  // Try FTS5 if available
  if (hasFTS5) {
    results = searchWithFTS5(query, filters);
  } else {
    results = searchWithLike(query, filters);
  }

  // If too few results and fuzzy available, try fuzzy
  if (results.length < 5 && hasTrigrams && (fuzzy === 'auto' || fuzzy === true)) {
    const fuzzyResults = searchWithFuzzy(query, threshold, filters.limit);
    results = mergeResults(results, fuzzyResults);
  }

  return results;
}

function mergeResults(exact, fuzzy) {
  const seen = new Set(exact.map(r => r.id));
  const merged = [...exact];
  for (const result of fuzzy) {
    if (!seen.has(result.id)) {
      merged.push(result);
      seen.add(result.id);
    }
  }
  return merged;
}
```
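`checkTableExists` is used above but left undefined. One possible implementation against `sqlite_master`, assuming the same shared `db` handle used elsewhere in this spec:
```javascript
// Returns true if a table (or virtual table/view) with this name exists in the database.
function checkTableExists(name) {
  const row = db.prepare(
    "SELECT name FROM sqlite_master WHERE type IN ('table', 'view') AND name = ?"
  ).get(name);
  return row !== undefined;
}
```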
## Memory Format Guidelines
### Good Memory Examples
```bash
# Technical discovery with context
memory store "Docker Compose: Use 'depends_on' with 'condition: service_healthy' to ensure dependencies are ready. Prevents race conditions in multi-container apps." \
--tags docker,docker-compose,best-practices
# Configuration pattern
memory store "Nginx reverse proxy: Set 'proxy_set_header X-Real-IP \$remote_addr' to preserve client IP through proxy. Required for rate limiting and logging." \
--tags nginx,networking,security
# Error resolution
memory store "Node.js ENOSPC: Increase inotify watch limit with 'echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p'. Affects webpack, nodemon." \
--tags nodejs,linux,troubleshooting
# Version-specific behavior
memory store "TypeScript 5.0+: 'const' type parameters preserve literal types. Example: 'function id<const T>(x: T): T'. Better inference for generic functions." \
--tags typescript,types
# Temporary info with expiration
memory store "Staging server: https://staging.example.com:8443. Credentials in 1Password. Valid through Q1 2025." \
--tags staging,infrastructure \
--expires "2025-04-01"
```
### Anti-Patterns to Avoid
```bash
# Too vague
❌ memory store "Fixed Docker issue"
✅ memory store "Docker: Use 'docker system prune -a' to reclaim space. Removes unused images, containers, networks."
# Widely known
❌ memory store "Git is a version control system"
✅ memory store "Git worktree: 'git worktree add -b feature ../feature' creates parallel working dir without cloning."
# Sensitive data
❌ memory store "DB password: hunter2"
✅ memory store "Production DB credentials stored in 1Password vault 'Infrastructure'"
# Multiple unrelated facts
❌ memory store "Docker uses namespaces. K8s has pods. Nginx is fast."
✅ memory store "Docker container isolation uses Linux namespaces: PID, NET, MNT, UTS, IPC."
```
## Auto-Extraction: *Remember* Pattern
When agents output text containing `*Remember*: [fact]`, automatically extract and store:
```javascript
function extractRememberPatterns(text, context = {}) {
  const rememberRegex = /\*Remember\*:?\s+(.+?)(?=\n\n|\*Remember\*|$)/gis;
  const matches = [...text.matchAll(rememberRegex)];

  return matches.map(match => {
    const content = match[1].trim();
    const tags = autoExtractTags(content, context);
    const expires = autoExtractExpiration(content);
    return {
      content,
      tags,
      expires,
      entered_by: context.agentName || 'auto-extract'
    };
  });
}

function autoExtractTags(content, context) {
  const tags = new Set();

  // Technology patterns
  const techPatterns = {
    'docker': /docker|container|compose/i,
    'kubernetes': /k8s|kubernetes|kubectl/i,
    'git': /\bgit\b|github|gitlab/i,
    'nodejs': /node\.?js|npm|yarn/i,
    'postgresql': /postgres|postgresql/i,
    'nixos': /nix|nixos|flake/i
  };
  for (const [tag, pattern] of Object.entries(techPatterns)) {
    if (pattern.test(content)) tags.add(tag);
  }

  // Category patterns
  if (/error|bug|fix/i.test(content)) tags.add('troubleshooting');
  if (/performance|optimize/i.test(content)) tags.add('performance');
  if (/security|vulnerability/i.test(content)) tags.add('security');

  return Array.from(tags);
}

function autoExtractExpiration(content) {
  // addDays comes from date-fns; quarterEnd is a small local helper (last day of the given quarter)
  const patterns = [
    { re: /valid (through|until) (\w+ \d{4})/i, parse: m => new Date(m[2]) },
    { re: /expires? (on )?([\d-]+)/i, parse: m => new Date(m[2]) },
    { re: /temporary|temp/i, parse: () => addDays(new Date(), 90) },
    { re: /Q([1-4]) (\d{4})/i, parse: m => quarterEnd(m[1], m[2]) }
  ];
  for (const { re, parse } of patterns) {
    const match = content.match(re);
    if (match) {
      try {
        return parse(match).toISOString();
      } catch {}
    }
  }
  return null;
}
```
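To close the loop, the extracted entries still need to be persisted. A sketch of how a plugin hook might do that, assuming the `storeMemory` helper from `src/db/queries.js` referenced in the plugin API below:
```javascript
// Persist every *Remember* entry found in a block of agent output (sketch).
async function persistRememberedFacts(agentOutput, context) {
  const { storeMemory } = await import('./src/db/queries.js');
  const extracted = extractRememberPatterns(agentOutput, context);

  for (const memory of extracted) {
    storeMemory(memory.content, {
      tags: memory.tags,
      expires: memory.expires,
      entered_by: memory.entered_by
    });
  }
  return extracted.length; // number of memories stored
}
```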
## Migration Strategy
### Phase 1 → Phase 2 (LIKE → FTS5)
```javascript
async function migrateToFTS5(db) {
  console.log('Migrating to FTS5...');

  // Create FTS5 table
  db.exec(`
    CREATE VIRTUAL TABLE memories_fts USING fts5(
      content,
      content='memories',
      content_rowid='id',
      tokenize='porter unicode61 remove_diacritics 2'
    );
  `);

  // Populate from existing data
  db.exec(`
    INSERT INTO memories_fts(rowid, content)
    SELECT id, content FROM memories;
  `);

  // Create triggers
  db.exec(`
    CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
      INSERT INTO memories_fts(rowid, content) VALUES (new.id, new.content);
    END;
    CREATE TRIGGER memories_ad AFTER DELETE ON memories BEGIN
      DELETE FROM memories_fts WHERE rowid = old.id;
    END;
    CREATE TRIGGER memories_au AFTER UPDATE ON memories BEGIN
      DELETE FROM memories_fts WHERE rowid = old.id;
      INSERT INTO memories_fts(rowid, content) VALUES (new.id, new.content);
    END;
  `);

  // Update schema version
  db.prepare('UPDATE metadata SET value = ? WHERE key = ?').run('2', 'schema_version');
  console.log('FTS5 migration complete!');
}
```
### Phase 2 → Phase 3 (Add Trigrams)
```javascript
async function migrateToTrigrams(db) {
  console.log('Adding trigram support...');

  // Create trigrams table
  db.exec(`
    CREATE TABLE trigrams (
      trigram TEXT NOT NULL,
      memory_id INTEGER NOT NULL,
      position INTEGER NOT NULL,
      FOREIGN KEY (memory_id) REFERENCES memories(id) ON DELETE CASCADE
    );
    CREATE INDEX idx_trigrams_trigram ON trigrams(trigram);
    CREATE INDEX idx_trigrams_memory ON trigrams(memory_id);
  `);

  // Populate from existing memories
  const memories = db.prepare('SELECT id, content FROM memories').all();
  const insertTrigram = db.prepare('INSERT INTO trigrams (trigram, memory_id, position) VALUES (?, ?, ?)');
  const insertMany = db.transaction((memories) => {
    for (const memory of memories) {
      const trigrams = extractTrigrams(memory.content);
      trigrams.forEach((trigram, position) => {
        insertTrigram.run(trigram, memory.id, position);
      });
    }
  });
  insertMany(memories);

  // Update schema version
  db.prepare('UPDATE metadata SET value = ? WHERE key = ?').run('3', 'schema_version');
  console.log('Trigram migration complete!');
}
```
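`src/db/migrations.js` is named in the directory tree but not specified here. A minimal runner could dispatch on the `schema_version` row in `metadata`; a sketch (the migration function names match the ones above, the control flow is an assumption):
```javascript
// migrations.js - run pending migrations in order, based on metadata.schema_version (sketch)
async function runMigrations(db) {
  const row = db.prepare('SELECT value FROM metadata WHERE key = ?').get('schema_version');
  let version = row ? Number(row.value) : 1;

  if (version < 2) {
    await migrateToFTS5(db);      // Phase 1 -> 2
    version = 2;
  }
  if (version < 3) {
    await migrateToTrigrams(db);  // Phase 2 -> 3
    version = 3;
  }
  return version;
}
```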
## Performance Targets
### Latency
- Phase 1 (LIKE): <50ms for <500 memories
- Phase 2 (FTS5): <100ms for 10K memories
- Phase 3 (Fuzzy): <200ms for 10K memories with fuzzy
### Storage
- Base: ~500 bytes per memory (average)
- FTS5 index: +30% overhead (~150 bytes)
- Trigrams: +200% overhead (~1KB) - prune common trigrams
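One way to prune overly common trigrams, as suggested above, is to delete any trigram that occurs in more than some fraction of all memories, since it carries little discriminating power. A sketch (the 50% cutoff is an illustrative assumption):
```javascript
// Remove trigrams that appear in more than `maxShare` of all memories (sketch).
function pruneCommonTrigrams(db, maxShare = 0.5) {
  const { count } = db.prepare('SELECT COUNT(*) AS count FROM memories').get();
  const cutoff = Math.ceil(count * maxShare);
  const result = db.prepare(`
    DELETE FROM trigrams WHERE trigram IN (
      SELECT trigram FROM trigrams
      GROUP BY trigram
      HAVING COUNT(DISTINCT memory_id) > ?
    )
  `).run(cutoff);
  return result.changes; // rows deleted
}
```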
### Scalability
- Phase 1: Up to 500 memories
- Phase 2: Up to 50K memories
- Phase 3: Up to 100K+ memories
## Testing Strategy
### Unit Tests
- Search algorithms (LIKE, FTS5, fuzzy)
- Trigram extraction
- Levenshtein distance
- Tag filtering
- Date parsing
- Relevance scoring
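A minimal Vitest sketch covering two of the unit targets above (trigram extraction and Levenshtein distance); the expected values follow from the definitions earlier in this spec, and the exports from `src/search/fuzzy.js` are assumed:
```javascript
// test/fuzzy.test.js - unit test sketch (Vitest); assumes fuzzy.js exports these helpers
import { describe, it, expect } from 'vitest';
import { extractTrigrams, levenshtein } from '../src/search/fuzzy.js';

describe('extractTrigrams', () => {
  it('extracts deduplicated trigrams from a single word', () => {
    expect(extractTrigrams('docker')).toEqual(['doc', 'ock', 'cke', 'ker']);
  });

  it('returns an empty list for inputs shorter than three characters', () => {
    expect(extractTrigrams('ab')).toEqual([]);
  });
});

describe('levenshtein', () => {
  it('counts the classic kitten/sitting distance', () => {
    expect(levenshtein('kitten', 'sitting')).toBe(3);
  });

  it('scores a simple transposition typo as two edits', () => {
    expect(levenshtein('docker', 'dokcer')).toBe(2);
  });
});
```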
### Integration Tests
- Store/retrieve flow
- Search with various filters
- Expiration pruning
- Export/import
- Migration path: Phase 1 → 2 → 3
### Performance Tests
- Benchmark with 1K, 10K, 100K memories
- Query latency measurement
- Index size monitoring
- Memory usage profiling
## OpenCode Integration
### Plugin Structure
```javascript
// plugin.js - OpenCode plugin entry point
export default {
  name: 'llmemory',
  version: '1.0.0',
  description: 'Persistent memory system for AI agents',

  commands: {
    'memory': './src/cli.js'
  },

  api: {
    store: async (content, options) => {
      const { storeMemory } = await import('./src/db/queries.js');
      return storeMemory(content, options);
    },
    search: async (query, options) => {
      const { search } = await import('./src/search/index.js');
      return search(query, options);
    },
    extractRemember: async (text, context) => {
      const { extractRememberPatterns } = await import('./src/extractors/remember.js');
      return extractRememberPatterns(text, context);
    }
  },

  onInstall: async () => {
    const { initDatabase } = await import('./src/db/connection.js');
    await initDatabase();
    console.log('LLMemory installed! Try: memory --agent-context');
  }
};
```
### Usage from Other Plugins
```javascript
import llmemory from '@opencode/llmemory';

// Store a memory
await llmemory.api.store(
  'NixOS: flake.lock must be committed for reproducible builds',
  { tags: ['nixos', 'build-system'], entered_by: 'investigate-agent' }
);

// Search
const results = await llmemory.api.search('nixos builds', {
  tags: ['nixos'],
  limit: 5
});

// Auto-extract from agent output
const memories = await llmemory.api.extractRemember(agentOutput, {
  agentName: 'optimize-agent',
  currentTask: 'performance-tuning'
});
```
## Next Steps
1. **Project setup**: Create project directory and documentation
2. **Implement MVP (Phase 1)**: Basic CLI, LIKE search, core commands
3. **Test with real usage**: Validate concept, collect metrics
4. **Migrate to FTS5 (Phase 2)**: When dataset > 500 or latency issues
5. **Add fuzzy layer (Phase 3)**: For production-quality search
6. **OpenCode integration**: Plugin API and auto-extraction
7. **Documentation**: Complete agent guide, CLI reference, API docs
## Success Metrics
- **Usability**: Agents can store/retrieve memories intuitively
- **Quality**: Search returns relevant results, not noise
- **Performance**: Queries complete in <100ms for typical datasets
- **Adoption**: Agents use memory system regularly in workflows
- **Token Efficiency**: Results are high-quality, limited in quantity