SPIKE Investigation Workflow
What is a SPIKE?
A SPIKE ticket is a time-boxed research and investigation task. The goal is to explore a problem space, evaluate solution approaches, and create an actionable plan for implementation.
SPIKE = Investigation only. No code changes.
Key Principles
1. Exploration Over Implementation
- Focus on understanding the problem deeply
- Consider multiple solution approaches (3-5)
- Don't commit to first idea
- Think creatively about alternatives
2. Documentation Over Code
- Document findings thoroughly
- Provide specific code references (file:line)
- Explain trade-offs objectively
- Create actionable implementation plan
3. Developer Approval Required
- Always review findings with developer before creating tickets
- Developer has final say on implementation approach
- Get explicit approval before creating follow-up tickets
- Typically results in just 1 follow-up ticket
4. No Code Changes
- ✅ Read and explore codebase
- ✅ Document findings
- ✅ Create implementation plan
- ❌ Write implementation code
- ❌ Create git worktree
- ❌ Create PR
Investigation Process
Phase 1: Problem Understanding
Understand current state:
- Read ticket description thoroughly
- Explore relevant codebase areas
- Identify constraints and dependencies
- Document current implementation
Ask questions:
- What problem are we solving?
- Who is affected?
- What are the constraints?
- What's the desired outcome?
Phase 2: Approach Exploration
Explore 3-5 different approaches:
For each approach, document:
- Name: Brief descriptive name
- Description: How it works
- Pros: Benefits and advantages
- Cons: Drawbacks and challenges
- Effort: Relative complexity (S/M/L/XL)
- Code locations: Specific file:line references
Think broadly:
- Conventional approaches
- Creative/unconventional approaches
- Simple vs. complex solutions
- Short-term vs. long-term solutions
Phase 3: Trade-off Analysis
Evaluate objectively:
- Implementation complexity
- Performance implications
- Maintenance burden
- Testing requirements
- Migration/rollout complexity
- Team familiarity with approach
- Long-term sustainability
Be honest about cons:
- Every approach has trade-offs
- Document them clearly
- Don't hide problems
Phase 4: Recommendation
Make clear recommendation:
- Which approach is best
- Why it's superior to alternatives
- Key risks and mitigations
- Confidence level (Low/Medium/High)
Justify recommendation:
- Reference specific trade-offs
- Explain why pros outweigh cons
- Consider team context
Phase 5: Implementation Planning
Create actionable plan:
- Typically breaks down into 1 follow-up ticket
- Occasionally 2-3 if clearly independent tasks
- Never many vague tickets
For each ticket, include:
- Clear summary
- Detailed description
- Recommended approach
- Acceptance criteria
- Code references from investigation
- Effort estimate (S/M/L/XL)
Investigation Output Template
## Investigation Findings - PI-XXXXX
### Problem Analysis
[Current state description with file:line references]
[Problem statement]
[Constraints and requirements]
### Approaches Considered
#### 1. [Approach Name]
- **Description**: [How it works]
- **Pros**:
  - [Benefit 1]
  - [Benefit 2]
- **Cons**:
  - [Drawback 1]
  - [Drawback 2]
- **Effort**: [S/M/L/XL]
- **Code**: [file.ext:123, file.ext:456]
#### 2. [Approach Name]
[Repeat structure for each approach]
[Continue for 3-5 approaches]
### Recommendation
**Recommended Approach**: [Approach Name]
**Justification**: [Why this is best, referencing specific trade-offs]
**Risks**:
- [Risk 1]: [Mitigation]
- [Risk 2]: [Mitigation]
**Confidence**: [Low/Medium/High]
### Proposed Implementation
Typically **1 follow-up ticket**:
**Summary**: [Concise task description]
**Description**:
[Detailed implementation plan]
[Step-by-step approach]
[Key considerations]
**Acceptance Criteria**:
- [ ] [Criterion 1]
- [ ] [Criterion 2]
- [ ] [Criterion 3]
**Effort Estimate**: [S/M/L/XL]
**Code References**:
- [file.ext:123 - Description]
- [file.ext:456 - Description]
### References
- [Documentation link]
- [Related ticket]
- [External resource]
Example SPIKE Investigation
Problem
Performance degradation in user search with large datasets (10k+ users)
Approaches Considered
1. Database Query Optimization
- Description: Add indexes, optimize JOIN queries, use query caching
- Pros:
  - Minimal code changes
  - Works with existing architecture
  - Can be implemented incrementally
- Cons:
  - Limited scalability (still hits DB for each search)
  - Query complexity increases with features
  - Cache invalidation complexity
- Effort: M
- Code: user_service.go:245, user_repository.go:89
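To make approach 1 concrete, here is a minimal sketch of a composite index plus one query that replaces several round trips. The index definition, the `accounts` join, and the column names are illustrative assumptions rather than the real schema; the actual predicate (and therefore the right index type) would come from the existing search path in user_service.go.

```go
// Illustrative sketch only: approach 1 as a composite index plus a single
// optimized query. Table and column names, and the accounts join, are
// assumptions, not the project's actual schema.
package user

import (
	"context"
	"database/sql"
)

// Assumed migration (for prefix or fuzzy search a text_pattern_ops or
// trigram index may be needed instead of a plain btree):
//
//   CREATE INDEX idx_users_search ON users (last_name, first_name, email);

// SearchUsers issues one indexed query instead of separate lookups per row.
func SearchUsers(ctx context.Context, db *sql.DB, lastName string, limit int) (*sql.Rows, error) {
	const q = `
		SELECT u.id, u.last_name, u.first_name, u.email
		FROM users u
		JOIN accounts a ON a.user_id = u.id AND a.active
		WHERE u.last_name = $1
		ORDER BY u.last_name, u.first_name
		LIMIT $2`
	return db.QueryContext(ctx, q, lastName, limit)
}
```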
2. Elasticsearch Integration
- Description: Index users in Elasticsearch, use for all search operations
- Pros:
  - Excellent search performance at scale
  - Full-text search capabilities
  - Faceted search support
- Cons:
  - New infrastructure to maintain
  - Data sync complexity
  - Team learning curve
  - Higher operational cost
- Effort: XL
- Code: Would be new service, interfaces at user_service.go:200
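For a sense of what approach 2 would involve, the sketch below indexes and searches user documents over Elasticsearch's REST API using plain net/http. The index name, field names, and cluster address are assumptions; a real implementation would likely use an official client and add authentication, retries, and bulk indexing.

```go
// Illustrative sketch only: mirror users into an Elasticsearch index and
// query it over the REST API. Index name, field names, and the cluster
// address are assumptions for this example.
package search

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

const esBase = "http://localhost:9200" // assumed cluster address

// IndexUser upserts one user document (pre-serialized JSON) into "users".
func IndexUser(ctx context.Context, id string, doc []byte) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodPut,
		fmt.Sprintf("%s/users/_doc/%s", esBase, id), bytes.NewReader(doc))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("index user %s: %s", id, resp.Status)
	}
	return nil
}

// SearchUsers runs a multi_match query across the indexed name/email fields.
func SearchUsers(ctx context.Context, term string) (*http.Response, error) {
	body, err := json.Marshal(map[string]any{
		"query": map[string]any{
			"multi_match": map[string]any{
				"query":  term,
				"fields": []string{"first_name", "last_name", "email"},
			},
		},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		esBase+"/users/_search", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return http.DefaultClient.Do(req)
}
```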
3. In-Memory Cache with Background Sync
- Description: Maintain searchable user cache in memory, sync periodically
- Pros:
  - Very fast search performance
  - No additional infrastructure
  - Simple implementation
- Cons:
  - Memory usage on app servers
  - Eventual consistency issues
  - Cache warming on deploy
  - Doesn't scale past single-server memory
- Effort: L
- Code: New cache_service.go, integrate at user_service.go:245
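A minimal sketch of approach 3, assuming a hypothetical `User` type and a `loadAll` callback standing in for the existing repository; the refresh interval is the knob that controls how stale results can get, which is the eventual-consistency trade-off listed above.

```go
// Illustrative sketch only: a searchable in-memory snapshot of users that
// is refreshed on a timer. User and loadAll are assumptions standing in
// for the real repository types.
package search

import (
	"context"
	"strings"
	"sync"
	"time"
)

type User struct {
	ID, FirstName, LastName, Email string
}

type UserCache struct {
	mu    sync.RWMutex
	users []User
}

// Run loads the snapshot immediately, then refreshes it every interval
// until ctx is cancelled.
func (c *UserCache) Run(ctx context.Context, interval time.Duration,
	loadAll func(context.Context) ([]User, error)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if users, err := loadAll(ctx); err == nil {
			c.mu.Lock()
			c.users = users
			c.mu.Unlock()
		}
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}

// Search scans the in-memory snapshot; results may lag the database by up
// to one refresh interval.
func (c *UserCache) Search(term string) []User {
	term = strings.ToLower(term)
	c.mu.RLock()
	defer c.mu.RUnlock()
	var out []User
	for _, u := range c.users {
		if strings.Contains(strings.ToLower(u.LastName+" "+u.FirstName+" "+u.Email), term) {
			out = append(out, u)
		}
	}
	return out
}
```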
4. Materialized View with Triggers
- Description: Database materialized view optimized for search, auto-updated via triggers
- Pros:
  - Good performance
  - Consistent data
  - Minimal app code changes
- Cons:
  - Database-specific (PostgreSQL only)
  - Trigger complexity
  - Harder to debug issues
  - Lock contention on high write volume
- Effort: M
- Code: Migration needed, user_repository.go:89
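A rough sketch of the migration approach 4 implies, with assumed object names. Refreshing the whole view from a statement-level trigger is the simplest form and is exactly where the lock-contention concern above comes from; a production version might instead debounce refreshes via NOTIFY and a background worker.

```go
// Illustrative sketch only: a PostgreSQL materialized view kept fresh by a
// statement-level trigger. All object names are assumptions.
package migrations

import (
	"context"
	"database/sql"
)

const userSearchView = `
CREATE MATERIALIZED VIEW user_search_mv AS
SELECT u.id, u.last_name, u.first_name, u.email
FROM users u;

CREATE INDEX idx_user_search_mv_name
    ON user_search_mv (last_name, first_name);

CREATE FUNCTION refresh_user_search_mv() RETURNS trigger AS $$
BEGIN
    REFRESH MATERIALIZED VIEW user_search_mv;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_refresh_search_mv
    AFTER INSERT OR UPDATE OR DELETE ON users
    FOR EACH STATEMENT EXECUTE FUNCTION refresh_user_search_mv();
`

// Apply runs the script in one call; this assumes a driver that accepts
// multi-statement SQL (e.g. lib/pq). In practice the project's migration
// tooling would own this script.
func Apply(ctx context.Context, db *sql.DB) error {
	_, err := db.ExecContext(ctx, userSearchView)
	return err
}
```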
Recommendation
Recommended Approach: Database Query Optimization (#1)
Justification: Given our current scale (8k users, growing ~20%/year) and team context:
- Elasticsearch is over-engineering for current needs; at ~20%/year we would not pass ~20k users for roughly five years
- In-memory cache has consistency issues that would affect UX
- Materialized views add database complexity our team hasn't worked with
- Query optimization addresses immediate pain point with minimal risk
- Can revisit Elasticsearch if we hit 20k+ users or need full-text features
Risks:
- May need to revisit in 2-3 years if growth accelerates: Monitor performance metrics, set alert at 15k users
- Won't support advanced search features: Document limitation, plan for future if needed
Confidence: High
Proposed Implementation
1 follow-up ticket:
Summary: Optimize user search queries with indexes and caching
Description:
- Add composite index on (last_name, first_name, email)
- Implement Redis query cache with 5-min TTL (see the sketch below)
- Optimize JOIN query in getUsersForSearch
- Add performance monitoring
Acceptance Criteria:
- Search response time < 200ms for 95th percentile
- Database query count reduced from 3 to 1 per search
- Monitoring dashboard shows performance metrics
- Load testing validates 10k concurrent users
Effort Estimate: M (1-2 days)
Code References:
- user_service.go:245 - Main search function to optimize
- user_repository.go:89 - Database query to modify
- schema.sql:34 - Add index here
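As a reference for the ticket above, here is a minimal sketch of the proposed 5-minute query cache using the go-redis client. The key scheme and the `searchDB` fallback are assumptions, and cached entries expire via TTL rather than explicit invalidation.

```go
// Illustrative sketch only: check Redis first, fall back to the database,
// and store the serialized result with a 5-minute TTL. The key scheme and
// searchDB callback are assumptions, not the project's actual wiring.
package user

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

const searchTTL = 5 * time.Minute

// CachedSearch returns the cached payload for a query if present,
// otherwise runs the database search and caches its result.
func CachedSearch(ctx context.Context, rdb *redis.Client, query string,
	searchDB func(context.Context, string) ([]byte, error)) ([]byte, error) {

	key := "user_search:" + query

	// Cache hit: return the stored payload without touching the database.
	if cached, err := rdb.Get(ctx, key).Result(); err == nil {
		return []byte(cached), nil
	} else if err != redis.Nil {
		return nil, err // a real Redis error, not just a miss
	}

	// Cache miss: query the database, then populate the cache.
	result, err := searchDB(ctx, query)
	if err != nil {
		return nil, err
	}
	if err := rdb.Set(ctx, key, result, searchTTL).Err(); err != nil {
		return nil, err
	}
	return result, nil
}
```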
References
- PostgreSQL index documentation: https://...
- Existing Redis cache pattern: cache_service.go:12
- Related performance ticket: PI-65432
Common Pitfalls
❌ Shallow Investigation
Bad:
- Only considers 1 obvious solution
- Vague references like "the user module"
- No trade-off analysis
Good:
- Explores 3-5 distinct approaches
- Specific file:line references
- Honest pros/cons for each
❌ Analysis Paralysis
Bad:
- Explores 15 different approaches
- Gets lost in theoretical possibilities
- Never makes clear recommendation
Good:
- Focus on 3-5 viable approaches
- Make decision based on team context
- Acknowledge uncertainty but recommend path
❌ Premature Implementation
Bad:
- Starts writing code during SPIKE
- Creates git worktree
- Implements "prototype"
Good:
- Investigation only
- Code reading and references
- Plan for implementation ticket
❌ Automatic Ticket Creation
Bad:
- Creates 5 tickets without developer review
- Breaks work into too many pieces
- Doesn't get approval first
Good:
- Proposes implementation plan
- Waits for developer approval
- Typically creates just 1 ticket
Time-Boxing
SPIKEs should be time-boxed to prevent over-analysis:
- Small SPIKE: 2-4 hours
- Medium SPIKE: 1 day
- Large SPIKE: 2-3 days
If you hit the time limit:
- Document what you've learned so far
- Document what's still unknown
- Recommend one of:
  - Proceeding with current knowledge
  - Extending the SPIKE with specific questions
  - Creating a prototype SPIKE to validate the approach
Success Criteria
A successful SPIKE:
- ✅ Thoroughly explores problem space
- ✅ Considers multiple approaches (3-5)
- ✅ Provides specific code references
- ✅ Makes clear recommendation with justification
- ✅ Creates actionable plan (typically 1 ticket)
- ✅ Gets developer approval before creating tickets
- ✅ Enables confident implementation
A successful SPIKE does NOT:
- ❌ Implement the solution
- ❌ Create code changes
- ❌ Create tickets without approval
- ❌ Leave implementation plan vague
- ❌ Only explore 1 obvious solution