# SPIKE Investigation Workflow

## What is a SPIKE?

A SPIKE ticket is a time-boxed research and investigation task. The goal is to explore a problem space, evaluate solution approaches, and create an actionable plan for implementation.

**SPIKE = Investigation only. No code changes.**

## Key Principles

### 1. Exploration Over Implementation

- Focus on understanding the problem deeply
- Consider multiple solution approaches (3-5)
- Don't commit to the first idea
- Think creatively about alternatives

### 2. Documentation Over Code

- Document findings thoroughly
- Provide specific code references (file:line)
- Explain trade-offs objectively
- Create an actionable implementation plan

### 3. Developer Approval Required

- Always review findings with the developer before creating tickets
- The developer has final say on the implementation approach
- Get explicit approval before creating follow-up tickets
- Typically results in just 1 follow-up ticket

### 4. No Code Changes

- ✅ Read and explore the codebase
- ✅ Document findings
- ✅ Create an implementation plan
- ❌ Write implementation code
- ❌ Create a git worktree
- ❌ Create a PR
 
## Investigation Process

### Phase 1: Problem Understanding

**Understand current state:**

- Read ticket description thoroughly
- Explore relevant codebase areas
- Identify constraints and dependencies
- Document current implementation

**Ask questions:**

- What problem are we solving?
- Who is affected?
- What are the constraints?
- What's the desired outcome?
 
### Phase 2: Approach Exploration

**Explore 3-5 different approaches.**

For each approach, document:

- **Name**: Brief descriptive name
- **Description**: How it works
- **Pros**: Benefits and advantages
- **Cons**: Drawbacks and challenges
- **Effort**: Relative complexity (S/M/L/XL)
- **Code locations**: Specific file:line references

**Think broadly:**

- Conventional approaches
- Creative/unconventional approaches
- Simple vs. complex solutions
- Short-term vs. long-term solutions
 
### Phase 3: Trade-off Analysis

**Evaluate objectively:**

- Implementation complexity
- Performance implications
- Maintenance burden
- Testing requirements
- Migration/rollout complexity
- Team familiarity with approach
- Long-term sustainability

**Be honest about cons:**

- Every approach has trade-offs
- Document them clearly
- Don't hide problems
 
### Phase 4: Recommendation

**Make a clear recommendation:**

- Which approach is best
- Why it's superior to alternatives
- Key risks and mitigations
- Confidence level (Low/Medium/High)

**Justify the recommendation:**

- Reference specific trade-offs
- Explain why pros outweigh cons
- Consider team context
 
### Phase 5: Implementation Planning

**Create an actionable plan:**

- Typically breaks down into 1 follow-up ticket
- Occasionally 2-3 if clearly independent tasks
- Never many vague tickets

**For each ticket, include:**

- Clear summary
- Detailed description
- Recommended approach
- Acceptance criteria
- Code references from investigation
- Effort estimate (S/M/L/XL)
 
## Investigation Output Template
## Investigation Findings - PI-XXXXX
### Problem Analysis
[Current state description with file:line references]
[Problem statement]
[Constraints and requirements]
### Approaches Considered
#### 1. [Approach Name]
- **Description**: [How it works]
- **Pros**:
  - [Benefit 1]
  - [Benefit 2]
- **Cons**:
  - [Drawback 1]
  - [Drawback 2]
- **Effort**: [S/M/L/XL]
- **Code**: [file.ext:123, file.ext:456]
#### 2. [Approach Name]
[Repeat structure for each approach]
[Continue for 3-5 approaches]
### Recommendation
**Recommended Approach**: [Approach Name]
**Justification**: [Why this is best, referencing specific trade-offs]
**Risks**:
- [Risk 1]: [Mitigation]
- [Risk 2]: [Mitigation]
**Confidence**: [Low/Medium/High]
### Proposed Implementation
Typically **1 follow-up ticket**:
**Summary**: [Concise task description]
**Description**:
[Detailed implementation plan]
[Step-by-step approach]
[Key considerations]
**Acceptance Criteria**:
- [ ] [Criterion 1]
- [ ] [Criterion 2]
- [ ] [Criterion 3]
**Effort Estimate**: [S/M/L/XL]
**Code References**:
- [file.ext:123 - Description]
- [file.ext:456 - Description]
### References
- [Documentation link]
- [Related ticket]
- [External resource]
## Example SPIKE Investigation

### Problem

Performance degradation in user search with large datasets (10k+ users).

### Approaches Considered
#### 1. Database Query Optimization

- **Description**: Add indexes, optimize JOIN queries, use query caching (see the index sketch below)
- **Pros**:
  - Minimal code changes
  - Works with existing architecture
  - Can be implemented incrementally
- **Cons**:
  - Limited scalability (still hits the DB for each search)
  - Query complexity increases with features
  - Cache invalidation complexity
- **Effort**: M
- **Code**: user_service.go:245, user_repository.go:89
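
For illustration only (the SPIKE itself would not ship this), here is a minimal sketch of the kind of migration approach #1 points at. It assumes PostgreSQL, the user columns named in this example, and the standard database/sql package with the lib/pq driver; the package, function, and index names are hypothetical.

```go
package migrations

import (
	"context"
	"database/sql"

	_ "github.com/lib/pq" // assumed PostgreSQL driver; any database/sql driver works
)

// AddUserSearchIndex creates the composite index on (last_name, first_name, email).
// CREATE INDEX CONCURRENTLY avoids blocking writes, but it cannot run inside a
// transaction block, so it executes directly on the *sql.DB handle.
func AddUserSearchIndex(ctx context.Context, db *sql.DB) error {
	const ddl = `
		CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_search
		ON users (last_name, first_name, email)`
	_, err := db.ExecContext(ctx, ddl)
	return err
}
```

CONCURRENTLY trades a slower index build for uninterrupted writes, which fits the "can be implemented incrementally" point above.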
 
#### 2. Elasticsearch Integration

- **Description**: Index users in Elasticsearch, use it for all search operations (see the query sketch below)
- **Pros**:
  - Excellent search performance at scale
  - Full-text search capabilities
  - Faceted search support
- **Cons**:
  - New infrastructure to maintain
  - Data sync complexity
  - Team learning curve
  - Higher operational cost
- **Effort**: XL
- **Code**: Would be a new service; interfaces at user_service.go:200
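
For illustration, a rough sketch of what the search path might look like under this approach, assuming the official github.com/elastic/go-elasticsearch/v8 client and a hypothetical users index kept in sync by a separate pipeline; none of this exists in the codebase today.

```go
package search

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"

	"github.com/elastic/go-elasticsearch/v8"
)

// SearchUsers runs a multi_match query across the indexed user fields and
// returns the raw JSON response; real code would decode hits into a struct.
func SearchUsers(ctx context.Context, es *elasticsearch.Client, term string) ([]byte, error) {
	query := map[string]any{
		"query": map[string]any{
			"multi_match": map[string]any{
				"query":  term,
				"fields": []string{"last_name", "first_name", "email"},
			},
		},
	}
	body, err := json.Marshal(query)
	if err != nil {
		return nil, err
	}

	res, err := es.Search(
		es.Search.WithContext(ctx),
		es.Search.WithIndex("users"),
		es.Search.WithBody(bytes.NewReader(body)),
	)
	if err != nil {
		return nil, err
	}
	defer res.Body.Close()

	if res.IsError() {
		return nil, fmt.Errorf("user search failed: %s", res.String())
	}

	var buf bytes.Buffer
	if _, err := buf.ReadFrom(res.Body); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}
```

Most of the effort in this approach sits outside this function (standing up the cluster and keeping the index in sync with the database), which is why it is sized XL.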
 
#### 3. In-Memory Cache with Background Sync

- **Description**: Maintain a searchable user cache in memory, sync periodically (see the sketch below)
- **Pros**:
  - Very fast search performance
  - No additional infrastructure
  - Simple implementation
- **Cons**:
  - Memory usage on app servers
  - Eventual consistency issues
  - Cache warming on deploy
  - Doesn't scale past single-server memory
- **Effort**: L
- **Code**: New cache_service.go, integrate at user_service.go:245
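
For illustration, a minimal sketch of approach #3 built only on the standard library. The cache type, the loader signature, and the placement in the proposed cache_service.go are assumptions, not existing code.

```go
package cache

import (
	"context"
	"strings"
	"sync"
	"time"
)

type User struct {
	ID        int64
	FirstName string
	LastName  string
	Email     string
}

// UserCache keeps a searchable snapshot of users in memory and refreshes it
// in the background.
type UserCache struct {
	mu    sync.RWMutex
	users []User
	load  func(context.Context) ([]User, error) // e.g. the existing repository query
}

func NewUserCache(load func(context.Context) ([]User, error)) *UserCache {
	return &UserCache{load: load}
}

// StartSync loads the snapshot immediately, then refreshes it every interval
// until ctx is cancelled.
func (c *UserCache) StartSync(ctx context.Context, interval time.Duration) {
	ticker := time.NewTicker(interval)
	go func() {
		defer ticker.Stop()
		for {
			if users, err := c.load(ctx); err == nil {
				c.mu.Lock()
				c.users = users
				c.mu.Unlock()
			}
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
			}
		}
	}()
}

// Search does a simple case-insensitive substring match over the snapshot.
func (c *UserCache) Search(term string) []User {
	term = strings.ToLower(term)
	c.mu.RLock()
	defer c.mu.RUnlock()
	var matches []User
	for _, u := range c.users {
		if strings.Contains(strings.ToLower(u.LastName+" "+u.FirstName+" "+u.Email), term) {
			matches = append(matches, u)
		}
	}
	return matches
}
```

Reads only take a read lock on an in-memory snapshot, so search stays fast, but results can be stale by up to one sync interval, which is the eventual-consistency trade-off listed above.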
 
#### 4. Materialized View with Triggers

- **Description**: Database materialized view optimized for search, auto-updated via triggers (see the migration sketch below)
- **Pros**:
  - Good performance
  - Consistent data
  - Minimal app code changes
- **Cons**:
  - Database-specific (PostgreSQL only)
  - Trigger complexity
  - Harder to debug issues
  - Lock contention on high write volume
- **Effort**: M
- **Code**: Migration needed, user_repository.go:89
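
For illustration, one possible shape of approach #4, assuming PostgreSQL 11+ and the columns used in this example; the view, function, and trigger names are hypothetical.

```go
package migrations

import (
	"context"
	"database/sql"
)

// CreateUserSearchView builds a search-optimized materialized view plus a
// statement-level trigger that refreshes it after writes to users.
func CreateUserSearchView(ctx context.Context, db *sql.DB) error {
	stmts := []string{
		`CREATE MATERIALIZED VIEW IF NOT EXISTS user_search_mv AS
		 SELECT id, last_name, first_name, email FROM users`,

		`CREATE OR REPLACE FUNCTION refresh_user_search_mv() RETURNS trigger AS $$
		 BEGIN
		   -- A plain REFRESH locks the view while it rebuilds, which is where the
		   -- "lock contention on high write volume" con comes from.
		   REFRESH MATERIALIZED VIEW user_search_mv;
		   RETURN NULL;
		 END;
		 $$ LANGUAGE plpgsql`,

		`DROP TRIGGER IF EXISTS user_search_mv_refresh ON users`,

		`CREATE TRIGGER user_search_mv_refresh
		 AFTER INSERT OR UPDATE OR DELETE ON users
		 FOR EACH STATEMENT EXECUTE FUNCTION refresh_user_search_mv()`,
	}
	for _, stmt := range stmts {
		if _, err := db.ExecContext(ctx, stmt); err != nil {
			return err
		}
	}
	return nil
}
```

Refreshing the whole view on every write statement is the simplest variant; a real implementation would likely debounce or schedule the refresh, which is part of why trigger complexity and debugging show up in the cons.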
 
### Recommendation

**Recommended Approach**: Database Query Optimization (#1)

**Justification**: Given our current scale (8k users, growing ~20%/year) and team context:

- Elasticsearch is over-engineering for current needs; at ~20%/year growth we won't reach 20k users for roughly 5 years
- In-memory cache has consistency issues that would affect UX
- Materialized views add database complexity our team hasn't worked with
- Query optimization addresses the immediate pain point with minimal risk
- Can revisit Elasticsearch if we hit 20k+ users or need full-text features

**Risks**:

- May need to revisit in 2-3 years if growth accelerates: Monitor performance metrics, set an alert at 15k users
- Won't support advanced search features: Document the limitation, plan for the future if needed

**Confidence**: High
### Proposed Implementation

**1 follow-up ticket:**

**Summary**: Optimize user search queries with indexes and caching

**Description**:

- Add composite index on (last_name, first_name, email)
- Implement Redis query cache with 5-min TTL (see the cache sketch after the code references)
- Optimize JOIN query in getUsersForSearch
- Add performance monitoring

**Acceptance Criteria**:

- [ ] Search response time < 200ms for 95th percentile
- [ ] Database query count reduced from 3 to 1 per search
- [ ] Monitoring dashboard shows performance metrics
- [ ] Load testing validates 10k concurrent users

**Effort Estimate**: M (1-2 days)

**Code References**:

- user_service.go:245 - Main search function to optimize
- user_repository.go:89 - Database query to modify
- schema.sql:34 - Add index here
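
For illustration, a sketch of the Redis query cache described in this ticket, assuming the github.com/redis/go-redis/v9 client; the CachedSearch wrapper and the key scheme are hypothetical, not existing code.

```go
package search

import (
	"context"
	"encoding/json"
	"time"

	"github.com/redis/go-redis/v9"
)

const searchCacheTTL = 5 * time.Minute

// CachedSearch checks Redis before falling back to the optimized database
// query, then stores the result with a 5-minute TTL.
func CachedSearch[T any](
	ctx context.Context,
	rdb *redis.Client,
	term string,
	query func(context.Context, string) (T, error),
) (T, error) {
	var result T
	key := "user-search:" + term

	// Any cache error (including redis.Nil on a miss) falls through to the DB query.
	if cached, err := rdb.Get(ctx, key).Result(); err == nil {
		if json.Unmarshal([]byte(cached), &result) == nil {
			return result, nil
		}
	}

	result, err := query(ctx, term)
	if err != nil {
		return result, err
	}
	if encoded, err := json.Marshal(result); err == nil {
		_ = rdb.Set(ctx, key, encoded, searchCacheTTL).Err()
	}
	return result, nil
}
```

Any Redis failure is treated as a cache miss, so search degrades to the optimized query instead of erroring, and the 5-minute TTL bounds staleness without explicit invalidation logic.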
 
### References

- PostgreSQL index documentation: https://...
- Existing Redis cache pattern: cache_service.go:12
- Related performance ticket: PI-65432
 
## Common Pitfalls

### ❌ Shallow Investigation

**Bad:**

- Only considers 1 obvious solution
- Vague references like "the user module"
- No trade-off analysis

**Good:**

- Explores 3-5 distinct approaches
- Specific file:line references
- Honest pros/cons for each

### ❌ Analysis Paralysis

**Bad:**

- Explores 15 different approaches
- Gets lost in theoretical possibilities
- Never makes a clear recommendation

**Good:**

- Focus on 3-5 viable approaches
- Make a decision based on team context
- Acknowledge uncertainty but recommend a path

### ❌ Premature Implementation

**Bad:**

- Starts writing code during the SPIKE
- Creates a git worktree
- Implements a "prototype"

**Good:**

- Investigation only
- Code reading and references
- Plan for the implementation ticket

### ❌ Automatic Ticket Creation

**Bad:**

- Creates 5 tickets without developer review
- Breaks work into too many pieces
- Doesn't get approval first

**Good:**

- Proposes an implementation plan
- Waits for developer approval
- Typically creates just 1 ticket
 
## Time-Boxing

SPIKEs should be time-boxed to prevent over-analysis:

- **Small SPIKE**: 2-4 hours
- **Medium SPIKE**: 1 day
- **Large SPIKE**: 2-3 days

If you hit the time limit:

- Document what you've learned so far
- Document what's still unknown
- Recommend either:
  - Proceeding with current knowledge
  - Extending the SPIKE with specific questions
  - Creating a prototype SPIKE to validate the approach
 
## Success Criteria

A successful SPIKE:

- ✅ Thoroughly explores the problem space
- ✅ Considers multiple approaches (3-5)
- ✅ Provides specific code references
- ✅ Makes a clear recommendation with justification
- ✅ Creates an actionable plan (typically 1 ticket)
- ✅ Gets developer approval before creating tickets
- ✅ Enables confident implementation

A successful SPIKE does NOT:

- ❌ Implement the solution
- ❌ Create code changes
- ❌ Create tickets without approval
- ❌ Leave the implementation plan vague
- ❌ Only explore 1 obvious solution