# SPIKE Investigation Workflow

## What is a SPIKE?

A SPIKE ticket is a time-boxed research and investigation task. The goal is to explore a problem space, evaluate solution approaches, and create an actionable plan for implementation.

**SPIKE = Investigation only. No code changes.**

## Key Principles

### 1. Exploration Over Implementation

- Focus on understanding the problem deeply
- Consider multiple solution approaches (3-5)
- Don't commit to the first idea
- Think creatively about alternatives

### 2. Documentation Over Code

- Document findings thoroughly
- Provide specific code references (file:line)
- Explain trade-offs objectively
- Create an actionable implementation plan

### 3. Developer Approval Required

- Always review findings with the developer before creating tickets
- The developer has final say on the implementation approach
- Get explicit approval before creating follow-up tickets
- Typically results in just 1 follow-up ticket

### 4. No Code Changes

- ✅ Read and explore the codebase
- ✅ Document findings
- ✅ Create an implementation plan
- ❌ Write implementation code
- ❌ Create a git worktree
- ❌ Create a PR

## Investigation Process

### Phase 1: Problem Understanding

**Understand the current state:**

- Read the ticket description thoroughly
- Explore relevant codebase areas
- Identify constraints and dependencies
- Document the current implementation

**Ask questions:**

- What problem are we solving?
- Who is affected?
- What are the constraints?
- What's the desired outcome?

### Phase 2: Approach Exploration

**Explore 3-5 different approaches.**

For each approach, document:

- **Name**: Brief descriptive name
- **Description**: How it works
- **Pros**: Benefits and advantages
- **Cons**: Drawbacks and challenges
- **Effort**: Relative complexity (S/M/L/XL)
- **Code locations**: Specific file:line references

**Think broadly:**

- Conventional approaches
- Creative/unconventional approaches
- Simple vs. complex solutions
- Short-term vs. long-term solutions

### Phase 3: Trade-off Analysis

**Evaluate objectively:**

- Implementation complexity
- Performance implications
- Maintenance burden
- Testing requirements
- Migration/rollout complexity
- Team familiarity with the approach
- Long-term sustainability

**Be honest about cons:**

- Every approach has trade-offs
- Document them clearly
- Don't hide problems

### Phase 4: Recommendation

**Make a clear recommendation:**

- Which approach is best
- Why it's superior to the alternatives
- Key risks and mitigations
- Confidence level (Low/Medium/High)

**Justify the recommendation:**

- Reference specific trade-offs
- Explain why the pros outweigh the cons
- Consider team context

### Phase 5: Implementation Planning

**Create an actionable plan:**

- Typically breaks down into **1 follow-up ticket**
- Occasionally 2-3 if the tasks are clearly independent
- Never many vague tickets

**For each ticket, include:**

- Clear summary
- Detailed description
- Recommended approach
- Acceptance criteria
- Code references from the investigation
- Effort estimate (S/M/L/XL)

## Investigation Output Template

```markdown
## Investigation Findings - PI-XXXXX

### Problem Analysis

[Current state description with file:line references]
[Problem statement]
[Constraints and requirements]

### Approaches Considered

#### 1. [Approach Name]

- **Description**: [How it works]
- **Pros**:
  - [Benefit 1]
  - [Benefit 2]
- **Cons**:
  - [Drawback 1]
  - [Drawback 2]
- **Effort**: [S/M/L/XL]
- **Code**: [file.ext:123, file.ext:456]

#### 2. [Approach Name]

[Repeat structure for each approach]
[Continue for 3-5 approaches]

### Recommendation

**Recommended Approach**: [Approach Name]

**Justification**: [Why this is best, referencing specific trade-offs]

**Risks**:

- [Risk 1]: [Mitigation]
- [Risk 2]: [Mitigation]

**Confidence**: [Low/Medium/High]

### Proposed Implementation

Typically **1 follow-up ticket**:

**Summary**: [Concise task description]

**Description**:
[Detailed implementation plan]
[Step-by-step approach]
[Key considerations]

**Acceptance Criteria**:

- [ ] [Criterion 1]
- [ ] [Criterion 2]
- [ ] [Criterion 3]

**Effort Estimate**: [S/M/L/XL]

**Code References**:

- [file.ext:123 - Description]
- [file.ext:456 - Description]

### References

- [Documentation link]
- [Related ticket]
- [External resource]
```

## Example SPIKE Investigation

### Problem

Performance degradation in user search with large datasets (10k+ users).

### Approaches Considered

#### 1. Database Query Optimization

- **Description**: Add indexes, optimize JOIN queries, use query caching
- **Pros**:
  - Minimal code changes
  - Works with existing architecture
  - Can be implemented incrementally
- **Cons**:
  - Limited scalability (still hits DB for each search)
  - Query complexity increases with features
  - Cache invalidation complexity
- **Effort**: M
- **Code**: user_service.go:245, user_repository.go:89

#### 2. Elasticsearch Integration

- **Description**: Index users in Elasticsearch, use it for all search operations
- **Pros**:
  - Excellent search performance at scale
  - Full-text search capabilities
  - Faceted search support
- **Cons**:
  - New infrastructure to maintain
  - Data sync complexity
  - Team learning curve
  - Higher operational cost
- **Effort**: XL
- **Code**: Would be a new service, interfaces at user_service.go:200

#### 3. In-Memory Cache with Background Sync

- **Description**: Maintain a searchable user cache in memory, sync periodically
- **Pros**:
  - Very fast search performance
  - No additional infrastructure
  - Simple implementation
- **Cons**:
  - Memory usage on app servers
  - Eventual consistency issues
  - Cache warming on deploy
  - Doesn't scale past single-server memory
- **Effort**: L
- **Code**: New cache_service.go, integrate at user_service.go:245 (see the sketch below)
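A short sketch in the findings can make an approach concrete without crossing into implementation: the code lives in the investigation notes, not the repository. Below is a minimal, hypothetical Go sketch of approach #3; `User`, `UserLoader`, `UserCache`, and the sync interval are illustrative assumptions, not existing code:

```go
// Hypothetical sketch of approach #3: an in-memory user cache that a
// background goroutine refreshes on an interval. Not existing code.
package cache

import (
	"context"
	"strings"
	"sync"
	"time"
)

// User mirrors only the fields the search endpoint needs.
type User struct {
	ID        int64
	FirstName string
	LastName  string
	Email     string
}

// UserLoader abstracts the existing repository query (user_repository.go:89).
type UserLoader interface {
	LoadAllUsers(ctx context.Context) ([]User, error)
}

// UserCache holds a searchable snapshot of all users in memory.
type UserCache struct {
	mu    sync.RWMutex
	users []User
}

// StartSync refreshes the snapshot on an interval. The refresh lag is the
// "eventual consistency" con listed above.
func (c *UserCache) StartSync(ctx context.Context, loader UserLoader, every time.Duration) {
	go func() {
		ticker := time.NewTicker(every)
		defer ticker.Stop()
		for {
			if users, err := loader.LoadAllUsers(ctx); err == nil {
				c.mu.Lock()
				c.users = users
				c.mu.Unlock()
			}
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
			}
		}
	}()
}

// Search is a simple case-insensitive substring match over the snapshot;
// memory use grows with the user count, the main scaling limit noted above.
func (c *UserCache) Search(q string) []User {
	q = strings.ToLower(q)
	c.mu.RLock()
	defer c.mu.RUnlock()
	var hits []User
	for _, u := range c.users {
		if strings.Contains(strings.ToLower(u.FirstName+" "+u.LastName+" "+u.Email), q) {
			hits = append(hits, u)
		}
	}
	return hits
}
```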
#### 4. Materialized View with Triggers

- **Description**: Database materialized view optimized for search, auto-updated via triggers
- **Pros**:
  - Good performance
  - Consistent data
  - Minimal app code changes
- **Cons**:
  - Database-specific (PostgreSQL only)
  - Trigger complexity
  - Harder to debug issues
  - Lock contention on high write volume
- **Effort**: M
- **Code**: Migration needed, user_repository.go:89

### Recommendation

**Recommended Approach**: Database Query Optimization (#1)

**Justification**: Given our current scale (8k users, growing ~20%/year) and team context:

- Elasticsearch is over-engineering for current needs - at ~20% annual growth we won't pass 20k users for roughly 5 years
- The in-memory cache has consistency issues that would affect UX
- Materialized views add database complexity our team hasn't worked with
- Query optimization addresses the immediate pain point with minimal risk
- Can revisit Elasticsearch if we hit 20k+ users or need full-text features

**Risks**:

- May need to revisit in 2-3 years if growth accelerates: Monitor performance metrics, set an alert at 15k users
- Won't support advanced search features: Document the limitation, plan for the future if needed

**Confidence**: High

### Proposed Implementation

**1 follow-up ticket**:

**Summary**: Optimize user search queries with indexes and caching

**Description**:

1. Add a composite index on (last_name, first_name, email)
2. Implement a Redis query cache with a 5-minute TTL (see the sketch below)
3. Optimize the JOIN query in getUsersForSearch
4. Add performance monitoring

**Acceptance Criteria**:

- [ ] Search response time < 200ms at the 95th percentile
- [ ] Database query count reduced from 3 to 1 per search
- [ ] Monitoring dashboard shows performance metrics
- [ ] Load testing validates search against a 10k+ user dataset

**Effort Estimate**: M (1-2 days)

**Code References**:

- user_service.go:245 - Main search function to optimize
- user_repository.go:89 - Database query to modify
- schema.sql:34 - Add index here
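For illustration, here is a minimal sketch of step 2 of the plan, assuming the github.com/redis/go-redis/v9 client; `SearchUsers`, `queryDB`, and the `user-search:` key scheme are hypothetical names, and the existing pattern at cache_service.go:12 may differ:

```go
// Hypothetical sketch of step 2: a read-through Redis cache for search
// results with the 5-minute TTL from the plan. Not existing code.
package search

import (
	"context"
	"encoding/json"
	"time"

	"github.com/redis/go-redis/v9"
)

// User holds the fields returned by the search query.
type User struct {
	ID        int64  `json:"id"`
	FirstName string `json:"first_name"`
	LastName  string `json:"last_name"`
	Email     string `json:"email"`
}

// queryDB stands in for the optimized single query from step 3
// (getUsersForSearch in user_repository.go:89).
type queryDB func(ctx context.Context, q string) ([]User, error)

// SearchUsers checks Redis first and falls back to the database,
// caching the result for 5 minutes.
func SearchUsers(ctx context.Context, rdb *redis.Client, db queryDB, q string) ([]User, error) {
	key := "user-search:" + q
	if raw, err := rdb.Get(ctx, key).Bytes(); err == nil {
		var users []User
		if json.Unmarshal(raw, &users) == nil {
			return users, nil // cache hit
		}
	}
	users, err := db(ctx, q) // cache miss: run the optimized query
	if err != nil {
		return nil, err
	}
	if raw, err := json.Marshal(users); err == nil {
		// Best-effort write: a failed cache set should not fail the search.
		_ = rdb.Set(ctx, key, raw, 5*time.Minute).Err()
	}
	return users, nil
}
```

The 5-minute TTL trades brief staleness for not having to invalidate explicitly - the cache-invalidation complexity flagged under approach #1.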
### References

- PostgreSQL index documentation: https://...
- Existing Redis cache pattern: cache_service.go:12
- Related performance ticket: PI-65432

## Common Pitfalls

### ❌ Shallow Investigation

**Bad**:

- Only considers 1 obvious solution
- Vague references like "the user module"
- No trade-off analysis

**Good**:

- Explores 3-5 distinct approaches
- Specific file:line references
- Honest pros/cons for each

### ❌ Analysis Paralysis

**Bad**:

- Explores 15 different approaches
- Gets lost in theoretical possibilities
- Never makes a clear recommendation

**Good**:

- Focuses on 3-5 viable approaches
- Makes a decision based on team context
- Acknowledges uncertainty but recommends a path

### ❌ Premature Implementation

**Bad**:

- Starts writing code during the SPIKE
- Creates a git worktree
- Implements a "prototype"

**Good**:

- Investigation only
- Code reading and references
- Plan for an implementation ticket

### ❌ Automatic Ticket Creation

**Bad**:

- Creates 5 tickets without developer review
- Breaks work into too many pieces
- Doesn't get approval first

**Good**:

- Proposes an implementation plan
- Waits for developer approval
- Typically creates just 1 ticket

## Time-Boxing

SPIKEs should be time-boxed to prevent over-analysis:

- **Small SPIKE**: 2-4 hours
- **Medium SPIKE**: 1 day
- **Large SPIKE**: 2-3 days

If you hit the time limit:

1. Document what you've learned so far
2. Document what's still unknown
3. Recommend either:
   - Proceeding with current knowledge
   - Extending the SPIKE with specific questions
   - Creating a prototype SPIKE to validate the approach

## Success Criteria

A successful SPIKE:

- ✅ Thoroughly explores the problem space
- ✅ Considers multiple approaches (3-5)
- ✅ Provides specific code references
- ✅ Makes a clear recommendation with justification
- ✅ Creates an actionable plan (typically 1 ticket)
- ✅ Gets developer approval before creating tickets
- ✅ Enables confident implementation

A successful SPIKE does NOT:

- ❌ Implement the solution
- ❌ Create code changes
- ❌ Create tickets without approval
- ❌ Leave the implementation plan vague
- ❌ Only explore 1 obvious solution