Two prompts — run /bug_fix to kick it off. Once you’ve implemented the logging and run the test case, paste the logs into the /analyze_logs prompt.
Enjoy! -Joe (@joenandez)
# **Systematic Debug Process**
**Your Instructions**: Complete all steps in order.
**Debug Analysis Required** - Do not implement fixes yet.
## **STEP 0: IMMEDIATE REPLY**
### **ACTIONS**
- Immediately reply to the user and request details on the bug or issue they are facing before proceeding to the next step.
Use the user's input from Step 0 for Step 1.
## **STEP 1: INVESTIGATE & ANALYZE**
### **ACTIONS**
1. **Understand the issue**
- What SHOULD happen (expected behavior)
- What ACTUALLY happens (current behavior)
- When/where this occurs (specific triggers, environments)
- Error messages (exact text, don't paraphrase) - if any exist
2. **Comprehensive code analysis**
- **Locate relevant files**: Search for components/functions related to the reported behavior
- **Examine the code path**: Trace through the logic that should handle the user's scenario
- **Understand data flow**: Follow how data moves through the system for this use case
- **Check existing tests**: Look for tests covering this functionality (may not exist)
- **Review recent changes**: Check commits/PRs/dependency bumps affecting this area
- **Map dependencies**: Note internal modules/services and external APIs involved
3. **Quick Fix Assessment**
Determine if this is a **QUICK FIX** by checking ALL criteria:
- ✅ Issue location is clearly identified (specific line/file)
- ✅ Root cause is obvious from code inspection alone
- ✅ Fix is a simple, low-risk change (typo, missing check, wrong variable, etc.)
- ✅ No complex logic or architectural changes needed
- ✅ 95%+ confidence the fix will resolve the issue
**CRITICAL**: If even ONE criterion fails → proceed to complex debugging strategy
4. **Generate hypotheses (if not a quick fix)**
Based on code analysis, consider:
- Last working state vs. current state (what changed?)
- Data type/null/undefined issues at transformation points
- Async/timing/race conditions
- Edge cases for the inputs observed
- Environmental differences (dev vs. prod)
- Dependencies/external service failures
- Classify by bug type: Logic / Integration / Data / Config / Race / Resource / Security / UI
5. **Prioritize top 2 hypotheses (if not a quick fix)**
For each top hypothesis, explain:
- Why this matches the symptoms
- Which step in the data flow would fail
- What assumption might be wrong
- Evidence supporting/against
- Confidence level (High/Medium/Low)
6. **Design strategy based on assessment**
- **If QUICK FIX**: Prepare clear fix recommendation (DO NOT implement)
- **If complex debugging needed**: Plan test creation + investigation approach
## **STEP 2: SHARE WITH THE USER**
### **ACTIONS**
Respond to the user in the following format:
**Issue Understanding:**
[Restate the problem showing you comprehend expected vs. actual behavior]
**Code Analysis Summary:**
[Summary of relevant files examined and key findings from code inspection]
**Existing Test Status:**
[Whether tests exist for this functionality, and if any are currently failing]
**Assessment: QUICK FIX or COMPLEX DEBUGGING NEEDED**
### **If QUICK FIX identified:**
**🎯 QUICK FIX RECOMMENDATION:**
**Root Cause:** [Specific issue found - e.g., "Missing null check on line 42 in UserProfile.tsx"]
**Proposed Fix:** [Exact change needed - e.g., "Add `if (!user) return null;` before line 43"]
**Confidence Level:** [High/Medium - explain why you're confident]
**Risk Assessment:** [Low/Medium - explain potential side effects]
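Where it helps, include a short before/after snippet with the recommendation so the proposed change is concrete. A minimal sketch for the hypothetical example above (the component, props, and line numbers are placeholders, not from a real codebase):

```tsx
import React from 'react';

// Hypothetical UserProfile.tsx - component, props, and line numbers are illustrative only
type User = { name: string };

// Current code (buggy): throws "Cannot read properties of null" when `user` arrives as null at runtime
export function UserProfileBefore({ user }: { user: User }) {
  return <div>{user.name}</div>;
}

// Proposed quick fix: guard against a missing user before rendering
export function UserProfileAfter({ user }: { user: User | null }) {
  if (!user) return null; // the one-line fix from the recommendation above
  return <div>{user.name}</div>;
}
```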
**Would you like to confirm this fix approach, or would you prefer that I add debug logging to verify the root cause first?**
### **If COMPLEX DEBUGGING NEEDED:**
**All Hypotheses:**
[List 5-7 hypotheses based on code analysis]
**Top 2 Hypotheses - Detailed:**
- **Most Likely:**
- Why this is #1
- Evidence supporting/against
- Confidence level
- **Second Most Likely:**
- Why this is #2
- Evidence supporting/against
- Confidence level
**🔬 DEBUG STRATEGY:**
1. **Create failing test** to reproduce the issue
2. **Add targeted logging** to prove/disprove hypotheses
3. **Investigation approach** [describe specific steps]
## **STEP 3: IMPLEMENT TEST & DEBUG SETUP**
*(Only if COMPLEX DEBUGGING was chosen)*
### **ACTIONS**
**CRITICAL**: Only proceed with this step if the user chooses complex debugging over a quick fix recommendation.
Once the user has confirmed that choice:
1. **Create a failing test** that reproduces the reported behavior
- Write a minimal test that demonstrates expected vs. actual behavior
- Run the test to confirm it fails as expected
- Use appropriate test framework for the project
2. **Implement targeted debug logs**
- Format: `console.log('[🪳TEMP] checkpoint_name:', { data: value, type: typeof value, timestamp: Date.now() });`
- Place logs at key decision points identified in hypothesis analysis
- Log before/after each suspicious operation
- Ensure logs capture types, values, and timing
3. **Run test with debug output**
- Execute the failing test with debug logs
- Capture the output for analysis
**DO NOT IMPLEMENT ANY FIXES** - Only create the test and add debug logging (a sketch of what this setup might look like follows).
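A minimal sketch of what this setup might look like, assuming a Vitest-style runner and a hypothetical `formatUserName` module (every name here is a placeholder; use the project's real framework and files):

```ts
// user-name.debug.test.ts - hypothetical repro; adapt to the project's actual test framework
import { describe, it, expect } from 'vitest';
import { formatUserName } from '../src/user-name'; // hypothetical module under test

// Temporary debug logging at the decision points identified in the hypotheses
function debugLog(checkpoint: string, value: unknown) {
  console.log(`[🪳TEMP] ${checkpoint}:`, {
    data: value,
    type: typeof value,
    timestamp: Date.now(),
  });
}

describe('formatUserName (bug repro)', () => {
  it('returns "Guest" when the profile is missing (currently fails)', () => {
    const profile = undefined; // the input observed in the bug report

    debugLog('before_format', profile);
    const result = formatUserName(profile);
    debugLog('after_format', result);

    // Expected behavior per the report; on current code this assertion should fail (red)
    expect(result).toBe('Guest');
  });
});
```

Running `npx vitest run user-name.debug.test.ts` should then produce the failing assertion plus the [🪳TEMP] output to paste into /analyze_logs.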
## **STEP 4: FINAL RESPONSE TO THE USER**
*(Only for COMPLEX DEBUGGING scenarios)*
### **ACTIONS**
**CRITICAL**: Only proceed with this step if test creation and debug logging were implemented in Step 3.
Respond to the user in the following format:
**Test Created:**
[Description of the failing test and what it reproduces]
**Test Execution:**
- Command: [`<exact command to run the test>`]
- Current Result: [fails as expected / passes unexpectedly / error]
- Output excerpt: [`<relevant test output>`]
**Debug Setup Complete:**
**Next Steps to Execute:**
1. [How to run the test with debug output]
2. [Where to look for debug logs]
3. [What to look for in the output]
**Expected Log Output if Hypothesis #1 is correct:**
    [🪳TEMP] checkpoint_1: {data: {...}, type: 'object', timestamp: ...}
    [🪳TEMP] checkpoint_2: {data: null, type: 'object', timestamp: ...}   <-- THIS CONFIRMS THE BUG
**Expected Log Output if Hypothesis #2 is correct:**
    [🪳TEMP] checkpoint_1: {data: {...}, type: 'object', timestamp: ...}
    [🪳TEMP] checkpoint_2: {data: {...}, type: 'object', timestamp: ...}
    [🪳TEMP] checkpoint_3: [ERROR - Race condition, data arrived after render]   <-- THIS CONFIRMS THE BUG
**If neither hypothesis is confirmed:** Return with the actual log output for deeper investigation.
**Debugging complete when:**
- Root cause is isolated via test + logs
- Clear fix approach is identified
- Plan for cleaning up `[🪳TEMP]` logs is set
---
## **IMPORTANT REMINDERS FOR ALL SCENARIOS**
- **Always start with comprehensive code analysis** before making assumptions
- **NEVER implement code fixes directly** - only provide recommendations or create debug setup
- **Quick fix recommendations** should be specific and actionable (no test needed)
- **Complex scenarios** require test creation + debug logging strategy
- **Test writing** is conditional on complexity assessment, not mandatory
- **Always confirm** with the user before proceeding with any code changes
**IMPORTANT**: DO NOT implement any fixes yet - first confirm your analysis and proposed approach.
# **Analyze Debug Logs**
Here are the debug logs from the [🪳TEMP] logging we added:
<logs>
$ARGUMENTS
</logs>
Analyze these logs carefully to identify the root cause.
## **If root cause is found:**
**Root Cause Identified:**
- **Confirmed Issue:** [...]
- **Why This Happens:** [...]
- **Why It Wasn’t Caught Earlier:** [...]
- **Impact Analysis:** [...]
**FIRST - Verify Understanding Through Code Analysis:**
- Examine the code around the log lines that revealed the issue
- Trace through the actual implementation at the failure point
- Check for:
- How the problematic values are created/transformed
- What assumptions the code makes
- Related functions that might be affected
- Edge cases the current code doesn't handle
- Confirm your root cause hypothesis by explaining exactly WHY the code fails given what the logs show
**BEFORE FIX - Add a Failing Test (RED):**
- Author a minimal failing test that reproduces the bug (unit preferred; integration if needed)
- Test should verify behavior, not temporary debug code like [🪳TEMP] logs
- Use inputs/signals from the logs to make the repro deterministic (control time/random/network; seed data; mock dependencies)
- Assert the expected (correct) behavior so the test exposes the current faulty behavior or the specific error observed in the logs
- Run the test and confirm it fails on current code (red) before any fix
- Name the test with the issue ID and reference the hypothesis/log lines
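A minimal sketch of such a red test, assuming a Vitest-style runner with fake timers and a mocked `fetch` (the issue ID, module, timeout, and log line references are placeholders):

```ts
// issue-1234.user-sync.test.ts - hypothetical; names and issue ID are placeholders
import { describe, it, expect, vi, afterEach } from 'vitest';
import { syncUser } from '../src/user-sync'; // hypothetical module implicated by the logs

afterEach(() => {
  vi.useRealTimers();
  vi.restoreAllMocks();
});

describe('ISSUE-1234: syncUser drops the response that arrives after a timeout', () => {
  it('still persists the user when the API responds late (hypothesis #2, log lines 12-18)', async () => {
    vi.useFakeTimers(); // control time so the race reproduces deterministically

    // Mock the network so the response arrives after the timeout seen in the logs
    const lateResponse = new Promise((resolve) =>
      setTimeout(() => resolve({ id: 'u1', name: 'Ada' }), 6_000),
    );
    vi.spyOn(globalThis, 'fetch').mockImplementation(
      () => lateResponse.then((data) => new Response(JSON.stringify(data))),
    );

    const pending = syncUser('u1');
    await vi.advanceTimersByTimeAsync(6_000);

    // Expected behavior: the late response is still saved; current code should fail this (red)
    await expect(pending).resolves.toMatchObject({ id: 'u1' });
  });
});
```

The fake timers and mocked fetch make the late-response race reproducible on every run, which is what makes the red result trustworthy.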
**THEN - Solution Assessment:**
- Propose 3-4 different solution approaches
- For each approach, evaluate:
- How it addresses the root cause
- Complexity and maintainability
- Performance implications
- Potential side effects or regressions
- Whether it fixes just symptoms or the underlying problem
- Narrow down to the 1-2 best solutions
- Provide detailed implementation plan for your recommended solution
**IMPORTANT**: DO NOT implement any fixes yet - first confirm your analysis and proposed approach.
**ACTION: Log Management for the fix:**
- **KEEP** [🪳TEMP] logs that helped identify the issue or would catch regressions
- **REMOVE** [🪳TEMP] logs that are now noise (e.g., confirmed working paths)
- **ADD** new [🪳TEMP] logs around the fix area to monitor the solution
- **ADD** [✅VERIFY] logs that will confirm the fix is working
- Explain why each log is kept/removed/added
**IMPORTANT**: DO NOT implement any fixes yet - first confirm your analysis and proposed approach.
Tell me exactly what output I should see in the remaining [🪳TEMP] and new [✅VERIFY] logs if the fix is successful.
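A minimal sketch of what the instrumented fix area might look like after this log management pass (every name is a placeholder, and the fix itself is deliberately not applied yet):

```ts
// Hypothetical fix area in src/user-name.ts - every name here is a placeholder
function buildDisplayName(profile?: { name?: string }): string {
  return profile?.name ?? 'Guest'; // stand-in for the existing helper, stubbed so the sketch runs
}

export function formatUserName(profile?: { name?: string }): string {
  // KEEP: this [🪳TEMP] checkpoint exposed the undefined profile and would catch regressions
  console.log('[🪳TEMP] before_format:', { data: profile, type: typeof profile, timestamp: Date.now() });

  const name = buildDisplayName(profile); // the fix itself is NOT applied yet

  // ADD: [✅VERIFY] log that will confirm the fix once the approach is approved
  console.log('[✅VERIFY] format_result:', { data: name, type: typeof name, timestamp: Date.now() });
  return name;
}

// Expected output if the fix works:
//   [🪳TEMP] before_format:   { data: undefined, type: 'undefined', timestamp: ... }
//   [✅VERIFY] format_result: { data: 'Guest', type: 'string', timestamp: ... }
```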
**If multiple root causes are implied by the logs:**
- Prioritize by user impact and risk
- Stage fixes in logical order, validating each with targeted logs
- Reassess remaining symptoms after each stage before proceeding
## **If root cause is NOT clear:**
**ACTION: update your debugging strategy**
- Explain what the current logs revealed and what's still ambiguous
- **Design your updated debug strategy**
- Plan minimal [🪳TEMP] logs to prove/disprove your top hypotheses
- Log BEFORE and AFTER suspicious operations
- Capture types, not just values
- Include null/undefined checks
- **KEEP** [🪳TEMP] logs that provided useful info
- **REMOVE** [🪳TEMP] logs that created noise or confirmed working paths
- **ADD** 2-3 new targeted [🪳TEMP] logs based on what we learned
- Explain the reasoning for each keep/remove/add decision
- Specify exactly what scenarios/inputs I should test
- Tell me the exact steps to reproduce and capture these refined logs
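A minimal sketch of such a refined second pass, assuming the first round of logs pointed at a hypothetical totals calculation (every name below is a placeholder):

```ts
// Hypothetical suspect area in src/order-totals.ts - every name is a placeholder
function applyDiscounts(items: Array<{ price: number }>): Array<{ price: number }> {
  return items; // stand-in for the existing helper, stubbed so the sketch runs
}

export function calculateTotal(items: Array<{ price: number }>): number {
  // KEEP: this checkpoint already showed the input is well-formed; it stays as the baseline
  console.log('[🪳TEMP] total_input:', { count: items.length, type: typeof items, timestamp: Date.now() });
  // REMOVED: the per-item log inside the pricing loop (noise; that path was confirmed working)

  const discounted = applyDiscounts(items);

  // ADD: capture the value AFTER the suspicious transformation, including type and null checks
  console.log('[🪳TEMP] after_discounts:', { data: discounted, type: typeof discounted, isNull: discounted == null, timestamp: Date.now() });

  const total = discounted.reduce((sum, item) => sum + item.price, 0);

  // ADD: capture the final result so NaN or undefined propagation is visible in the logs
  console.log('[🪳TEMP] total_result:', { data: total, type: typeof total, isNaN: Number.isNaN(total), timestamp: Date.now() });
  return total;
}
```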
**IMPORTANT**: DO NOT implement any fixes yet - first confirm your analysis and proposed approach.