feat: Overnight Agent Report Improvements

Overnight Agent Report Improvements

Overview

Three improvements based on the first test run: (1) flat research trace log, (2) enforce questions and next steps in every report, (3) duration-adaptive exploration with branch budgets.

Problem Statement

The 20-minute test run produced 3 solid findings but: zero questions logged, no “next steps” section, no visibility into what the agent searched, and the 60-min synthesis buffer makes short runs mathematically broken.

Changes (2 files)

.claude/skills/overnight/SKILL.md

  1. Flat trace log. After each search/visit/decision, append a one-liner to trace.md. Chronological, no nesting, no parent pointers. Survives crashes (append-only file). Included in report as-is.

  2. Mandatory report sections. Synthesis Mode now has a checklist — agent must write all sections before creating DONE. Questions and Next Directions sections “MUST NOT be empty.” No numeric minimums — emphasis over quotas.

  3. Proportional synthesis buffer. Replaced fixed 60-minute rule with: 25% of total duration, minimum 5 minutes. Agent reads branch budget from CLAUDE.md instead of pacing by wall-clock time.

scripts/overnight-launch.sh

  1. Branch budget in CLAUDE.md. Launch script calculates and writes concrete branch counts based on duration: 1-2 for 20min, 3-6 for 1-2hr, 2-3 per hour for longer runs. Also writes synthesis buffer in minutes.

What was dropped (per reviewer feedback)

  • HTML report generation — agent should think during synthesis, not write CSS. Use pandoc later if needed.
  • Nested trace tree with parent pointers — too fragile for an LLM to maintain. Flat log gives 90% of the value.
  • Numeric minimums (min 3 questions, 1-per-branch) — produces filler. Emphasis works better.
  • Time-based pacing (~2 branches/hour) — agent can’t track wall-clock time mid-execution. Branch count budgets instead.

Acceptance Criteria

  • Trace survives crashes (append-only trace.md, written to disk after each action)
  • report.md always contains: summary, findings, trace, questions (non-empty), next directions (non-empty), sources
  • 20-minute runs complete a full explore→synthesize→report cycle
  • Synthesis buffer is proportional: max(5 min, 25% of duration)

Test Plan

  • Run 20-minute test: verify findings + trace + questions + next steps
  • Verify trace.md has entries after each search action
  • Verify report.md has all 6 required sections with non-empty content
  • Stop mid-run and restart: verify trace.md survived and agent continues appending