Fix background summarization fallback gaps and improve summarization budget #4981
…budget

- Record failure metadata on turn for all background compaction noResult paths, matching how foreground compaction tracks failures
- Add foreground summarization fallback when post-render background compaction produces no usable result at ≥95% context usage
- Use unreduced endpoint for both foreground and background summarization since they are separate LLM calls that don't share token space
- Filter deferred tool schemas from summarization prompt when Anthropic tool search is enabled to prevent budget_exceeded in Full mode
- Add Full→Simple fallback log and SimulateSummarizationError debug setting
- Deduplicate explanatory comments at summarization call sites
- Add tests for background summarizer noResult state machine paths and background failure metadata recording on turns
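The noResult handling and the ≥95% foreground fallback listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `agentIntent.ts`: the types `BackgroundOutcome`, `TurnMetadata`, and `NextStep` and the function `handleBackgroundOutcome` are hypothetical names invented for the example.

```typescript
// Hypothetical types for illustration; the real agent loop uses its own.
type BackgroundOutcome =
    | { kind: 'completed'; summary: string }
    | { kind: 'noResult'; reason: string };

interface TurnMetadata {
    // Assumed field: reasons for summarization failures recorded on the turn.
    summarizationFailures: string[];
}

type NextStep = 'applySummary' | 'foregroundFallback' | 'continue';

// The ">=95% context usage" threshold from the description above.
const FOREGROUND_FALLBACK_THRESHOLD = 0.95;

function handleBackgroundOutcome(
    outcome: BackgroundOutcome,
    contextUsageRatio: number,
    turn: TurnMetadata,
): NextStep {
    if (outcome.kind === 'completed') {
        return 'applySummary';
    }
    // Every noResult path records the failure on the turn.
    turn.summarizationFailures.push(outcome.reason);
    // Only fall back to a blocking foreground summarization near the limit.
    return contextUsageRatio >= FOREGROUND_FALLBACK_THRESHOLD
        ? 'foregroundFallback'
        : 'continue';
}
```

In this sketch the failure is recorded before the threshold check, so a noResult outcome is never silently dropped, whichever branch is taken afterwards.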
Pull request overview
This PR tightens the agent’s background/foreground summarization fallback behavior, increases the effective token budget used for separate summarization calls, and reduces summarization prompt bloat when Anthropic tool search is enabled by excluding deferred tool schemas.
Changes:
- Records background-compaction “noResult” failures on the turn metadata and adds a post-render foreground fallback when background compaction produces no usable result at high context usage.
- Runs summarization renders against the unreduced endpoint budget and filters summarization `availableTools` to non-deferred tools under Anthropic tool search.
- Adds/extends unit tests for background summarizer noResult sequencing and metadata recording contracts.
| File | Description |
|---|---|
| src/extension/intents/node/agentIntent.ts | Improves background compaction noResult handling, adds foreground fallback, uses full endpoint budget for summarization renders, and filters deferred tools from summarization context. |
| src/extension/prompts/node/agent/test/backgroundSummarizer.spec.ts | Adds tests for failure/noResult and cancellation sequencing of BackgroundSummarizer. |
| src/extension/prompts/node/agent/test/summarization.spec.tsx | Adds tests around summarization failure metadata contracts for background vs foreground paths. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 3
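The tool filtering mentioned in the overview above might look like the sketch below. The `ToolInfo` shape and its `deferred` flag are assumptions made for the example; the actual tool type and the way the codebase identifies deferred tools may differ.

```typescript
// Hypothetical tool shape; not the real type used by the extension.
interface ToolInfo {
    name: string;
    // Assumed flag: the schema is loaded lazily through tool search.
    deferred: boolean;
}

// Returns the tools whose schemas should be rendered into the summarization prompt.
function toolsForSummarization(
    availableTools: readonly ToolInfo[],
    toolSearchEnabled: boolean,
): ToolInfo[] {
    if (!toolSearchEnabled) {
        // Without tool search nothing is deferred, so keep every tool.
        return [...availableTools];
    }
    return availableTools.filter(tool => !tool.deferred);
}
```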
c8cba52 to f071121
Pull request overview
This PR fixes a token-budget regression in Agent background summarization when Anthropic tool search is enabled, preventing unnecessary budget_exceeded errors and reducing fallback to Simple summarization.
Changes:
- Ensures summarization renders use the full (non-tool-reduced) endpoint budget and filter out deferred tool schemas when tool search is enabled.
- Improves background compaction orchestration by recording “no result” outcomes and adding additional foreground fallback behavior in post-render blocking cases.
- Adds unit tests to validate `BackgroundSummarizer` behavior for failure swallowing and cancellation races.
| File | Description |
|---|---|
| src/extension/prompts/node/agent/test/backgroundSummarizer.spec.ts | Adds tests for failure/cancellation behavior and completion-wait semantics. |
| src/extension/intents/node/agentIntent.ts | Adjusts endpoint/tool handling for summarization, improves fallback paths, and records background compaction failures. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 1
- Remove unused endpoint parameter from _startBackgroundSummarization
- Remove duplicate backgroundSummarizer test (overlaps with existing)
- Clarify summarization.spec.tsx tests as contract tests
f071121 to 8f0b7ec
Pull request overview
This PR addresses background conversation summarization reliability and token-budget handling, particularly when Anthropic tool search is enabled, to reduce unnecessary budget_exceeded fallbacks and improve compaction behavior in the agent loop.
Changes:
- Filters deferred tool schemas out of summarization prompt contexts when Anthropic tool search is enabled, improving available summarization budget.
- Adds additional orchestration/fallback handling and telemetry/metadata recording when background compaction completes with no usable result.
- Extends `BackgroundSummarizer` unit tests to cover failure + cancellation races around `waitForCompletion()`.
| File | Description |
|---|---|
| src/extension/intents/node/agentIntent.ts | Adjusts agent-loop background compaction/summarization paths (budget handling, tool filtering for summarization, and new failure recording). |
| src/extension/prompts/node/agent/test/backgroundSummarizer.spec.ts | Adds tests for waitForCompletion() behavior on failures and cancellation races. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 1
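To make the failure-swallowing and cancellation-race behavior discussed in this review concrete, here is a small stand-in state machine. It is not the real `BackgroundSummarizer`: the class name `SketchBackgroundSummarizer`, its states, and the `consumeResult` method are assumptions chosen only to illustrate the contract that `waitForCompletion()` never rejects.

```typescript
// Minimal stand-in, NOT the real BackgroundSummarizer implementation.
type SummarizerState = 'idle' | 'inProgress' | 'completed' | 'failed';

class SketchBackgroundSummarizer<T> {
    private _state: SummarizerState = 'idle';
    private _result: T | undefined;
    private _pending: Promise<void> | undefined;

    get state(): SummarizerState {
        return this._state;
    }

    start(work: () => Promise<T>): void {
        this._state = 'inProgress';
        // Both handlers check the state so a late settle after cancel() is ignored.
        this._pending = work().then(
            result => {
                if (this._state === 'inProgress') {
                    this._result = result;
                    this._state = 'completed';
                }
            },
            () => {
                // The failure is swallowed here and surfaces only as a state.
                if (this._state === 'inProgress') {
                    this._state = 'failed';
                }
            },
        );
    }

    cancel(): void {
        this._state = 'idle';
        this._result = undefined;
    }

    // Resolves once the work has settled; never rejects.
    async waitForCompletion(): Promise<void> {
        await this._pending;
    }

    // Returns the summary if one was produced, then resets to idle.
    consumeResult(): T | undefined {
        const result = this._state === 'completed' ? this._result : undefined;
        this._result = undefined;
        this._state = 'idle';
        return result;
    }
}
```

A caller that awaits `waitForCompletion()` and then gets `undefined` from `consumeResult()` is on a "no usable result" path, which is where failure metadata would be recorded.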
Fixes: microsoft/vscode#299810
I need to clean this up and will do so in an updated PR for 116. For now, the refactor is tracked here: microsoft/vscode#307860
With Anthropic tool search enabled, summarization was getting double-penalized:

- The budget came from the tool-reduced endpoint (`(baseBudget - toolTokens) * 0.9` = ~133k instead of 168k)
- Tool schemas were then counted in full in the summarization prompt: with `ChatLocation.Other`, none get `defer_loading: true`, so all count fully

Deferred tool tokens therefore ate into the room left for conversation history in Full mode, causing frequent `budget_exceeded` → Simple mode fallback. After the fix, summarization gets 168k with only non-deferred tool schemas, leaving more room for history.
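The double penalty can be worked through with a short arithmetic sketch. The token counts below are round, made-up numbers rather than the values behind the ~133k and 168k figures, and the assumption that the unreduced budget is simply `baseBudget * 0.9` is mine; the helper names are hypothetical.

```typescript
// Illustrative numbers only; they are not the actual values from this PR.
// The 0.9 factor is written as 9/10 so the integer arithmetic stays exact.
function reducedBudget(baseBudget: number, toolTokens: number): number {
    return Math.floor(((baseBudget - toolTokens) * 9) / 10);
}

// Assumption: the unreduced endpoint applies the same factor to the full base.
function unreducedBudget(baseBudget: number): number {
    return Math.floor((baseBudget * 9) / 10);
}

// Tokens left for conversation history once tool schemas are rendered.
function historyRoom(budget: number, renderedToolTokens: number): number {
    return budget - renderedToolTokens;
}

const baseBudget = 200_000;           // hypothetical model context budget
const allToolTokens = 40_000;         // hypothetical size of every tool schema
const nonDeferredToolTokens = 10_000; // hypothetical size of non-deferred schemas

// Before: tools subtracted from the budget and rendered in full again.
const before = historyRoom(reducedBudget(baseBudget, allToolTokens), allToolTokens);

// After: full budget, and only non-deferred schemas are rendered.
const after = historyRoom(unreducedBudget(baseBudget), nonDeferredToolTokens);

console.log({ before, after });
```

With these example inputs the tool schemas are paid for twice in the "before" case, which is the effect the fix removes.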