This repository was archived by the owner on May 20, 2026. It is now read-only.

Fix background summarization fallback gaps and improve summarization budget #4981

Merged
bhavyaus merged 2 commits into main from dev/bhavyau/summarization-tests on Apr 5, 2026

@bhavyaus bhavyaus commented Apr 5, 2026

Fixes: microsoft/vscode#299810

I need to clean this up and will do so in an updated PR for 116. For now, the refactor is tracked here:

microsoft/vscode#307860


With Anthropic tool search enabled, summarization was getting double-penalized:

  1. The summarization call used the reduced endpoint budget: (baseBudget - toolTokens) * 0.9 = ~133k instead of 168k.
  2. All tool schemas were passed to the summarization LLM call, but because that call uses ChatLocation.Other, none get defer_loading: true, so every schema counts in full.

With the deferred tool tokens also counted, too little of the budget was left for conversation history in Full mode, causing frequent budget_exceeded → Simple mode fallback. After the fix, summarization gets 168k with only non-deferred tool schemas, leaving the remainder for history.
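
The arithmetic behind the double penalty can be sketched roughly as follows. This is only an illustration, not the code in agentIntent.ts: the function names are hypothetical, and the sample values (a 168k unreduced budget, 20k of total tool tokens, 4k of non-deferred tool tokens) are assumptions chosen so that the reduced figure lands near the ~133k quoted above.

```typescript
// Only the formula (baseBudget - toolTokens) * 0.9 comes from the description above.
// Function names and sample numbers are hypothetical.
const SAFETY_FACTOR = 0.9;

/** Budget of the tool-reduced endpoint. */
function reducedBudget(baseBudget: number, toolTokens: number): number {
  return Math.round((baseBudget - toolTokens) * SAFETY_FACTOR);
}

/** Tokens left for history after the tool schemas sent with the call are counted. */
function historyBudget(promptBudget: number, schemaTokens: number): number {
  return Math.max(0, promptBudget - schemaTokens);
}

const baseBudget = 168_000;          // assumed: the unreduced budget
const allToolTokens = 20_000;        // assumed: every schema, deferred or not
const nonDeferredToolTokens = 4_000; // assumed: schemas that are always loaded

// Before: reduced endpoint, and every schema counted again in the prompt.
const historyBefore = historyBudget(reducedBudget(baseBudget, allToolTokens), allToolTokens);
// After: unreduced endpoint, only non-deferred schemas in the prompt.
const historyAfter = historyBudget(baseBudget, nonDeferredToolTokens);

console.log({ historyBefore, historyAfter });
```

Under these assumed inputs the history budget grows from 113,200 to 164,000 tokens; the real figures depend on the model and on how many tools are registered.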

…budget

- Record failure metadata on turn for all background compaction noResult
  paths, matching how foreground compaction tracks failures
- Add foreground summarization fallback when post-render background
  compaction produces no usable result at ≥95% context usage
- Use unreduced endpoint for both foreground and background summarization
  since they are separate LLM calls that don't share token space
- Filter deferred tool schemas from summarization prompt when Anthropic
  tool search is enabled to prevent budget_exceeded in Full mode
- Add Full→Simple fallback log and SimulateSummarizationError debug setting
- Deduplicate explanatory comments at summarization call sites
- Add tests for background summarizer noResult state machine paths
  and background failure metadata recording on turns
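
The noResult handling described in the first two bullets could look roughly like the sketch below. Every type and function name here is hypothetical; only the "noResult" wording and the 95% threshold come from the commit message, and the actual control flow in agentIntent.ts may differ.

```typescript
// All names here are hypothetical; only the 95% threshold and the
// "noResult" wording come from the commit message.
type BackgroundOutcome =
  | { kind: 'summary'; text: string }
  | { kind: 'noResult'; reason: string };

interface TurnRecord {
  summarizationFailure?: { source: 'background' | 'foreground'; reason: string };
}

type NextStep = 'applySummary' | 'foregroundFallback' | 'continue';

const FOREGROUND_FALLBACK_RATIO = 0.95;

function afterBackgroundCompaction(
  outcome: BackgroundOutcome,
  usedTokens: number,
  contextWindow: number,
  turn: TurnRecord,
): NextStep {
  if (outcome.kind === 'summary') {
    return 'applySummary';
  }
  // Record the failure on the turn for every noResult path.
  turn.summarizationFailure = { source: 'background', reason: outcome.reason };
  // Fall back to a blocking foreground summary only when the context is nearly full.
  return usedTokens / contextWindow >= FOREGROUND_FALLBACK_RATIO
    ? 'foregroundFallback'
    : 'continue';
}
```

Recording the failure before deciding on the fallback keeps the turn metadata consistent whether or not a foreground summary is attempted afterwards.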
Copilot AI review requested due to automatic review settings April 5, 2026 01:05

Copilot AI left a comment

Pull request overview

This PR tightens the agent’s background/foreground summarization fallback behavior, increases the effective token budget used for separate summarization calls, and reduces summarization prompt bloat when Anthropic tool search is enabled by excluding deferred tool schemas.

Changes:

  • Records background-compaction “noResult” failures on the turn metadata and adds a post-render foreground fallback when background compaction produces no usable result at high context usage.
  • Runs summarization renders against the unreduced endpoint budget and filters summarization availableTools to non-deferred tools under Anthropic tool search.
  • Adds/extends unit tests for background summarizer noResult sequencing and metadata recording contracts.
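
As a rough illustration of the tool filtering mentioned in the second bullet, the sketch below drops deferred schemas before a summarization render. The ToolSchema shape, the function name, and the isDeferred predicate are all assumptions made for the example; the extension's real types and deferral check are not shown in this conversation.

```typescript
// Hypothetical shapes; the real tool type and deferral check live in the extension.
interface ToolSchema {
  name: string;
  description: string;
}

function toolsForSummarization(
  tools: readonly ToolSchema[],
  toolSearchEnabled: boolean,
  isDeferred: (tool: ToolSchema) => boolean,
): ToolSchema[] {
  // Without tool search nothing is deferred, so pass every schema through.
  if (!toolSearchEnabled) {
    return [...tools];
  }
  // With tool search, drop schemas that would be deferred in the main request.
  return tools.filter(tool => !isDeferred(tool));
}

// Made-up tool names, for illustration only.
const sampleTools: ToolSchema[] = [
  { name: 'core_tool', description: 'always loaded' },
  { name: 'rarely_used_tool', description: 'loaded on demand' },
];
const deferredNames = new Set(['rarely_used_tool']);

const kept = toolsForSummarization(sampleTools, true, tool => deferredNames.has(tool.name));
console.log(kept.map(tool => tool.name));
```

Passing the deferral check in as a predicate keeps the filter independent of how a given endpoint decides which tools are deferred.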
| File | Description |
| --- | --- |
| src/extension/intents/node/agentIntent.ts | Improves background compaction noResult handling, adds foreground fallback, uses full endpoint budget for summarization renders, and filters deferred tools from summarization context. |
| src/extension/prompts/node/agent/test/backgroundSummarizer.spec.ts | Adds tests for failure/noResult and cancellation sequencing of BackgroundSummarizer. |
| src/extension/prompts/node/agent/test/summarization.spec.tsx | Adds tests around summarization failure metadata contracts for background vs foreground paths. |

Copilot's findings

  • Files reviewed: 3/3 changed files
  • Comments generated: 3

Comment thread src/extension/intents/node/agentIntent.ts
Comment thread src/extension/prompts/node/agent/test/backgroundSummarizer.spec.ts Outdated
Comment thread src/extension/prompts/node/agent/test/summarization.spec.tsx Outdated

Copilot AI left a comment

Pull request overview

This PR fixes a token-budget regression in Agent background summarization when Anthropic tool search is enabled, preventing unnecessary budget_exceeded errors and reducing fallback to Simple summarization.

Changes:

  • Ensures summarization renders use the full (non-tool-reduced) endpoint budget and filter out deferred tool schemas when tool search is enabled.
  • Improves background compaction orchestration by recording “no result” outcomes and adding a foreground fallback for post-render blocking cases.
  • Adds unit tests to validate BackgroundSummarizer behavior for failure swallowing and cancellation races.
| File | Description |
| --- | --- |
| src/extension/prompts/node/agent/test/backgroundSummarizer.spec.ts | Adds tests for failure/cancellation behavior and completion-wait semantics. |
| src/extension/intents/node/agentIntent.ts | Adjusts endpoint/tool handling for summarization, improves fallback paths, and records background compaction failures. |

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

Comment thread src/extension/prompts/node/agent/test/backgroundSummarizer.spec.ts
- Remove unused endpoint parameter from _startBackgroundSummarization
- Remove duplicate backgroundSummarizer test (overlaps with existing)
- Clarify summarization.spec.tsx tests as contract tests
@bhavyaus bhavyaus force-pushed the dev/bhavyau/summarization-tests branch from f071121 to 8f0b7ec Compare April 5, 2026 01:42
@bhavyaus bhavyaus requested a review from Copilot April 5, 2026 01:43

Copilot AI left a comment

Pull request overview

This PR addresses background conversation summarization reliability and token-budget handling, particularly when Anthropic tool search is enabled, to reduce unnecessary budget_exceeded fallbacks and improve compaction behavior in the agent loop.

Changes:

  • Filters deferred tool schemas out of summarization prompt contexts when Anthropic tool search is enabled, improving available summarization budget.
  • Adds orchestration/fallback handling and telemetry/metadata recording for cases where background compaction completes with no usable result.
  • Extends BackgroundSummarizer unit tests to cover failure + cancellation races around waitForCompletion().
| File | Description |
| --- | --- |
| src/extension/intents/node/agentIntent.ts | Adjusts agent-loop background compaction/summarization paths (budget handling, tool filtering for summarization, and new failure recording). |
| src/extension/prompts/node/agent/test/backgroundSummarizer.spec.ts | Adds tests for waitForCompletion() behavior on failures and cancellation races. |

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

Comment thread src/extension/intents/node/agentIntent.ts
@bhavyaus bhavyaus added this pull request to the merge queue Apr 5, 2026
Merged via the queue into main with commit 59434e4 Apr 5, 2026
23 checks passed
@bhavyaus bhavyaus deleted the dev/bhavyau/summarization-tests branch April 5, 2026 04:48

Development

Successfully merging this pull request may close these issues.

GitHub Copilot context window meter accumulates ghost data and continuously forces compaction in v1.110+

3 participants