This repository was archived by the owner on May 20, 2026. It is now read-only.
Inline summarization: summarize within the agent loop for maximum prompt cache hits (#4956)
* Add inline summarization feature for agent conversation history
- Introduced configuration option for inline summarization in package.json and configurationService.ts.
- Updated agentIntent.ts to handle inline summarization logic during conversation.
- Modified summarizedConversationHistory.tsx to support inline summarization instructions.
- Enhanced tests to cover inline summarization scenarios and extraction of inline summaries.
* Remove cache-friendly summarization prompt and related configurations
* Refactor inline summarization handling in ToolCallingLoop and add summary application method
* Add failure telemetry, deferred cleanup, and debugName tracking for inline summarization
* Address PR review: fix empty string check, telemetry counts, cache token reporting, and test naming
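
The commit message describes asking for the summary inside the agent loop, extracting it from the model's reply, and applying it to the history. The following is a minimal sketch of that flow, not the extension's actual code: the `Message` type, the `<summary>` tag convention, and the functions `withInlineSummaryRequest`, `extractInlineSummary`, and `applySummary` are all hypothetical names chosen for illustration. The idea it illustrates is that the summarization request is appended after the existing messages, so the earlier prefix is unchanged and remains eligible for prompt-cache reuse.

```typescript
// Hypothetical types and names for illustration only; not the extension's API.
type Role = "system" | "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

// Assumed marker tags that the model is asked to wrap its summary in.
const SUMMARY_OPEN = "<summary>";
const SUMMARY_CLOSE = "</summary>";

// Append the summarization instruction as the newest message. The existing
// messages are reused unchanged, so the prompt prefix stays identical.
function withInlineSummaryRequest(history: Message[]): Message[] {
  return [
    ...history,
    {
      role: "user",
      content: `Summarize the conversation so far inside ${SUMMARY_OPEN}...${SUMMARY_CLOSE}.`,
    },
  ];
}

// Return the trimmed summary, or undefined when the tags are missing or the
// summary is empty (including whitespace-only).
function extractInlineSummary(response: string): string | undefined {
  const start = response.indexOf(SUMMARY_OPEN);
  if (start === -1) return undefined;
  const end = response.indexOf(SUMMARY_CLOSE, start + SUMMARY_OPEN.length);
  if (end === -1) return undefined;
  const summary = response.slice(start + SUMMARY_OPEN.length, end).trim();
  return summary.length > 0 ? summary : undefined;
}

// Replace older turns with the summary, keeping system messages and the
// last `keepLast` non-system messages verbatim.
function applySummary(history: Message[], summary: string, keepLast: number): Message[] {
  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");
  const tail = keepLast > 0 ? rest.slice(-keepLast) : [];
  return [
    ...system,
    { role: "user", content: `Summary of earlier conversation:\n${summary}` },
    ...tail,
  ];
}
```

In this sketch a failed extraction returns `undefined`, so a caller can leave the history untouched and record the failure rather than applying an empty summary.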
package.nls.json (1 addition, 1 deletion)
@@ -391,7 +391,7 @@
 "github.copilot.config.summarizeAgentConversationHistoryThreshold": "Threshold for compacting agent conversation history.",
 "github.copilot.config.agentHistorySummarizationMode": "Mode for agent history summarization.",
 "github.copilot.config.backgroundCompaction": "Enable background compaction of conversation history.",
-"github.copilot.config.agentHistorySummarizationCacheFriendly": "Use a cache-friendly summarization prompt that shares the agent prefix for prompt cache hits.",
+"github.copilot.config.agentHistorySummarizationInline": "Summarize conversation inline within the agent loop instead of a separate LLM call, maximizing prompt cache hits.",

 "github.copilot.config.useResponsesApiTruncation": "Use Responses API for truncation.",
 "github.copilot.config.enableReadFileV2": "Enable version 2 of the read file tool.",