Improve token caching. by jsourcebot · Pull Request #1366 · sourcebot-dev/sourcebot

jsourcebot · 2026-06-23T18:54:52Z

Improved Ask Sourcebot prompt caching by splitting static and dynamic prompt sections and advancing cache breakpoints after every agent step instead of only after each message.

Summary by CodeRabbit

New Features
- Enhanced Enterprise “Ask Sourcebot” prompt caching with provider-aware strategies, separating byte-stable static vs dynamic prompt content.
- Added server controls to enable caching, set static TTL, and optionally detect cache misses/breakpoints.
- Cache breakpoints now advance after each agent step (not just per message).
Bug Fixes
- Stabilized prompt/tool byte layout by making MCP tool/server and repository ordering deterministic.
Tests
- Expanded coverage for cache markers, TTL behavior, no-op providers, and static prompt byte identity across repo selections.
Chores
- Updated changelog to document the improved caching behavior.

coderabbitai · 2026-06-23T18:55:09Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Walkthrough

Adds a PromptCacheStrategy abstraction to the EE chat agent that separates static and dynamic prompt sections, enforces deterministic byte ordering in MCP tool/client queries, and attaches Anthropic ephemeral cache-control markers to the static system block and per-step tail message. The strategy is wired through all chat entry points via three new env flags.
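The provider-aware resolution described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `getPromptCacheStrategy` and `cacheControl({ ttl })` are named in the walkthrough and sequence diagram, but the `ProviderOptions` type, the `'anthropic'` provider id string, and the exact marker layout are assumptions modeled on the `providerOptions.anthropic.cacheControl` shape mentioned for the predecessor PR.

```typescript
// Hypothetical shapes for illustration; the PR's real interface may differ.
type CacheTtl = '5m' | '1h';

// Provider-keyed options, in the shape forwarded to the model provider.
type ProviderOptions = Record<string, Record<string, unknown>>;

interface PromptCacheStrategy {
    // Returns the marker to attach, or undefined when caching is a no-op.
    cacheControl(opts?: { ttl?: CacheTtl }): ProviderOptions | undefined;
}

// Used for providers without explicit cache markers, or when caching is disabled.
const noopStrategy: PromptCacheStrategy = {
    cacheControl: () => undefined,
};

// Emits an Anthropic ephemeral marker, with a TTL only when one is requested.
const anthropicEphemeralStrategy: PromptCacheStrategy = {
    cacheControl: (opts) => ({
        anthropic: {
            cacheControl: opts?.ttl
                ? { type: 'ephemeral', ttl: opts.ttl }
                : { type: 'ephemeral' },
        },
    }),
};

// Assumed signature: a provider id plus the enable flag.
function getPromptCacheStrategy(provider: string, enabled: boolean): PromptCacheStrategy {
    if (!enabled) return noopStrategy;
    return provider === 'anthropic' ? anthropicEphemeralStrategy : noopStrategy;
}
```

Returning `undefined` from the no-op strategy lets call sites attach the marker unconditionally without branching on the provider.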

Changes

EE Ask Sourcebot Prompt Caching

| Layer / File(s) | Summary |
|---|---|
| **PromptCacheStrategy module and env flags**<br>`packages/shared/src/env.server.ts`, `packages/web/src/ee/features/chat/promptCaching.ts` | Defines `CacheTtl`, `PromptCacheStrategy` interface, no-op and Anthropic ephemeral strategies, `getPromptCacheStrategy`, `mergeProviderOptions`, `detectPromptCacheBreak`, and `detectUnexpectedCacheMiss`. Adds three env schema fields for static prefix enable, TTL selection (`5m`/`1h`), and break-detection toggle. |
| **Deterministic MCP and tool ordering**<br>`packages/web/src/ee/features/chat/mcp/mcpClientFactory.ts`, `packages/web/src/ee/features/chat/mcp/mcpToolRegistry.ts` | Adds `orderBy: { serverId: 'asc' }` to the `getConnectedMcpClients` Prisma query. Sorts tool entries by name in `buildMcpToolRegistry` before mapping to stabilize byte layout across requests. |
| **Agent prompt split and caching wiring**<br>`packages/web/src/ee/features/chat/agent.ts` | Extends `CreateMessageStreamResponseProps` and `AgentOptions` with `promptCacheStrategy`. Sorts `selectedRepos` for byte stability. Refactors `createPrompt` into `staticPrompt`/`dynamicPrompt` via a `dynamicSections` array. Implements static-prefix mode with `activationToolMarker`, applies `tailMarker` per step in `prepareStep`, and gates observability helpers behind env flags. |
| **Strategy wiring at chat entry points**<br>`packages/web/src/app/api/(server)/ee/chat/route.ts`, `packages/web/src/ee/features/mcp/askCodebase.ts` | Computes `promptCacheStrategy` from the selected provider and `SOURCEBOT_CHAT_PROMPT_CACHING_ENABLED` flag in both `chat/route.ts` and `askCodebase.ts`, then passes it into `createMessageStream`. |
| **Tests and changelog**<br>`packages/web/src/ee/features/chat/promptCaching.test.ts`, `packages/web/src/ee/features/chat/agent.test.ts`, `CHANGELOG.md` | Adds `promptCaching.test.ts` covering strategy and merge behavior across providers and TTL variants. Extends `agent.test.ts` with a full caching suite asserting static-block markers, per-step tail-marker relocation, non-Anthropic no-ops, tool marker presence, and static prompt byte identity. Updates changelog. |
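To illustrate why the ordering and prompt-split rows matter for cache hits, here is a small sketch under stated assumptions. `ToolEntry`, `stableToolOrder`, and `buildPrompts` are hypothetical names, and the prompt text is invented; only the ideas (sort tools by name, sort selected repos, keep repo-dependent text out of the static block) come from the summary above.

```typescript
// Hypothetical tool entry; only the name matters for ordering.
interface ToolEntry {
    name: string;
    description: string;
}

// Compare by UTF-16 code units instead of localeCompare, so the resulting
// order does not depend on the host's locale settings.
const byName = (a: ToolEntry, b: ToolEntry): number =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0;

// Returns a sorted copy so the serialized tool list is identical across requests.
function stableToolOrder(tools: ToolEntry[]): ToolEntry[] {
    return [...tools].sort(byName);
}

// Keeps repo-dependent text in the dynamic prompt so the static prompt stays
// byte-identical regardless of which repositories are selected.
function buildPrompts(staticInstructions: string, selectedRepos: string[]) {
    const sortedRepos = [...selectedRepos].sort();
    return {
        staticPrompt: staticInstructions,
        dynamicPrompt: `Selected repositories:\n${sortedRepos.map((r) => `- ${r}`).join('\n')}`,
    };
}
```

With this split, two chats that select different repositories still share the same static prefix bytes, which is the property the new "static prompt byte identity" tests are described as asserting.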

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant ChatRoute as chat/route.ts
  participant Agent as createAgentStream
  participant Prompt as createPrompt
  participant CacheStrategy as PromptCacheStrategy
  participant StreamText as streamText (AI SDK)

  Client->>ChatRoute: POST /api/ee/chat
  ChatRoute->>CacheStrategy: getPromptCacheStrategy(provider, enabled)
  CacheStrategy-->>ChatRoute: strategy (Anthropic or no-op)
  ChatRoute->>Agent: createMessageStream({ promptCacheStrategy })
  Agent->>Prompt: createPrompt(sortedRepos)
  Prompt-->>Agent: { staticPrompt, dynamicPrompt }
  Agent->>CacheStrategy: strategy.cacheControl({ ttl: staticTtl })
  CacheStrategy-->>Agent: staticMarker (providerOptions)
  Agent->>StreamText: systemMessages[0] with staticMarker + dynamicPrompt
  loop each agent step
    StreamText->>Agent: prepareStep(stepMessages)
    Agent->>Agent: move tailMarker onto last message
    Agent-->>StreamText: messages with tailMarker applied
    StreamText-->>Agent: stepResult { cacheReadTokens }
    Agent->>Agent: detectUnexpectedCacheMiss(stepIndex, cacheReadTokens)
  end
  StreamText-->>Client: streamed response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

sourcebot-dev/sourcebot#1278: Implements the initial Anthropic prompt caching by wiring providerOptions.anthropic.cacheControl into the chat agent's streamText call and adding token-cache metrics to the UI — the direct predecessor to this PR's strategy abstraction and break-detection layer.

Suggested reviewers

brendan-kellam

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title "Improve token caching" is vague and generic, using non-specific language that doesn't convey the specific changes made to prompt caching architecture. | Consider a more descriptive title that captures the key changes, such as "Split static/dynamic prompt sections and advance cache breakpoints per agent step" or "Improve prompt caching with static prefix separation and per-step breakpoints". |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CHANGELOG.md`:
- Line 11: In the CHANGELOG.md file, locate the line containing the Ask
Sourcebot prompt caching improvement entry under the [Unreleased] section.
Replace both instances of the placeholder `<id>` in the markdown link reference
`[#<id>](https://github.com/sourcebot-dev/sourcebot/pull/<id>)` with the actual
pull request number for this change. The same numeric PR id should be used in
both the link text and the URL to create a valid GitHub PR link.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 70bb4df8-9bdb-46d6-994c-b4004e501537

📥 Commits

Reviewing files that changed from the base of the PR and between 889e2b1 and 05e306d.

📒 Files selected for processing (10)

CHANGELOG.md
packages/shared/src/env.server.ts
packages/web/src/app/api/(server)/ee/chat/route.ts
packages/web/src/ee/features/chat/agent.test.ts
packages/web/src/ee/features/chat/agent.ts
packages/web/src/ee/features/chat/mcp/mcpClientFactory.ts
packages/web/src/ee/features/chat/mcp/mcpToolRegistry.ts
packages/web/src/ee/features/chat/promptCaching.test.ts
packages/web/src/ee/features/chat/promptCaching.ts
packages/web/src/ee/features/mcp/askCodebase.ts

Adds a divergence-proof static front checkpoint (cross-chat reuse of tool + static-system bytes) and an MCP activation-resilience breakpoint on top of the existing moving tail marker, behind a provider-aware resolver that is a no-op for non-Anthropic providers. Splits the system prompt into static/dynamic blocks and hardens MCP ordering for byte stability, all gated by new env flags.

The moving tail breakpoint was set once on the last input message before streamText's loop, so a turn's tool calls and outputs accumulated past it and were reprocessed uncached on each later step. Apply it in prepareStep to the last message of every step instead, caching the growing in-turn delta incrementally. prepareStep now runs without MCP too, and stays a no-op for non-Anthropic providers.
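The per-step relocation described above might look roughly like this. It is a simplified sketch, not the PR's `prepareStep` body: `Message` and `applyTailMarker` are hypothetical, and it assumes the system block is passed separately and that messages carry no provider options other than the tail marker.

```typescript
type ProviderOptions = Record<string, Record<string, unknown>>;

// Hypothetical, simplified message shape.
interface Message {
    role: 'user' | 'assistant' | 'tool';
    content: string;
    providerOptions?: ProviderOptions;
}

// Returns a copy of `messages` in which only the last message carries the marker.
// With an undefined marker (no-op strategy) the input is returned unchanged.
function applyTailMarker(
    messages: Message[],
    tailMarker: ProviderOptions | undefined,
): Message[] {
    if (!tailMarker || messages.length === 0) return messages;
    const lastIndex = messages.length - 1;
    return messages.map((message, index): Message => {
        // Drop any marker left over from a previous step (simplification: this
        // discards all provider options on the message, per the assumption above).
        const { providerOptions: _previous, ...rest } = message;
        return index === lastIndex ? { ...rest, providerOptions: tailMarker } : rest;
    });
}
```

Calling this once per step means the breakpoint follows the newest tool call or tool output, so each later step can read the previous step's prefix from cache instead of reprocessing it.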

…ignature cacheBreakSnapshots was keyed by chatId and never evicted, so with cache-break detection enabled it grew with the cumulative number of distinct chats served. Add a FIFO cap that drops the oldest entry on overflow, and replace the hand-rolled djb2 signature hash with a sha256 slice matching getOAuthScopeHash (observability-only and compared in-process, so determinism is all it needs).
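A FIFO-capped snapshot map with a sha256-slice signature could be sketched as below. The cap of 3, the 16-character slice, and the names `MAX_SNAPSHOTS`, `signature`, and `rememberSnapshot` are illustrative assumptions; the commit message does not state the real values or helper names.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical cap, kept tiny for illustration.
const MAX_SNAPSHOTS = 3;

// A Map iterates in insertion order, so its first key is always the oldest entry.
const cacheBreakSnapshots = new Map<string, string>();

// Short sha256-based signature. It is only compared in-process, so it needs
// to be deterministic, not collision-proof.
function signature(input: string): string {
    return createHash('sha256').update(input).digest('hex').slice(0, 16);
}

// Stores the latest signature for a chat, evicting the oldest chat on overflow.
function rememberSnapshot(chatId: string, promptBytes: string): void {
    if (!cacheBreakSnapshots.has(chatId) && cacheBreakSnapshots.size >= MAX_SNAPSHOTS) {
        const oldest = cacheBreakSnapshots.keys().next().value;
        if (oldest !== undefined) cacheBreakSnapshots.delete(oldest);
    }
    cacheBreakSnapshots.set(chatId, signature(promptBytes));
}
```

Updating an existing key with `Map.set` keeps its original insertion position, so this is FIFO eviction rather than LRU, which matches the "drops the oldest entry" wording.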

Marker 1 only saved re-writing the built-in tool schemas on mid-turn MCP activation steps, and only when those schemas cleared the model's minimum cacheable size. The static-system checkpoint and moving tail carry the value, so this collapses the scheme to two breakpoints and removes the activeTools insertion-order reasoning it required.

Remove stale references to the dropped tools-block breakpoint and tighten verbose prompt-caching comments. Comments only, no code changes.

brendan-kellam · 2026-06-24T01:52:06Z


        SOURCEBOT_CHAT_MAX_STEP_COUNT: numberSchema.default(100),
        SOURCEBOT_CHAT_PROMPT_CACHING_ENABLED: booleanSchema.default('true'),
+        // Phased-rollout lever for the static checkpoint. Set to 'false' to fall


nit: can we use `/** */` comments so that we get inline JSDoc rendering when hovering over these symbols in the IDE?

brendan-kellam · 2026-06-24T02:12:15Z

+        // Phased-rollout lever for the static checkpoint. Set to 'false' to fall
+        // back to the single moving tail marker. Only takes effect when prompt
+        // caching is enabled.
+        SOURCEBOT_CHAT_PROMPT_CACHE_STATIC_PREFIX_ENABLED: booleanSchema.default('true'),


Is there an advantage to making this configurable?

Mainly a safeguard: if some issue keeps the static portion from caching properly, you can disable it and stop paying the extra cost for no benefit. But it's maybe overly defensive. Leaning towards removing it and keeping the env vars to a minimal required set.

brendan-kellam · 2026-06-24T02:12:16Z

+
+const logger = createLogger('prompt-caching');
+
+export type CacheTtl = '5m' | '1h';


Can the TTL only be 5m or 1h?

Yes, at least for Anthropic.
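Since only two TTL values are accepted, a raw configuration string can be narrowed to `CacheTtl` before it reaches the provider. The following is a hedged sketch: `isCacheTtl` and `parseCacheTtl` are hypothetical helpers, not functions from this PR, and the `'5m'` fallback is an assumption chosen to match Anthropic's default cache lifetime.

```typescript
type CacheTtl = '5m' | '1h';

const CACHE_TTLS: readonly CacheTtl[] = ['5m', '1h'];

// Type guard that narrows an arbitrary string to the two supported TTLs.
function isCacheTtl(value: string): value is CacheTtl {
    return (CACHE_TTLS as readonly string[]).includes(value);
}

// Hypothetical helper: falls back to a default instead of forwarding an
// unsupported TTL to the provider.
function parseCacheTtl(raw: string | undefined, fallback: CacheTtl = '5m'): CacheTtl {
    return raw !== undefined && isCacheTtl(raw) ? raw : fallback;
}
```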

Remove the SOURCEBOT_CHAT_PROMPT_CACHE_STATIC_PREFIX_ENABLED lever so the static checkpoint is always emitted, and switch the remaining cache env vars to JSDoc comments so their descriptions render on IDE hover.

coderabbitai Bot reviewed Jun 23, 2026

Comment thread CHANGELOG.md Outdated

jsourcebot added 7 commits June 23, 2026 18:20

docs: update changelog for prompt caching

f024e8c

clean up comments

b8c8b23

trim comments further

8610781

jsourcebot force-pushed the jminnetian/improve-token-caching branch from 7de1198 to 8610781 Compare June 24, 2026 01:24

brendan-kellam reviewed Jun 24, 2026

brendan-kellam previously approved these changes Jun 24, 2026

refactor(web): always enable static-prefix prompt caching

cd7ae3e

jsourcebot dismissed brendan-kellam’s stale review via cd7ae3e June 24, 2026 03:29

jsourcebot merged commit 5e1b8ee into main Jun 24, 2026
9 checks passed

github-actions Bot mentioned this pull request Jun 24, 2026

Sourcebot Roadmap 🚀 #459

Open

jsourcebot merged 8 commits into main from jminnetian/improve-token-caching

jsourcebot commented Jun 23, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 23, 2026 •

edited

Loading

Reviews paused

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

brendan-kellam Jun 24, 2026

Uh oh!

brendan-kellam Jun 24, 2026

Uh oh!

jsourcebot Jun 24, 2026

Uh oh!

brendan-kellam Jun 24, 2026

Uh oh!

jsourcebot Jun 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants


		const logger = createLogger('prompt-caching');

		export type CacheTtl = '5m' \| '1h';

Uh oh!

Conversation

jsourcebot commented Jun 23, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

brendan-kellam Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

brendan-kellam Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

jsourcebot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

brendan-kellam Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

jsourcebot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jsourcebot commented Jun 23, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 23, 2026 •

edited

Loading