Improve token caching.#1366
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a ChangesEE Ask Sourcebot Prompt Caching
Sequence Diagram(s)sequenceDiagram
participant Client
participant ChatRoute as chat/route.ts
participant Agent as createAgentStream
participant Prompt as createPrompt
participant CacheStrategy as PromptCacheStrategy
participant StreamText as streamText (AI SDK)
Client->>ChatRoute: POST /api/ee/chat
ChatRoute->>CacheStrategy: getPromptCacheStrategy(provider, enabled)
CacheStrategy-->>ChatRoute: strategy (Anthropic or no-op)
ChatRoute->>Agent: createMessageStream({ promptCacheStrategy })
Agent->>Prompt: createPrompt(sortedRepos)
Prompt-->>Agent: { staticPrompt, dynamicPrompt }
Agent->>CacheStrategy: strategy.cacheControl({ ttl: staticTtl })
CacheStrategy-->>Agent: staticMarker (providerOptions)
Agent->>StreamText: systemMessages[0] with staticMarker + dynamicPrompt
loop each agent step
StreamText->>Agent: prepareStep(stepMessages)
Agent->>Agent: move tailMarker onto last message
Agent-->>StreamText: messages with tailMarker applied
StreamText-->>Agent: stepResult { cacheReadTokens }
Agent->>Agent: detectUnexpectedCacheMiss(stepIndex, cacheReadTokens)
end
StreamText-->>Client: streamed response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CHANGELOG.md`:
- Line 11: In the CHANGELOG.md file, locate the line containing the Ask
Sourcebot prompt caching improvement entry under the [Unreleased] section.
Replace both instances of the placeholder `<id>` in the markdown link reference
`[#<id>](https://github.com/sourcebot-dev/sourcebot/pull/<id>)` with the actual
pull request number for this change. The same numeric PR id should be used in
both the link text and the URL to create a valid GitHub PR link.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 70bb4df8-9bdb-46d6-994c-b4004e501537
📒 Files selected for processing (10)
CHANGELOG.mdpackages/shared/src/env.server.tspackages/web/src/app/api/(server)/ee/chat/route.tspackages/web/src/ee/features/chat/agent.test.tspackages/web/src/ee/features/chat/agent.tspackages/web/src/ee/features/chat/mcp/mcpClientFactory.tspackages/web/src/ee/features/chat/mcp/mcpToolRegistry.tspackages/web/src/ee/features/chat/promptCaching.test.tspackages/web/src/ee/features/chat/promptCaching.tspackages/web/src/ee/features/mcp/askCodebase.ts
Adds a divergence-proof static front checkpoint (cross-chat reuse of tool + static-system bytes) and an MCP activation-resilience breakpoint on top of the existing moving tail marker, behind a provider-aware resolver that is a no-op for non-Anthropic providers. Splits the system prompt into static/dynamic blocks and hardens MCP ordering for byte stability, all gated by new env flags.
The moving tail breakpoint was set once on the last input message before streamText's loop, so a turn's tool calls and outputs accumulated past it and were reprocessed uncached on each later step. Apply it in prepareStep to the last message of every step instead, caching the growing in-turn delta incrementally. prepareStep now runs without MCP too, and stays a no-op for non-Anthropic providers.
…ignature cacheBreakSnapshots was keyed by chatId and never evicted, so with cache-break detection enabled it grew with the cumulative number of distinct chats served. Add a FIFO cap that drops the oldest entry on overflow, and replace the hand-rolled djb2 signature hash with a sha256 slice matching getOAuthScopeHash (observability-only and compared in-process, so determinism is all it needs).
Marker 1 only saved re-writing the built-in tool schemas on mid-turn MCP activation steps, and only when those schemas cleared the model's minimum cacheable size. The static-system checkpoint and moving tail carry the value, so this collapses the scheme to two breakpoints and removes the activeTools insertion-order reasoning it required.
Remove stale references to the dropped tools-block breakpoint and tighten verbose prompt-caching comments. Comments only, no code changes.
7de1198 to
8610781
Compare
|
|
||
| SOURCEBOT_CHAT_MAX_STEP_COUNT: numberSchema.default(100), | ||
| SOURCEBOT_CHAT_PROMPT_CACHING_ENABLED: booleanSchema.default('true'), | ||
| // Phased-rollout lever for the static checkpoint. Set to 'false' to fall |
There was a problem hiding this comment.
nit: can we use /** **/ comments s.t., we get inline JSDoc rendering when hovering over these symbols in the ide?
| // Phased-rollout lever for the static checkpoint. Set to 'false' to fall | ||
| // back to the single moving tail marker. Only takes effect when prompt | ||
| // caching is enabled. | ||
| SOURCEBOT_CHAT_PROMPT_CACHE_STATIC_PREFIX_ENABLED: booleanSchema.default('true'), |
There was a problem hiding this comment.
Is there advantage of making this configurable?
There was a problem hiding this comment.
Mainly a safe guard in case some issue makes the static portion not cache properly, you can disable it so you stop paying the extra cost for no benefit. But it's maybe overly defensive, leaning towards removing it and keeping the env var to a minimal required set.
|
|
||
| const logger = createLogger('prompt-caching'); | ||
|
|
||
| export type CacheTtl = '5m' | '1h'; |
There was a problem hiding this comment.
can the TTL only be 5m or 1h?
There was a problem hiding this comment.
Yes, at least for anthropic
Remove the SOURCEBOT_CHAT_PROMPT_CACHE_STATIC_PREFIX_ENABLED lever so the static checkpoint is always emitted, and switch the remaining cache env vars to JSDoc comments so their descriptions render on IDE hover.
Improved Ask Sourcebot prompt caching by splitting static and dynamic prompt sections and advancing cache breakpoints after every agent step instead of only after each message
Summary by CodeRabbit