Update tiktoken-rs requirement from 0.11 to 0.12 #119

Merged: gorzell merged 1 commit into main from dependabot/cargo/tiktoken-rs-0.12 on Jun 4, 2026

@dependabot dependabot Bot commented on behalf of github Jun 3, 2026

Updates the requirements on tiktoken-rs to permit the latest version.

Release notes

Sourced from tiktoken-rs's releases.

v0.12.0

Summary

This release backports OpenAI tiktoken 0.13.0 into tiktoken-rs. The main reason to upgrade is better alignment with upstream tokenization behavior, especially the upstream Rust core changes for large BPE pieces and error-aware encoding.

For most users who call the high-level model/token counting helpers, this should behave the same aside from the new Rust compiler requirement. Users who call lower-level CoreBPE encoding methods directly should review the breaking changes below.

What Changed

  • Backported the vendored OpenAI tiktoken Rust core from 0.9.0 to 0.13.0.
  • Added the upstream large-piece BPE merge path. Functionally, this improves behavior for very large or repetitive inputs that previously stressed the merge algorithm.
  • Changed CoreBPE::encode to return Result<(Vec<Rank>, usize), EncodeError>, matching upstream. Regex/tokenization failures can now be reported instead of being hidden behind infallible APIs.
  • Updated encode_as and count to return Result because they call encode.
  • Re-exported EncodeError so callers can handle encode failures directly.
  • Aligned the vendored core with Rust 2024 and raised the crate MSRV to Rust 1.85.
  • Synced model-to-tokenizer mappings with upstream tiktoken 0.13.0 while keeping local extra prefixes isolated.
  • Hardened asset downloads with SHA-256 checks and a repo-root-aware asset path.
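
The knock-on effect described in the `encode`, `encode_as`, and `count` bullets can be sketched with stand-in types. Nothing below is the tiktoken-rs implementation: `ToyBpe`, its whitespace "tokenizer", the `ToyEncodeError` type, and the NUL-byte failure condition are all hypothetical, chosen only so the example compiles without the crate. It only illustrates why a helper that calls a fallible `encode` ends up returning `Result` as well.

```rust
use std::fmt;

/// Stand-in for the crate's `Rank` type (assumed here to be a `u32` alias).
type Rank = u32;

/// Hypothetical error type playing the role of `EncodeError`.
#[derive(Debug, PartialEq)]
struct ToyEncodeError {
    message: String,
}

impl fmt::Display for ToyEncodeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "encode failed: {}", self.message)
    }
}

impl std::error::Error for ToyEncodeError {}

/// Hypothetical tokenizer; not related to the real `CoreBPE` internals.
struct ToyBpe;

impl ToyBpe {
    /// Toy fallible encode: one "token" per whitespace-separated word, using
    /// the word's byte length as its rank. It fails on a NUL byte purely so
    /// that there is an error path to demonstrate.
    fn encode(&self, text: &str) -> Result<(Vec<Rank>, usize), ToyEncodeError> {
        if text.contains('\0') {
            return Err(ToyEncodeError {
                message: "input contains NUL".to_string(),
            });
        }
        let tokens: Vec<Rank> = text
            .split_whitespace()
            .map(|word| word.len() as Rank)
            .collect();
        // The second tuple element only mirrors the shape of the return type
        // quoted in the release notes; here it is 1 if any token exists.
        let last_piece_token_len = usize::from(!tokens.is_empty());
        Ok((tokens, last_piece_token_len))
    }

    /// Because `count` calls `encode`, it either has to swallow the error
    /// or surface it. Surfacing it makes `count` fallible too.
    fn count(&self, text: &str) -> Result<usize, ToyEncodeError> {
        let (tokens, _) = self.encode(text)?;
        Ok(tokens.len())
    }
}
```

With this shape, `ToyBpe.count("hello big world")` yields `Ok(3)`, while an input containing a NUL byte yields an `Err` that the caller can inspect instead of silently receiving a wrong count.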

Breaking Changes

If your code calls CoreBPE::encode, unwrap or propagate the result before using the tokens:

let allowed = bpe.special_tokens();
let (tokens, last_piece_token_len) = bpe.encode("hello <|endoftext|>", &allowed)?;

The generic helpers changed similarly:

let (tokens, last_piece_token_len) = bpe.encode_as::<usize>(text, &allowed)?;
let token_count = bpe.count(text, &allowed)?;

encode_ordinary, encode_ordinary_as, encode_with_special_tokens, and count_ordinary remain infallible.
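
For the fallible methods, how a call site absorbs the change depends on whether the caller can return an error itself. The sketch below shows two common shapes under that assumption. `count_tokens` is a hypothetical stand-in for a now-fallible call such as `bpe.count(...)`, not a tiktoken-rs function, and `fits_in_budget` and `count_or_zero` are illustrative names.

```rust
use std::error::Error;

/// Hypothetical stand-in for a now-fallible call such as `bpe.count(...)`.
/// It counts whitespace-separated words and rejects empty input so that
/// there is an error path to exercise.
fn count_tokens(text: &str) -> Result<usize, Box<dyn Error>> {
    if text.is_empty() {
        return Err("empty input".into());
    }
    Ok(text.split_whitespace().count())
}

/// Option 1: the caller already returns `Result`, so `?` propagates the
/// error and the rest of the function body stays unchanged.
fn fits_in_budget(text: &str, budget: usize) -> Result<bool, Box<dyn Error>> {
    let used = count_tokens(text)?;
    Ok(used <= budget)
}

/// Option 2: the caller must keep an infallible signature, so the error is
/// handled at the boundary. Falling back to 0 is an arbitrary choice made
/// for this sketch; a real caller should pick a policy that suits it.
fn count_or_zero(text: &str) -> usize {
    match count_tokens(text) {
        Ok(count) => count,
        Err(err) => {
            eprintln!("token count failed: {err}");
            0
        }
    }
}
```

Option 1 is usually the smaller diff when the surrounding function is already fallible. Option 2 keeps downstream signatures stable at the cost of deciding what an encode failure should mean for the caller.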

Projects must now build with Rust 1.85 or newer.

Practical Impact

  • Applications processing long repeated text should see more robust tokenization behavior.
  • Code that only uses helpers like get_chat_completion_max_tokens, get_text_completion_max_tokens, bpe_for_model, or singleton tokenizer constructors should not need call-site changes.
  • Code using low-level CoreBPE::encode, encode_as, or count needs a small migration to handle Result.

Links

Commits

@dependabot dependabot Bot added the labels dependencies (Pull requests that update a dependency file) and rust (Pull requests that update Rust code) Jun 3, 2026
@dependabot dependabot Bot requested a review from a team as a code owner June 3, 2026 22:19
@gorzell gorzell enabled auto-merge June 4, 2026 05:38
@gorzell gorzell disabled auto-merge June 4, 2026 05:41

gorzell commented Jun 4, 2026

@dependabot rebase

Updates the requirements on [tiktoken-rs](https://github.com/zurawiki/tiktoken-rs) to permit the latest version.
- [Release notes](https://github.com/zurawiki/tiktoken-rs/releases)
- [Commits](zurawiki/tiktoken-rs@v0.11.0...v0.12.0)

---
updated-dependencies:
- dependency-name: tiktoken-rs
  dependency-version: 0.12.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot Bot force-pushed the dependabot/cargo/tiktoken-rs-0.12 branch from 50ffdc8 to e21a2fa Compare June 4, 2026 05:43
@gorzell gorzell merged commit a006ef5 into main Jun 4, 2026
4 checks passed
@gorzell gorzell deleted the dependabot/cargo/tiktoken-rs-0.12 branch June 4, 2026 05:52