Fix Base58 dropping leading zero bytes #44
Merged
The generic base_encode/base_decode convert the whole input to a single integer, so leading null bytes (high-order zeros) were silently lost: e.g. Base58 encoded b'\x00abc' to 'ZiCa' instead of '1ZiCa', and b'\x00' to an empty string. Per the Base58 spec each leading 0x00 byte maps to a leading charset[0] character. Preserve the leading-zero count on encode and restore it on decode, so values round-trip and match reference implementations.
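To make the failure mode concrete, here is a minimal standalone sketch of a big-integer Base58 encoder. It is not codext's `base_encode`: the name `naive_b58encode` and the hard-coded Bitcoin alphabet are assumptions used only for illustration.

```python
# Hypothetical stand-in for a generic big-integer encoder (not codext's code).
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"  # Bitcoin alphabet


def naive_b58encode(data: bytes) -> str:
    """Encode by treating the whole input as one big-endian integer."""
    n = int.from_bytes(data, "big")  # leading 0x00 bytes do not change n
    out = []
    while n > 0:
        n, r = divmod(n, 58)
        out.append(B58[r])
    return "".join(reversed(out))


# b"abc" and b"\x00abc" are the same integer, so they encode identically.
print(naive_b58encode(b"abc"))         # ZiCa
print(naive_b58encode(b"\x00abc"))     # ZiCa (the leading zero byte is lost)
print(repr(naive_b58encode(b"\x00")))  # '' (the integer is 0, so no digits are emitted)
```

Because the integer value carries no record of how many zero bytes preceded it, the count has to be tracked separately from the big-integer conversion.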
dhondta approved these changes on Jul 5, 2026
Pull request overview
Fixes loss of leading 0x00 bytes in Base58 (and other codecs using the generic big-integer base_encode/base_decode) by preserving the leading-zero count during encoding and restoring it during decoding, enabling correct round-trips for values with leading null bytes.
Changes:
- Update `base_encode` to prepend `charset[0]` once per leading `\x00` in non-integer inputs.
- Update `base_decode` to prepend `\x00` once per leading `charset[0]` in the encoded input.
- Extend Base58 tests to cover leading-null encode/decode behavior for both `str` and `bytes` call paths.
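The two codec changes listed above can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the code in `src/codext/base/_base.py`: the function names `b58encode`/`b58decode`, their signatures, and the default Bitcoin alphabet are hypothetical.

```python
# Standalone sketch of leading-zero preservation; all names are hypothetical.
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"  # Bitcoin alphabet


def b58encode(data: bytes, charset: str = B58) -> str:
    zeros = len(data) - len(data.lstrip(b"\x00"))  # count leading 0x00 bytes
    n = int.from_bytes(data, "big")
    digits = []
    while n > 0:
        n, r = divmod(n, len(charset))
        digits.append(charset[r])
    # One charset[0] per leading zero byte, then the big-integer digits.
    return charset[0] * zeros + "".join(reversed(digits))


def b58decode(text: str, charset: str = B58) -> bytes:
    zeros = len(text) - len(text.lstrip(charset[0]))  # count leading charset[0]
    n = 0
    for ch in text:
        n = n * len(charset) + charset.index(ch)  # ValueError on unknown characters
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # One 0x00 per leading charset[0], then the integer's bytes.
    return b"\x00" * zeros + body


print(b58encode(b"\x00abc"))  # 1ZiCa
print(b58encode(b"\x00"))     # 1
print(b58decode("1ZiCa"))     # b'\x00abc'
```

Counting the prefix separately keeps the integer conversion itself unchanged, which is consistent with the PR's note that the internal integer recode path is left untouched.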
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `src/codext/base/_base.py` | Preserves/restores leading-zero markers in generic base encode/decode, fixing Base58 leading-null handling. |
| `tests/test_base.py` | Adds Base58 assertions for leading-null byte encode/decode and round-trip behavior (str and bytes paths). |
Problem
Base58 (and the other big-integer base codecs) silently drop leading null bytes: `base_encode`/`base_decode` in `src/codext/base/_base.py` convert the whole input to a single integer (`s2i`) and back via `divmod`, so leading `0x00` bytes (high-order zeros) vanish. Per the Base58 specification the codec cites (and every reference implementation, e.g. the `base58` PyPI library / Bitcoin Core), each leading `0x00` byte must map to a leading `charset[0]` character (`'1'` for the bitcoin alphabet). This also broke round-tripping for any value beginning with a null byte.
Fix
Preserve the leading-zero count on encode (prepend one `charset[0]` per leading `\x00`) and restore it on decode (prepend one `\x00` per leading `charset[0]`). Both changes are guarded to the byte-input path so the integer recode used internally is untouched.
Verified against the `base58` reference library: 0 mismatches and 0 round-trip failures across random inputs (every leading-zero input failed before).
Test
Extended `test_codec_base58` in `tests/test_base.py` with leading-null-byte encode/decode/round-trip assertions (str and bytes paths). Verified red→green: the test fails without the source change (`AssertionError`) and passes with it; the full test suite stays green (103 passed).
Disclosure: I use AI assistance (under my direction) for my contributions; I review and verify every change before submitting.
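A randomized round-trip check of the kind described above could look roughly like this. The harness name `count_roundtrip_failures`, its parameters, and the hex-based stand-in codecs are all assumptions so that the sketch runs on its own; it is not the script used for this PR, and a real check would pass the Base58 encode/decode pair and also compare against a reference implementation.

```python
import random


def count_roundtrip_failures(encode, decode, trials=1000, seed=0):
    """Count inputs for which decode(encode(x)) != x, biased toward leading zeros."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    failures = 0
    for _ in range(trials):
        zeros = rng.randint(0, 4)  # 0 to 4 leading 0x00 bytes
        tail = bytes(rng.randrange(256) for _ in range(rng.randint(0, 16)))
        data = b"\x00" * zeros + tail
        if decode(encode(data)) != data:
            failures += 1
    return failures


# Stand-in codecs (hypothetical): hex round-trips exactly, while the lossy
# variant strips leading zero bytes first, mimicking the pre-fix behavior.
lossless = (bytes.hex, bytes.fromhex)
lossy = (lambda b: b.lstrip(b"\x00").hex(), bytes.fromhex)

print(count_roundtrip_failures(*lossless))   # 0
print(count_roundtrip_failures(*lossy) > 0)  # True
```

Deliberately prepending zero bytes matters here: uniformly random input begins with `0x00` only about once in 256 cases, so an unbiased generator would exercise the bug rarely.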