Fix Base58 dropping leading zero bytes #44

Merged: dhondta merged 1 commit into dhondta:main from gaoflow:fix-base58-leading-zeros on Jul 5, 2026

@gaoflow (Contributor) commented Jun 15, 2026

Problem

Base58 (and the other big-integer base codecs) silently drop leading null bytes:

import codext
codext.encode(b"\x00abc", "base58")  # 'ZiCa'  -> should be '1ZiCa'
codext.encode(b"\x00",    "base58")  # ''      -> should be '1'

base_encode/base_decode in src/codext/base/_base.py convert the whole input to a single integer (s2i) and back via divmod, so leading 0x00 bytes (high-order zeros) vanish. Per the Base58 specification the codec cites (and every reference implementation, e.g. the base58 PyPI library / Bitcoin Core), each leading 0x00 byte must map to a leading charset[0] character ('1' for the bitcoin alphabet). This also broke round-tripping for any value beginning with a null byte.
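The loss described above can be reproduced without codext. The following is a minimal sketch, assuming the standard Bitcoin alphabet; `naive_b58encode` is a hypothetical helper written for illustration and is not the code in `_base.py`. It shows that the integer conversion itself discards the leading zero byte before any base conversion happens.

```python
# Bitcoin Base58 alphabet (no 0, O, I, or l).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def naive_b58encode(data: bytes) -> str:
    """Integer-only Base58 encoding (hypothetical helper, not codext's code)."""
    n = int.from_bytes(data, "big")  # leading 0x00 bytes add nothing to n
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    return out


# Both inputs collapse to the same integer, so they encode identically.
print(int.from_bytes(b"\x00abc", "big") == int.from_bytes(b"abc", "big"))  # True
print(naive_b58encode(b"\x00abc"))      # ZiCa
print(repr(naive_b58encode(b"\x00")))   # ''
```

Because the zero prefix is already gone once the bytes become an integer, it has to be counted separately from the integer arithmetic.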

Fix

Preserve the leading-zero count on encode (prepend one charset[0] per leading \x00) and restore it on decode (prepend one \x00 per leading charset[0]). Both changes are guarded to the byte-input path so the integer recode used internally is untouched.

codext.encode(b"\x00abc", "base58")        # '1ZiCa'
codext.decode("1ZiCa", "base58")           # b'\x00abc'
codext.encode(b"\x01\x00", "base58")       # '5R'  (internal/trailing zeros unaffected)
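For readers who want to see the approach in isolation, here is a standalone sketch of the same idea: count the leading zero bytes on encode and the leading `charset[0]` characters on decode. The function names `b58encode` and `b58decode` and the hard-coded alphabet are assumptions for illustration; the actual change lives in the generic `base_encode`/`base_decode` and may be structured differently.

```python
# Bitcoin Base58 alphabet; ALPHABET[0] == "1" marks a leading zero byte.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Illustrative encoder: one ALPHABET[0] per leading 0x00 byte."""
    zeros = len(data) - len(data.lstrip(b"\x00"))
    n = int.from_bytes(data, "big")
    digits = ""
    while n > 0:
        n, rem = divmod(n, 58)
        digits = ALPHABET[rem] + digits
    return ALPHABET[0] * zeros + digits


def b58decode(text: str) -> bytes:
    """Illustrative decoder: one 0x00 byte per leading ALPHABET[0]."""
    zeros = len(text) - len(text.lstrip(ALPHABET[0]))
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)  # raises ValueError on invalid chars
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")  # b"" when n == 0
    return b"\x00" * zeros + body


print(b58encode(b"\x00abc"))   # 1ZiCa
print(b58decode("1ZiCa"))      # b'\x00abc'
print(b58encode(b"\x01\x00"))  # 5R
```

Only the prefix is treated specially; zero bytes after the first non-zero byte are still carried by the integer, which is why `b"\x01\x00"` is unaffected.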

Verified against the base58 reference library: 0 mismatches and 0 round-trip failures across random inputs (every leading-zero input failed before).
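A randomized round-trip check of this kind might look like the sketch below. It is not the script used for the PR and does not call codext or the base58 library; instead it uses a small in-file codec with a hypothetical `keep_zeros` switch so the behavior before and after the fix can be compared on the same inputs.

```python
import random

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def encode(data: bytes, keep_zeros: bool) -> str:
    """Stand-in encoder; keep_zeros=False mimics the pre-fix behavior."""
    n = int.from_bytes(data, "big")
    digits = ""
    while n > 0:
        n, rem = divmod(n, 58)
        digits = ALPHABET[rem] + digits
    zeros = len(data) - len(data.lstrip(b"\x00")) if keep_zeros else 0
    return ALPHABET[0] * zeros + digits


def decode(text: str, keep_zeros: bool) -> bytes:
    """Stand-in decoder matching encode()."""
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    zeros = len(text) - len(text.lstrip(ALPHABET[0])) if keep_zeros else 0
    return b"\x00" * zeros + body


def count_roundtrip_failures(keep_zeros: bool, trials: int = 1000, seed: int = 0) -> int:
    """Count inputs with a leading 0x00 that do not survive encode + decode."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        # Every sample starts with 1 to 3 zero bytes to exercise the bug path.
        tail = bytes(rng.getrandbits(8) for _ in range(rng.randint(0, 16)))
        data = b"\x00" * rng.randint(1, 3) + tail
        if decode(encode(data, keep_zeros), keep_zeros) != data:
            failures += 1
    return failures


print(count_roundtrip_failures(keep_zeros=False))  # every sample fails
print(count_roundtrip_failures(keep_zeros=True))   # 0
```

Swapping the stand-in codec for the library call under test, and comparing against a reference encoder, would turn this into a check closer to the one described above.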

Test

Extended test_codec_base58 in tests/test_base.py with leading-null-byte encode/decode/round-trip assertions (str and bytes paths). Verified red→green: the test fails without the source change (AssertionError) and passes with it; the full test suite stays green (103 passed).


Disclosure: I use AI assistance (under my direction) for my contributions; I review and verify every change before submitting.

The generic base_encode/base_decode convert the whole input to a single
integer, so leading null bytes (high-order zeros) were silently lost: e.g.
Base58 encoded b'\x00abc' to 'ZiCa' instead of '1ZiCa', and b'\x00' to an
empty string. Per the Base58 spec each leading 0x00 byte maps to a leading
charset[0] character. Preserve the leading-zero count on encode and restore
it on decode, so values round-trip and match reference implementations.
@dhondta dhondta requested a review from Copilot July 5, 2026 16:16
@dhondta dhondta self-assigned this Jul 5, 2026
@dhondta dhondta added the failure Something does not work as expected label Jul 5, 2026

mergify Bot commented Jul 5, 2026


Tick the box to add this pull request to the merge queue (same as @mergifyio queue).

  • Queue this pull request

Copilot AI left a comment


Pull request overview

Fixes loss of leading 0x00 bytes in Base58 (and other codecs using the generic big-integer base_encode/base_decode) by preserving the leading-zero count during encoding and restoring it during decoding, enabling correct round-trips for values with leading null bytes.

Changes:

  • Update base_encode to prepend charset[0] once per leading \x00 in non-integer inputs.
  • Update base_decode to prepend \x00 once per leading charset[0] in the encoded input.
  • Extend Base58 tests to cover leading-null encode/decode behavior for both str and bytes call paths.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/codext/base/_base.py | Preserves/restores leading-zero markers in generic base encode/decode, fixing Base58 leading-null handling. |
| tests/test_base.py | Adds Base58 assertions for leading-null byte encode/decode and round-trip behavior (str and bytes paths). |

@dhondta dhondta merged commit 4a087a1 into dhondta:main Jul 5, 2026
3 checks passed