
[WIP]Support range-based reads for deletion vectors #3478

Open
KaiqiJinWow wants to merge 1 commit into apache:main from KaiqiJinWow:fix-dv-content-range-read

@KaiqiJinWow

Summary

  • Expose Iceberg V3 deletion vector content range fields on DataFile
  • Read Puffin deletion vectors from manifest-described content ranges when content_offset/content_size_in_bytes are present
  • Validate deletion vector blobs for length, magic number, CRC, and cardinality while preserving existing whole-file Puffin reads
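
For reviewers who want a concrete picture of the range read and the first three checks (length, magic number, CRC), here is a minimal sketch. It is not the code in this PR: `read_dv_blob` and `build_dv_blob` are hypothetical helpers, and the blob layout (4-byte big-endian length, magic `D1 D3 39 64`, serialized vector, 4-byte big-endian CRC-32 over magic plus vector) is assumed from my reading of the Puffin `deletion-vector-v1` description. The cardinality check is left out because it requires deserializing the bitmap.

```python
import io
import struct
import zlib

# Assumed magic bytes for a deletion-vector-v1 blob (per the Puffin spec).
DV_MAGIC = bytes([0xD1, 0xD3, 0x39, 0x64])


def build_dv_blob(vector_bytes: bytes) -> bytes:
    """Hypothetical helper: wrap serialized bitmap bytes in the assumed layout."""
    magic_and_vector = DV_MAGIC + vector_bytes
    length = struct.pack(">I", len(magic_and_vector))
    crc = struct.pack(">I", zlib.crc32(magic_and_vector) & 0xFFFFFFFF)
    return length + magic_and_vector + crc


def read_dv_blob(stream, content_offset: int, content_size_in_bytes: int) -> bytes:
    """Hypothetical helper: read one blob by range and return the vector bytes."""
    # Seek straight to the range recorded in the manifest instead of
    # parsing the whole Puffin file and its footer.
    stream.seek(content_offset)
    blob = stream.read(content_size_in_bytes)
    if len(blob) != content_size_in_bytes:
        raise ValueError("short read: content range extends past end of file")
    if len(blob) < 12:  # 4 (length) + 4 (magic) + 4 (CRC)
        raise ValueError("blob too small to be a deletion vector")

    # The length field covers the magic bytes plus the serialized vector.
    (declared_len,) = struct.unpack(">I", blob[:4])
    if declared_len != len(blob) - 8:
        raise ValueError("declared length does not match content size")

    magic_and_vector = blob[4 : 4 + declared_len]
    if magic_and_vector[:4] != DV_MAGIC:
        raise ValueError("bad magic number")

    (expected_crc,) = struct.unpack(">I", blob[-4:])
    if zlib.crc32(magic_and_vector) & 0xFFFFFFFF != expected_crc:
        raise ValueError("CRC mismatch")

    return magic_and_vector[4:]


# Usage: a fake file with unrelated bytes before and after the blob.
blob = build_dv_blob(b"example-bitmap-bytes")
fake_file = io.BytesIO(b"\x00" * 14 + blob + b"trailer")
vector = read_dv_blob(fake_file, content_offset=14, content_size_in_bytes=len(blob))
```

The sketch takes any seekable binary stream, so the same function could in principle sit on top of a local file or an object-store input stream; how the PR wires this into the PyArrow reader is not shown here.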

Testing

  • .venv/bin/python -m pytest -q tests/table/test_puffin.py tests/io/test_pyarrow.py::test_read_deletion_vector_blob_from_content_range tests/io/test_pyarrow.py::test_read_deletes

@KaiqiJinWow changed the title from "Support range-based reads for deletion vectors" to "[WIP]Support range-based reads for deletion vectors" on Jun 11, 2026
@KaiqiJinWow force-pushed the fix-dv-content-range-read branch 2 times, most recently from fdc8d3b to 859efdc, on June 11, 2026 21:37
@KaiqiJinWow force-pushed the fix-dv-content-range-read branch from 859efdc to 118c561 on June 11, 2026 23:07
