[core] Support snapshot sequence for write-only primary-key tables #8256
Open
Aitozi wants to merge 1 commit into
2d89e5b to 4d57ea7
Purpose
Primary-key table writers currently need to scan existing metadata to initialize the max sequence number. For write-only workloads this can be heavier than necessary because the writer only needs a safe starting sequence number.
What changed
This PR adds a `sequence.generation.mode` option with two modes:
- `scan`: keep the existing behavior and scan restored files for the max sequence number.
- `snapshot`: persist the max sequence number in snapshot properties and use it to initialize later write-only writers.

For write-only primary-key tables in `snapshot` mode, the writer can skip loading previous files once the latest snapshot carries the max sequence property. If the latest snapshot does not have the property yet, the writer scans once to bootstrap the snapshot property safely.
Tests
- `git diff --check`
- `mvn -s ~/.m2/apache-community.xml -o package -Pgenerate-docs -pl paimon-docs -nsu -DskipTests -am`
- `mvn -s ~/.m2/apache-community.xml -o -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=KeyValueFileStoreWriteTest test`
- `mvn -s ~/.m2/apache-community.xml -o -pl paimon-flink/paimon-flink-common -am -Pfast-build -DfailIfNoTests=false -Dtest=PrimaryKeyFileStoreTableITCase#testWriteOnlySnapshotSequenceOverwritePreviousValue test`
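
For reviewers, the writer initialization described under "What changed" can be sketched roughly as follows. This is a standalone illustration, not the code in this PR: the class `SequenceInitSketch`, the method names, the property key `max-sequence-number`, and the `-1` default for an empty table are all assumptions made for the example. Snapshot properties are modelled as a plain `Map`, and the restored files as a lazily evaluated list of per-file max sequence numbers, so the fast path visibly avoids loading them.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

final class SequenceInitSketch {

    // Hypothetical property key; the real key is not given in this description.
    static final String MAX_SEQUENCE_PROPERTY = "max-sequence-number";

    // Mirrors the two values of sequence.generation.mode.
    enum Mode { SCAN, SNAPSHOT }

    /**
     * Picks the max sequence number a new writer should start from.
     * restoredFiles is only invoked when a scan is actually needed.
     */
    static long initialMaxSequence(
            Mode mode,
            boolean writeOnly,
            Map<String, String> latestSnapshotProperties,
            Supplier<List<Long>> restoredFiles) {
        if (mode == Mode.SNAPSHOT && writeOnly) {
            String stored = latestSnapshotProperties.get(MAX_SEQUENCE_PROPERTY);
            if (stored != null) {
                // Fast path: the latest snapshot already carries the value.
                return Long.parseLong(stored);
            }
            // Property missing: fall through and scan once to bootstrap it.
        }
        return scanMaxSequence(restoredFiles.get());
    }

    // Existing behaviour: take the max over all restored files.
    // Returning -1 for an empty table is an assumption of this sketch.
    static long scanMaxSequence(List<Long> fileMaxSequences) {
        long max = -1L;
        for (long seq : fileMaxSequences) {
            max = Math.max(max, seq);
        }
        return max;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        Supplier<List<Long>> files = () -> Arrays.asList(3L, 7L, 5L);

        // First write-only writer: no property yet, so it scans.
        long bootstrapped = initialMaxSequence(Mode.SNAPSHOT, true, props, files);
        // The commit would then persist the value in the snapshot properties.
        props.put(MAX_SEQUENCE_PROPERTY, Long.toString(bootstrapped));

        // Later writers read the property and skip the scan.
        System.out.println(initialMaxSequence(Mode.SNAPSHOT, true, props, files));
    }
}
```

In this sketch a table that is not write-only, or one left in `scan` mode, always takes the scanning path, which matches the description that `scan` keeps the existing behavior.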