Skip to content

Expose error recovery events in C listener API#14809

Open
zaidoon1 wants to merge 1 commit into
facebook:mainfrom
zaidoon1:c-api-eventlistener-recovery
Open

Expose error recovery events in C listener API#14809
zaidoon1 wants to merge 1 commit into
facebook:mainfrom
zaidoon1:c-api-eventlistener-recovery

Conversation

@zaidoon1
Copy link
Copy Markdown
Contributor

@zaidoon1 zaidoon1 commented Jun 1, 2026

Summary

  • expose background error Status severity through rocksdb_status_ptr_get_severity and C severity constants
  • add an additive rocksdb_eventlistener_create_with_error_recovery constructor for OnErrorRecoveryBegin and OnErrorRecoveryEnd callbacks
  • keep the existing rocksdb_eventlistener_create signature compatible by delegating without recovery callbacks
  • add C API smoke coverage for the new listener constructor and severity constants

Testing

  • make -C librocksdb-sys/rocksdb c_test -j4
  • ./c_test

@meta-cla meta-cla Bot added the CLA Signed label Jun 1, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 1, 2026

✅ clang-tidy: No findings on changed lines

Completed in 159.0s.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 1, 2026

Codex Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit 9ba7526


Codex review failed before producing findings.

WARNING: proceeding, even though we could not update PATH: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 1, 2026

Claude Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit 9ba7526


Summary

Clean, additive C API change that exposes error recovery listener events. The implementation is functionally correct and maintains backward compatibility. Two design-level concerns warrant attention before merge.

High-severity findings (2):

  • [include/rocksdb/c.h / db/c.cc] Exposes the DEPRECATED OnErrorRecoveryCompleted callback in a brand-new API surface instead of the non-deprecated OnErrorRecoveryEnd replacement.
  • [db/c_test.c] rocksdb_status_ptr_get_severity is completely untested; auto-recovery propagation logic is untested.
Full review (click to expand)

Findings

🔴 HIGH

H1. Exposes DEPRECATED OnErrorRecoveryCompletedinclude/rocksdb/listener.h:907
  • Issue: The PR wraps OnErrorRecoveryCompleted which is explicitly marked // DEPRECATED in the C++ interface. The non-deprecated replacement OnErrorRecoveryEnd (listener.h:921) provides richer information (BackgroundErrorRecoveryInfo with both old and new error status, including whether recovery succeeded or failed).
  • Root cause: The PR author likely targeted the simpler deprecated API. OnErrorRecoveryEnd has a struct parameter (BackgroundErrorRecoveryInfo) that requires additional C API wrapping.
  • Suggested fix: Either (a) expose OnErrorRecoveryEnd instead of OnErrorRecoveryCompleted, wrapping BackgroundErrorRecoveryInfo as a new opaque C type, or (b) expose both and document that OnErrorRecoveryCompleted is deprecated. Option (a) is preferred since this is a brand-new API addition -- there is no backward compatibility burden.
  • Consensus: Flagged independently by design, correctness, cross-component, and API reviewers (4/4 agreement).
H2. Insufficient test coverage for new functionality — db/c_test.c
  • Issue: The test is a smoke test only (create listener, check constants, destroy). Two pieces of new C-specific logic are untested:
    1. rocksdb_status_ptr_get_severity() is never called in any test.
    2. The auto_recovery bidirectional propagation (C callback sets unsigned char*, which is written back to bool*) is never exercised.
  • Root cause: The test follows the existing C API event listener pattern (creation/destruction only), but this PR introduces C-layer logic that doesn't exist in C++ tests.
  • Suggested fix: Add at minimum: (1) a test that creates a Status with known severity and calls rocksdb_status_ptr_get_severity, (2) a test or assertion validating the auto_recovery roundtrip. Full integration testing of callback invocation can rely on the existing C++ error_handler_fs_test.cc suite.
  • Consensus: Flagged by test reviewer as HIGH; confirmed by correctness reviewer.

🟡 MEDIUM

M1. Inconsistent null-checking pattern — db/c.cc:4148-4178
  • Issue: New callbacks null-check the function pointer (if (on_error_recovery_begin != nullptr)) while all existing callbacks (e.g., OnFlushBegin, OnBackgroundError) call their function pointer unconditionally.
  • Context: The null-checking is actually necessary for the new callbacks because rocksdb_eventlistener_create delegates to rocksdb_eventlistener_create_with_error_recovery with nullptr for the recovery callbacks. Without the null check, calling the old constructor would crash on the first background error recovery. So the null-checking is correct, but it creates an asymmetry where recovery callbacks are optional while all other callbacks are mandatory.
  • Suggested fix: Document this design decision in a code comment. Optionally, consider adding null-checks to existing callbacks in a follow-up for consistency.
M2. Inconsistent rocksdb_status_ptr_t allocation — db/c.cc:4148-4178 vs db/c.cc:4138
  • Issue: New callbacks use stack-allocated rocksdb_status_ptr_t s; while existing OnBackgroundError uses new rocksdb_status_ptr_t. Both are functionally correct (the pointer doesn't escape the synchronous callback scope), but the inconsistency is confusing for maintainers.
  • Root cause: Different C++ signatures -- OnBackgroundError receives Status* (non-owning pointer) while recovery callbacks receive Status by value, making stack allocation the natural choice.
  • Suggested fix: Add a brief comment explaining why the allocation strategy differs, or unify to stack allocation in a follow-up.
M3. Missing parameter names in callback typedefs — include/rocksdb/c.h
  • Issue: The new typedefs lack parameter names:
    typedef void (*on_error_recovery_begin_cb)(void*, uint32_t, rocksdb_status_ptr_t*, unsigned char*);
    This matches the existing pattern (e.g., on_background_error_cb) but harms usability, especially for the unsigned char* parameter whose meaning (auto_recovery flag) is non-obvious.
  • Suggested fix: Add parameter names for documentation:
    typedef void (*on_error_recovery_begin_cb)(void* state, uint32_t reason, rocksdb_status_ptr_t* bg_error, unsigned char* auto_recovery);

🟢 LOW / NIT

L1. Severity enum alignment risk — include/rocksdb/c.h
  • Issue: The C enum values (0-4) must stay synchronized with Status::Severity in status.h:129. Currently correct, but there's no compile-time assertion.
  • Suggested fix: Add a static_assert in c.cc verifying value alignment.
L2. Unnecessary defensive null-check on auto_recoverydb/c.cc
  • Issue: The code checks auto_recovery != nullptr && *auto_recovery but the call site in event_helpers.cc:64 only invokes OnErrorRecoveryBegin when auto_recovery is guaranteed non-null.
  • Suggested fix: Keep the check -- defensive programming is appropriate for a public API callback.
L3. Constructor scalability — include/rocksdb/c.h
  • Issue: Each new EventListener callback requires a new constructor variant. The parameter list is now 14 parameters.
  • Suggested fix: Consider a struct-based or builder API for future additions. Not blocking for this PR.

Cross-Component Analysis

Context Code executes? Assumptions hold? Action needed?
WritePreparedTxnDB Yes (if listener registered) Yes Safe
ReadOnly DB No (no background errors) N/A Safe
CompactionService Yes (if listener registered) Yes Safe
User-defined timestamps Yes (if listener registered) Yes Safe
FIFO/Universal compaction Yes (if listener registered) Yes Safe

This is a pure C API wrapper -- no core logic changes. All execution contexts are safe.

Positive Observations

  • Backward compatibility is well preserved via delegation pattern
  • Null-checking for new optional callbacks is the right call (prevents crash when old constructor is used)
  • Stack allocation for rocksdb_status_ptr_t is slightly more efficient than the heap pattern used by OnBackgroundError
  • PermitUncheckedError() is correctly placed in both new callbacks
  • Severity enum values correctly match C++ Status::Severity
  • Test follows existing C API test conventions

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

@zaidoon1 zaidoon1 force-pushed the c-api-eventlistener-recovery branch from 9ba7526 to ad3243e Compare June 2, 2026 00:23
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

Codex Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit ad3243e


Codex review failed before producing findings.

WARNING: proceeding, even though we could not update PATH: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

Claude Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit ad3243e


Summary

Clean, well-structured C API addition that follows existing patterns. The delegation from rocksdb_eventlistener_create to the new constructor is correct and backwards-compatible. Stack-local wrappers are safe for synchronous callbacks and more efficient than the existing heap allocation pattern in OnBackgroundError.

High-severity findings (1):

  • [db/c.cc] Test-only invoke helpers (rocksdb_eventlistener_test_invoke_error_recovery_begin/end) are exported via ROCKSDB_LIBRARY_API, polluting the public shared library API.
Full review (click to expand)

Findings

🔴 HIGH

H1. Test helpers exposed in public API -- db/c.cc
  • Issue: rocksdb_eventlistener_test_invoke_error_recovery_begin() and rocksdb_eventlistener_test_invoke_error_recovery_end() are declared ROCKSDB_LIBRARY_API, making them part of the public shared library ABI. No other test-only helper functions exist in the C API. The test file (c_test.c) uses extern declarations to access them, confirming they are test-only.
  • Root cause: Functions intended solely for testing are exported with the same visibility as production API functions.
  • Suggested fix: Remove ROCKSDB_LIBRARY_API from both functions, or guard them with #ifdef ROCKSDB_TESTING, or declare them static and move the test invocation logic into the test file itself. Alternatively, keep them internal-linkage in c.cc and have the test call through the C++ listener interface directly (as the Java tests do in testable_event_listener.cc).

🟡 MEDIUM

M1. Inconsistent allocation pattern between OnBackgroundError and OnErrorRecoveryBegin -- db/c.cc:4185-4208
  • Issue: The existing OnBackgroundError override heap-allocates rocksdb_status_ptr_t (new/delete), while the new OnErrorRecoveryBegin uses stack allocation. Both are synchronous callbacks where stack allocation is safe and more efficient.
  • Root cause: The new code uses a better pattern than the existing code, but the inconsistency could confuse maintainers.
  • Suggested fix: Not a bug -- the stack approach is preferable. Consider a follow-up to update OnBackgroundError to also use stack allocation.
M2. Constructor proliferation pattern -- include/rocksdb/c.h
  • Issue: Adding rocksdb_eventlistener_create_with_error_recovery() extends the positional parameter list to 14 arguments. Future listener callbacks would require more variants.
  • Suggested fix: Follows existing RocksDB C API conventions. Acceptable for now.

🟢 LOW / NIT

L1. Redundant PermitUncheckedError in OnErrorRecoveryEnd -- db/c.cc:4218-4219
  • Issue: Both the C wrapper and EventHelpers::NotifyOnErrorRecoveryEnd() call PermitUncheckedError() on the same Status objects. Safe (idempotent) but redundant.
L2. No null check on auto_recovery in test helper -- db/c.cc:4299
  • Issue: Test helper dereferences auto_recovery without null check. Minor since test-only code always called with valid pointers.

Cross-Component Analysis

Context Executes? Safe? Notes
WritePreparedTxnDB Yes Yes No interaction with transaction visibility
ReadOnly DB Unlikely Yes Callbacks are no-ops if not registered
CompactionService Yes Yes Listeners execute locally
FIFO/Universal compaction Yes Yes Error recovery is compaction-style agnostic

Positive Observations

  • Static asserts for severity constants are excellent practice.
  • Stack-local allocation is safer and more efficient than the existing heap pattern.
  • Clean backwards-compatible delegation from old to new constructor.
  • Good test coverage of successful/failed recovery, severity, auto_recovery toggling, and destructor.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

@zaidoon1 zaidoon1 force-pushed the c-api-eventlistener-recovery branch 2 times, most recently from 53ada4e to f88fc70 Compare June 2, 2026 02:14
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

Codex Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit f88fc70


Codex review failed before producing findings.

WARNING: proceeding, even though we could not update PATH: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

Claude Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit f88fc70


Summary

Additive C API exposing OnErrorRecoveryBegin and OnErrorRecoveryEnd callbacks plus Status severity accessors. The implementation is generally sound and backwards-compatible. The main concern is that the new constructor pattern allows passing nullptr for non-recovery callbacks that lack null guards, creating a latent crash risk.

High-severity findings (1):

  • [db/c.cc:4119-4152] The new rocksdb_eventlistener_create_with_error_recovery constructor accepts all callback slots positionally, but the existing event override methods (e.g., OnFlushBegin, OnCompactionBegin) call their function pointers unconditionally without null checks. If a user passes nullptr for any non-recovery callback via the new constructor, a null function pointer dereference will crash.
Full review (click to expand)

Findings

🔴 HIGH

H1. Null function pointer dereference for non-recovery callbacks -- db/c.cc:4092-4152
  • Issue: The new rocksdb_eventlistener_create_with_error_recovery() constructor accepts all callback parameters positionally, including the original callbacks (flush, compaction, etc.). The old constructor implicitly required all callbacks to be non-null since the override methods call them unconditionally. The new constructor has no such enforcement. If a user passes nullptr for, say, on_flush_begin, the OnFlushBegin() override at line 4092 will dereference a null function pointer.
  • Root cause: The recovery callbacks correctly null-check before invocation (if (on_error_recovery_begin != nullptr)), but the existing callbacks never had null guards because the old constructor's callers always provided non-null pointers. The new constructor opens a path to null.
  • Suggested fix: Either: (a) add null checks to ALL existing override methods (consistent with the new recovery callback pattern), or (b) add a runtime assertion/check in rocksdb_eventlistener_create_with_error_recovery() that validates non-null for the required callbacks. Option (a) is preferable for forward compatibility and consistency.

🟡 MEDIUM

M1. Scalability concern with additive constructor pattern -- include/rocksdb/c.h
  • Issue: The rocksdb_eventlistener_create_with_error_recovery approach adds 2 new parameters to a 12-parameter constructor, creating a 14-parameter function. Future EventListener callback additions will require yet another constructor variant (e.g., create_with_error_recovery_and_blob_events), leading to combinatorial explosion.
  • Suggested fix: Consider a setter-based API for optional callbacks: create with rocksdb_eventlistener_create(), then call rocksdb_eventlistener_set_on_error_recovery_begin(listener, callback). This scales indefinitely and avoids the positional parameter problem. However, the current approach is acceptable for this PR if the team prefers consistency with the existing pattern.
M2. Missing test for null recovery callbacks via old constructor path -- db/listener_test.cc
  • Issue: No test verifies that a listener created via the original rocksdb_eventlistener_create() (which passes nullptr for recovery callbacks) correctly skips recovery callbacks when OnErrorRecoveryBegin/OnErrorRecoveryEnd are triggered. While the code null-checks correctly, this behavior should be tested.
  • Suggested fix: Add a test that creates a listener via rocksdb_eventlistener_create(), triggers OnErrorRecoveryBegin and OnErrorRecoveryEnd on it, and verifies no crash and no callback invocation.
M3. C test does not exercise callback functionality -- db/c_test.c:5097-5138
  • Issue: The c_test smoke test only verifies API creation/destruction and severity constants. It doesn't invoke any callbacks, so it provides no coverage for the actual callback wiring. The c_test callbacks (CTestOnErrorRecoveryBeginCheck, CTestOnErrorRecoveryEndCheck) are defined but never invoked.
  • Suggested fix: Either trigger the callbacks in c_test or remove the unused callback implementations and simplify the test. Currently the elaborate callback implementations in c_test.c (lines 96-225) create dead code.

🟢 LOW / NIT

L1. rocksdb_status_ptr_get_severity parameter should be const -- include/rocksdb/c.h
  • Issue: The function takes rocksdb_status_ptr_t* (non-const) but only reads from the status. It should take const rocksdb_status_ptr_t* for correctness.
  • Note: The existing rocksdb_status_ptr_get_error also takes non-const, so this is consistent with the current API but both are technically wrong.
L2. Stack-local wrapper pattern differs from existing OnBackgroundError -- db/c.cc:4185-4221
  • Issue: The new OnErrorRecoveryBegin uses stack-local rocksdb_status_ptr_t s while the existing OnBackgroundError at line 4138 uses new/delete for the same wrapper. The stack allocation is actually better (no heap allocation on error path), but the inconsistency is worth noting. Consider updating OnBackgroundError to match the new pattern in a follow-up.
L3. Unused callback stub functions in c_test.c -- db/c_test.c:96-225
  • Issue: Multiple callback stubs (CTestOnFlushBegin, CTestOnCompactionBegin, etc.) and the full CTestErrorRecoveryState struct with CTestOnErrorRecoveryBeginCheck/CTestOnErrorRecoveryEndCheck are defined but the callbacks are never actually fired. This is dead code that inflates the test file.

Cross-Component Analysis

Context Does code execute? Assumptions hold? Action needed?
WritePreparedTxnDB YES (inherits listeners) YES Safe -- C wrapper just forwards
ReadOnly DB YES (error handler exists) YES Safe -- pre-existing C++ behavior
CompactionService YES YES Safe -- listener ownership follows DB
User-defined timestamps N/A N/A No interaction
Old snapshots N/A N/A No interaction

The NotifyOnBackgroundError call sequence (line 62-66 of event_helpers.cc) is single-threaded within the listener loop iteration. The OnBackgroundError receives a mutable Status* and OnErrorRecoveryBegin receives *bg_error by value. This is intentional C++ API behavior -- the C wrapper correctly mirrors it. No data race exists because calls are sequential.

The PermitUncheckedError() calls on const BackgroundErrorRecoveryInfo& members are valid because PermitUncheckedError() is a const method that modifies mutable checked_.

Positive Observations

  • Backwards compatibility preserved: The old rocksdb_eventlistener_create() delegates cleanly to the new constructor with nullptr for recovery callbacks. No ABI break.
  • static_assert guards: Severity enum values are validated at compile time against the C++ enum, preventing silent drift.
  • Stack allocation improvement: Using stack-local rocksdb_status_ptr_t instead of heap allocation is a performance improvement over the existing OnBackgroundError pattern.
  • PermitUncheckedError placement: Correctly placed outside the null-check guard, ensuring Status objects are always marked as checked regardless of whether a callback is registered.
  • Comprehensive C++ test: The CApiErrorRecoveryCallbacks test in listener_test.cc thoroughly validates the callback wiring, parameter conversion, and auto_recovery propagation.
  • on_background_error_cb typedef change: Adding parameter names is a source- and ABI-compatible improvement that aids documentation.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

Expose EventListener::OnErrorRecoveryEnd through the C listener wrapper instead of the deprecated OnErrorRecoveryCompleted hook. The C callback now receives an opaque BackgroundErrorRecoveryInfo wrapper so callers can inspect both the old and new background error status, including severity.

Add compile-time severity alignment checks and cover the new C listener recovery callbacks without adding test-only helper functions to the RocksDB C library ABI. c_test verifies public C API construction and ownership, while listener_test deterministically exercises status severity access, auto_recovery propagation, recovery-end old/new status handling, null recovery callbacks through the legacy constructor, and null callback slots through the new constructor.

Make C event-listener callback slots optional before forwarding, so callers that only register a subset of callbacks cannot crash on unrelated listener events.
@zaidoon1 zaidoon1 force-pushed the c-api-eventlistener-recovery branch from f88fc70 to 75cce31 Compare June 2, 2026 02:49
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit 75cce31


Codex review failed before producing findings.

WARNING: proceeding, even though we could not update PATH: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit 75cce31


Summary

Solid, well-structured addition to the C API that follows existing patterns and maintains backward compatibility. The null-safety improvements and heap-to-stack optimization in OnBackgroundError are welcome cleanup. The core error recovery bridging logic is correct.

High-severity findings (1):

  • [include/rocksdb/c.h] Missing API documentation for pointer lifetime constraints -- C callbacks receive pointers to stack-allocated wrappers that must not be stored beyond callback scope.
Full review (click to expand)

Findings

🔴 HIGH

H1. Missing pointer lifetime documentation for callback parameters -- include/rocksdb/c.h
  • Issue: The new callback typedefs on_error_recovery_begin_cb and on_error_recovery_end_cb pass pointers to stack-allocated wrapper objects (rocksdb_status_ptr_t* and rocksdb_backgrounderrorrecoveryinfo_t*). C API consumers have no indication these pointers are only valid during the callback invocation. Storing them for later use causes use-after-free.
  • Root cause: The C API header has no comments on the new functions or callback types explaining the lifetime constraint. This is also true for the existing on_background_error_cb (which had the same issue with heap allocation, now changed to stack allocation).
  • Suggested fix: Add comments to the callback typedefs and the rocksdb_eventlistener_create_with_error_recovery() declaration in c.h stating that all pointer parameters in callbacks are only valid for the duration of the callback and must not be retained. Example:
    /* Note: bg_error and auto_recovery pointers are only valid during
       the callback invocation and must not be stored. */
    typedef void (*on_error_recovery_begin_cb)(void* state, uint32_t reason,
                                               rocksdb_status_ptr_t* bg_error,
                                               unsigned char* auto_recovery);

🟡 MEDIUM

M1. Redundant PermitUncheckedError() calls in OnErrorRecoveryEnd -- db/c.cc
  • Issue: The override calls info.old_bg_error.PermitUncheckedError() and info.new_bg_error.PermitUncheckedError(), but event_helpers.cc:256-257 already calls these on the same objects immediately after OnErrorRecoveryEnd() returns. The calls are idempotent (PermitUncheckedError is const and sets checked_ = true), so this is harmless but creates ambiguity about whose responsibility it is.
  • Root cause: Defensive programming -- the override doesn't trust the caller to clean up.
  • Suggested fix: This is a judgment call. The redundant calls provide defense-in-depth. If kept, add a comment explaining they're intentionally redundant. Alternatively, remove them and rely on the engine's cleanup (matching how OnBackgroundError relies on event_helpers.cc:63).
M2. Constructor parameter count scalability -- include/rocksdb/c.h
  • Issue: rocksdb_eventlistener_create_with_error_recovery() takes 14 parameters. Adding future callbacks (e.g., OnBlobFileCreated, OnFileCloseFinish) will require either another _with_* variant or extending this already-long parameter list.
  • Root cause: C API doesn't have a builder/struct pattern for callback registration.
  • Suggested fix: Consider this for a future refactor. The current approach is consistent with existing _with_* patterns (35+ instances in the codebase) and is acceptable for now.
M3. on_background_error_cb typedef parameter name addition -- include/rocksdb/c.h
  • Issue: The existing on_background_error_cb typedef changed from unnamed parameters (void*, uint32_t, rocksdb_status_ptr_t*) to named parameters (void* state, uint32_t reason, rocksdb_status_ptr_t* status). While ABI-compatible, this is a source-level change to an existing typedef.
  • Root cause: Consistency improvement with the new callback typedefs.
  • Suggested fix: Acceptable as-is, but note it in the change description.

🟢 LOW / NIT

L1. static_assert does not cover kMaxSeverity sentinel -- db/c.cc
  • Issue: The severity static_assert block validates values 0-4 but doesn't assert that the C API covers all valid severity values. If a new severity is added, the C API would silently miss it.
  • Suggested fix: Add static_assert(rocksdb_status_severity_unrecoverable_error + 1 == static_cast<unsigned char>(Status::Severity::kMaxSeverity)) to catch future additions.
L2. Null-check on destructor is defensive but warrants comment -- db/c.cc
  • Issue: Adding if (destructor_ != nullptr) changes the contract. Previously, passing nullptr as destructor would crash. Now it silently skips.
  • Suggested fix: Acceptable as-is.
L3. Test uses reinterpret_cast<EventListener*>(listener) -- db/listener_test.cc
  • Issue: static_cast would be more idiomatic since rocksdb_eventlistener_t inherits from EventListener.
  • Suggested fix: Replace with static_cast<EventListener*>(listener).
L4. Test coverage is unit-level only -- db/listener_test.cc, db/c_test.c
  • Issue: No integration test triggers actual error recovery through the DB. However, error_handler_fs_test.cc already covers the C++ path extensively, so bridge-layer unit tests are sufficient.
  • Suggested fix: Acceptable.

Cross-Component Analysis

Context Executes? Safe? Notes
WritePreparedTxnDB YES YES Same listener mechanism
ReadOnly DB NO N/A No error recovery events
CompactionService YES YES Listeners forwarded
Concurrent writers YES YES Stack-local per call
Multiple listeners YES YES auto_recovery skip behavior is existing C++ semantics

Positive Observations

  • Heap-to-stack optimization in OnBackgroundError eliminates unnecessary new/delete
  • Null-safety for all callbacks enables selective callback registration without crashes
  • static_assert for severity constants provides compile-time validation
  • Proper auto_recovery bridging with null-safety and != 0 back-conversion
  • Good test coverage: happy path, null safety, and legacy compatibility
  • Consistent pointer-wrapper pattern matching rocksdb_status_ptr_t

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant