Skip to content

Add bounded multipart upload parallelism#228

Open
jiuker wants to merge 5 commits into
minio:mainfrom
jiuker:fix-issue-216
Open

Add bounded multipart upload parallelism#228
jiuker wants to merge 5 commits into
minio:mainfrom
jiuker:fix-issue-216

Conversation

@jiuker

@jiuker jiuker commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

fix: #216

Add bounded multipart upload parallelism

PutObject now supports concurrent UploadPart calls via the new PutObjectArgs::max_inflight_parts field (default 1 = sequential behavior preserved).

Changes

File Change
include/miniocpp/args.h Added unsigned int max_inflight_parts = 1 to PutObjectArgs
src/args.cc Validate max_inflight_parts > 0
src/client.cc Refactored PutObject into sequential + parallel branches
tests/tests.cc Added PutObjectWithInflight() — MD5 integrity test

src/client.cc highlights

  • Sequential path (max_inflight_parts <= 1): Original single-buffer loop, unchanged.
  • Parallel path (max_inflight_parts > 1):
    • N page-aligned buffers allocated via AlignedBuffer pool.
    • Producer/consumer model: main thread reads stream, std::async dispatches UploadPart.
    • Bounded inflight cap via std::deque + drain_one lambda.
    • FIFO ordering preserved by part number — CompleteMultipartUpload doesn't care about completion order.
  • Bug fix: Corrected one_byte read-ahead indexing in the unknown-size path (buf[part_size] instead of buf[part_size + 1]).

Testing

PutObjectWithInflight() generates 10MB of random data, computes the original MD5, uploads with max_inflight_parts = 1, 2, 4, downloads each object, and verifies MD5 matches the original.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added max_inflight_parts to PutObjectArgs (default: 1) to control concurrent multipart part uploads.
    • Enabled bounded parallel multipart uploads when max_inflight_parts > 1.
  • Bug Fixes

    • Added validation to reject max_inflight_parts = 0.
    • Corrected multipart behavior for unknown final parts and improved progress-callback abortion.
    • Made credential fetching thread-safe.
  • Tests

    • Added an end-to-end MD5 verification test for multipart uploads using max_inflight_parts = 1, 2, and 4.

@jiuker jiuker requested a review from harshavardhana June 5, 2026 09:44
@harshavardhana

Copy link
Copy Markdown
Member

@coderabbitai

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

@harshavardhana Sure, I'll review the PR!

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 487255f3-efc8-4c5d-8833-0f38be55c48f

📥 Commits

Reviewing files that changed from the base of the PR and between 6111b69 and 7d05f5d.

📒 Files selected for processing (1)
  • src/client.cc
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/client.cc

📝 Walkthrough

Walkthrough

PutObjectArgs gains a max_inflight_parts field (default 1) validated to be non-zero. Client::PutObject is refactored into two branches: the existing sequential path with adjusted lookahead logic for unknown final parts (≤1 inflight) and progress tracking via std::optional<double>, and a new concurrent path (>1 inflight) that allocates multiple AlignedBuffers, launches UploadPart calls via std::async, and drains completions through a deque. Credential fetching is synchronized via mutex to support concurrent scenarios. An integration test exercises inflight values 1, 2, and 4 with MD5 verification.

Changes

Concurrent multipart upload and credential synchronization

Layer / File(s) Summary
PutObjectArgs field and validation
include/miniocpp/args.h, src/args.cc
Adds unsigned int max_inflight_parts = 1 to PutObjectArgs and rejects zero in Validate().
Thread-safe credential fetching
include/miniocpp/providers.h, src/providers.cc
Adds <mutex> include and private std::mutex fetch_mutex_ to ChainedProvider; acquires the lock in Fetch() to serialize credential cache and provider fetch operations.
AlignedBuffer move semantics and includes
src/client.cc
Adds <deque>, <future>, <vector> and related includes; implements move construction and move assignment on AlignedBuffer to enable multi-buffer ownership transfer.
Sequential upload path adjustments
src/client.cc
Refactors sequential multipart read loop via a read_part_data helper returning error::Error; fixes unknown final part lookahead bookkeeping by reading the saved byte from buf[part_size] instead of buf[part_size + 1]; changes progress upload-speed tracking to std::optional<double> with averaged samples and safe final reporting via value_or(-1.0).
Concurrent inflight upload pipeline
src/client.cc
Implements the max_inflight_parts > 1 branch: allocates multiple page-aligned buffers, optionally registers RDMA buffers, reads parts with lookahead handling, schedules UploadPart via std::async, tracks inflight work in a deque, drains completions, handles single-part fallback, checks progressfunc return to abort on false, propagates early failures with multipart abort, and calls CompleteMultipartUpload on success.
PutObjectWithInflight integration test
tests/tests.cc
Uploads ~100 MiB with max_inflight_parts of 1, 2, and 4; downloads each result; verifies MD5 integrity; removes the object. Registered in main() after PutObject().

Sequence Diagram(s)

sequenceDiagram
    participant App
    participant PutObject as Client::PutObject
    participant AsyncPool as std::async workers
    participant Server as MinIO Server

    App->>PutObject: PutObject(args{max_inflight_parts=N})
    PutObject->>PutObject: Validate() — reject if max_inflight_parts==0
    PutObject->>Server: InitiateMultipartUpload → upload_id

    rect rgba(70, 130, 180, 0.5)
        note over PutObject,AsyncPool: Concurrent path (N > 1)
        loop for each part buffer (round-robin)
            PutObject->>AsyncPool: std::async UploadPart(buffer, part_num)
            AsyncPool->>Server: PUT ?uploadId&partNumber=K
            Server-->>AsyncPool: ETag
        end
        PutObject->>AsyncPool: drain_one() per completed future
        AsyncPool-->>PutObject: Part{etag, checksum}
    end

    PutObject->>Server: CompleteMultipartUpload(parts[])
    Server-->>App: result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • perf: parallel multipart UploadPart #216: This PR directly implements the proposal in issue #216 — adding PutObjectArgs::max_inflight_parts with a default of 1, refactoring Client::PutObject into a concurrent producer/consumer pipeline using std::async, allocating and optionally registering multiple page-aligned RDMA buffers for parallel part uploads, and introducing thread-safe credential fetching via mutex protection in ChainedProvider.

Suggested reviewers

  • harshavardhana

Poem

🐇 Hop hop, the parts now fly in parallel streams,
No waiting in line — async uploads, it seems!
A deque of futures, a drain with care,
Round-robin buffers floating through the air.
Credentials locked safe with mutex's might —
Concurrent uploads done swift and right! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add bounded multipart upload parallelism' accurately and concisely summarizes the primary change—introducing a new bounded parallelism feature for multipart uploads via the max_inflight_parts field.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/tests.cc (1)

711-721: ⚡ Quick win

Increase part count so bounded inflight behavior is actually exercised.

With ~10 MiB input and auto part sizing, this test usually produces only 2 parts. That validates correctness, but it doesn’t stress the producer/drain cap logic for max_inflight_parts (especially 2 and 4). Consider forcing a smaller explicit part size and larger payload so part_count > max_inflight_parts.

Suggested patch
-    // 10MB data -> auto-calc ~5MiB parts = 2 parts, exercising multipart path
-    const size_t data_size = 10 * 1024 * 1024;
+    // Force many parts so inflight cap/drain path is exercised.
+    const size_t part_size = 5 * 1024 * 1024;     // 5 MiB
+    const size_t data_size = (part_size * 5) + 7; // 6 parts total
@@
-      minio::s3::PutObjectArgs args(ss, static_cast<long>(data_size), 0);
+      minio::s3::PutObjectArgs args(ss, static_cast<long>(data_size),
+                                    static_cast<long>(part_size));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/tests.cc` around lines 711 - 721, The test with 10MB data produces only
2 parts due to auto-calculated part sizing, which doesn't adequately stress the
bounded inflight behavior for max_inflight_parts values of 2 and 4. Increase the
data_size constant or specify a smaller explicit part size in the PutObjectArgs
to ensure the part count exceeds the maximum inflight_values being tested
(particularly for the values 2 and 4), so the producer/drain cap logic is
properly exercised across all inflight test cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/client.cc`:
- Line 568: The user-controlled max_inflight_parts value is not properly
constrained before buffer allocation, which can cause memory exhaustion if a
huge value is passed. After the cast assignment of max_inflight_parts from
args.max_inflight_parts on line 568, add clamping logic to constrain
max_inflight_parts to both the known part_count value and a documented maximum
safe limit. This clamping must occur before any reserve() calls or buffer
allocations happen. Apply the same clamping logic to the related code block at
lines 730-744 that also uses max_inflight_parts.
- Line 774: The call to inflight_upload.future.get() at line 774 can throw
std::system_error or other exceptions from the async operation, but these
exceptions are not being caught and converted to error responses as required by
the function's error handling pattern. Wrap the future.get() call in a try-catch
block, catch std::system_error and other relevant exceptions, and convert them
into appropriate error responses instead of allowing them to propagate as
unhandled exceptions. Apply the same exception handling pattern to the similar
future.get() calls mentioned at lines 925-929 to ensure consistent error
handling throughout the function.

---

Nitpick comments:
In `@tests/tests.cc`:
- Around line 711-721: The test with 10MB data produces only 2 parts due to
auto-calculated part sizing, which doesn't adequately stress the bounded
inflight behavior for max_inflight_parts values of 2 and 4. Increase the
data_size constant or specify a smaller explicit part size in the PutObjectArgs
to ensure the part count exceeds the maximum inflight_values being tested
(particularly for the values 2 and 4), so the producer/drain cap logic is
properly exercised across all inflight test cases.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 4f7e0ee8-250d-468d-86aa-9748d3b82b27

📥 Commits

Reviewing files that changed from the base of the PR and between f35873b and 23b0b43.

📒 Files selected for processing (4)
  • include/miniocpp/args.h
  • src/args.cc
  • src/client.cc
  • tests/tests.cc

Comment thread src/client.cc Outdated
Comment thread src/client.cc Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/client.cc`:
- Around line 1077-1221: The multipart upload loop has a data race where
buf_pool[buf_idx] is overwritten at the top of the loop while a previous
UploadPart async task that captured a pointer into the same buffer may still be
running. The capacity drain that waits for the slot to be available happens too
late (after the buffer has already been read into). Move the capacity wait logic
(the if block checking inflight.size() >= max_inflight that pops from inflight
and waits for the future) to the beginning of the loop before the buffer read
operation (before the char* buf = static_cast<char*>(buf_pool[buf_idx].ptr)
line). This ensures we wait for any in-flight tasks using a buffer before
reusing that buffer. After moving the capacity wait to the beginning, remove the
now-redundant capacity wait block that was originally at line 1170-1181.
- Around line 1049-1056: The concurrent multipart upload path does not invoke
the progress callback function (`args.progressfunc`) that is available and used
in the sequential path. To fix this, you need to integrate progress reporting
into the concurrent code path by invoking `args.progressfunc` whenever parts
complete during concurrent uploads. Specifically, add progress callback
invocation in the section where completed parts are being drained (around the
area mentioned at line 1172) to report progress as each part finishes uploading.
Alternatively, if implementing this is not feasible, add explicit documentation
in the code or API documentation clarifying that `progressfunc` callbacks are
not supported when `max_inflight_parts > 1`.

In `@tests/tests.cc`:
- Around line 739-742: The RemoveObject cleanup calls at the PutObject error
path (around line 740) and at lines 770-772 can throw exceptions that overwrite
the original test failure. Wrap these RemoveObject cleanup calls in try-catch
blocks to make them best-effort operations that don't propagate exceptions. This
ensures the original exception from PutObject or other operations is preserved
and not masked by cleanup failures. Apply this pattern consistently to all error
paths where RemoveObject is called for cleanup purposes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: cf3aff48-3fed-4b36-a1cf-059315f701ea

📥 Commits

Reviewing files that changed from the base of the PR and between c1b68bf and 3969066.

📒 Files selected for processing (4)
  • include/miniocpp/args.h
  • src/args.cc
  • src/client.cc
  • tests/tests.cc
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/args.cc
  • include/miniocpp/args.h

Comment thread src/client.cc
Comment thread src/client.cc Outdated
Comment thread tests/tests.cc
refactor

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/tests.cc`:
- Around line 722-723: The comment above the data_size variable declaration
states "Generate 10MB of random data" but the actual allocation is 100 * 1024 *
1024 bytes, which equals 100 MiB. Either update the comment to say "Generate
100MiB of random data" to match the actual allocation, or change the data_size
assignment to 10 * 1024 * 1024 if 10 MiB was the intended size. Ensure the
comment and code are aligned.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b6246c40-53c7-4693-9632-f13999974ae6

📥 Commits

Reviewing files that changed from the base of the PR and between 3969066 and fb3447e.

📒 Files selected for processing (4)
  • include/miniocpp/args.h
  • src/args.cc
  • src/client.cc
  • tests/tests.cc
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/args.cc
  • include/miniocpp/args.h
  • src/client.cc

Comment thread tests/tests.cc Outdated
apply suggestion

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/client.cc (3)

1150-1178: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Preserve single-PUT fallback for unknown-size small streams.

When the first read discovers part_count == 1, this branch still starts multipart upload. That diverges from the sequential path’s BaseClient::PutObject fallback and changes small-object behavior for callers that set max_inflight_parts > 1.

🛠️ Proposed fix
       uploaded_size += part_size;
 
+      if (part_count.has_value() && *part_count == 1) {
+        PutObjectApiArgs api_args;
+        api_args.extra_query_params = args.extra_query_params;
+        api_args.bucket = args.bucket;
+        api_args.region = args.region;
+        api_args.object = args.object;
+        api_args.data = std::string_view(buf, part_size);
+        api_args.buf = buf;
+        api_args.size = part_size;
+        api_args.progressfunc = args.progressfunc;
+        api_args.progress_userdata = args.progress_userdata;
+        api_args.headers = headers;
+        return BaseClient::PutObject(api_args);
+      }
+
       // Create multipart upload on first part.
       if (upload_id.empty()) {

As per PR objectives, this path should handle “single-part fallback to BaseClient::PutObject”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/client.cc` around lines 1150 - 1178, When the first read determines there
is only one part to upload (part_count == 1), the code should fall back to using
BaseClient::PutObject instead of initiating a multipart upload. Add a check
before the CreateMultipartUploadArgs instantiation to test if part_count has a
value and equals 1; if true, execute the BaseClient::PutObject call with the
buffered data and skip the CreateMultipartUpload flow entirely. This ensures
single-part uploads use the optimized path rather than the multipart upload
process.

1098-1105: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Honor progress callback cancellation in the parallel path.

The sequential path aborts when progressfunc returns false; these parallel drains ignore that return value, so caller-requested cancellation continues uploading and can still complete the multipart upload.

🛠️ Proposed fix
           actual_args.uploaded_bytes = completed_bytes;
           actual_args.userdata = args.progress_userdata;
-          args.progressfunc(actual_args);
+          if (!args.progressfunc(actual_args)) {
+            first_err = error::Error("aborted by progress function");
+            break;
+          }
         }
       }
         actual_args.uploaded_bytes = completed_bytes;
         actual_args.userdata = args.progress_userdata;
-        args.progressfunc(actual_args);
+        if (!args.progressfunc(actual_args) && !first_err) {
+          first_err = error::Error("aborted by progress function");
+        }
       }
     }

Also applies to: 1230-1237

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/client.cc` around lines 1098 - 1105, The parallel upload path is ignoring
the return value of the progressfunc callback, which means caller-requested
cancellation (when progressfunc returns false) is not being honored. In the code
block where args.progressfunc(actual_args) is called (around line 1098), capture
the return value instead of discarding it, then check if it returns false and
abort the parallel drain operation if so. Apply the same fix to the second
location mentioned at lines 1230-1237. Reference how the sequential path handles
progressfunc return value cancellation and implement the same logic in both
parallel code sections.

1037-1049: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Enable RDMA per registered buffer, not per pool.

!rdma_regs.empty() only proves at least one buffer registered. If a later buffer registration fails, parts using that buffer still get rdmaclient and can take the RDMA path with an unregistered buffer.

🛠️ Proposed fix
 `#ifdef` MINIO_CPP_RDMA
     // Register each pool buffer for RDMA, mirroring the sequential path.
     cuObjClient& rdma_client = SharedRDMAClient();
     std::vector<ScopedRDMARegistration> rdma_regs;
+    std::vector<bool> rdma_registered(max_inflight, false);
     rdma_regs.reserve(max_inflight);
     bool rdma_connected = rdma_client.isConnected();
     if (rdma_connected) {
       for (unsigned int i = 0; i < max_inflight; i++) {
         char* pool_buf = static_cast<char*>(buf_pool[i].ptr);
         if (rdma_client.cuMemObjGetDescriptor(pool_buf, args.part_size) == 0) {
           rdma_regs.emplace_back(&rdma_client, pool_buf);
+          rdma_registered[i] = true;
         }
       }
     }
 `#endif`
 `#ifdef` MINIO_CPP_RDMA
-      if (!rdma_regs.empty()) {
+      if (rdma_registered[buf_idx]) {
         up_args.rdmaclient = &rdma_client;
       }

Also applies to: 1191-1194

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/client.cc` around lines 1037 - 1049, The RDMA registration logic
currently checks if any buffer was registered using !rdma_regs.empty(), but this
is insufficient because a buffer could fail registration while others succeed.
Instead of making the RDMA decision at the pool level, track registration status
per individual buffer by creating a data structure (such as a bool vector
aligned with buf_pool indices) that records which specific buffers were
successfully registered by cuMemObjGetDescriptor(). When determining whether to
use the RDMA path for any operation, check the registration status of the
specific buffer being used rather than checking if the rdma_regs vector is
non-empty. This ensures unregistered buffers never attempt the RDMA path even if
other buffers in the pool are registered.
🧹 Nitpick comments (1)
src/client.cc (1)

1026-1026: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Trim comments that only restate the next statement.

These headings are redundant with the code immediately below them; keep the non-obvious lifetime/RDMA/race comments.

As per coding guidelines, **/*.{h,cc,cpp,hpp}: “Do not add obvious comments (e.g., 'Safe: Validate() ensures X has value')”.

Also applies to: 1161-1161, 1181-1181, 1222-1222, 1253-1253

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/client.cc` at line 1026, Remove redundant comments that only restate what
the code immediately below them does. In src/client.cc, delete the obvious
comments at lines 1026, 1161, 1181, 1222, and 1253 that simply describe the code
statement (like "Allocate pool of page-aligned buffers"). Keep only comments
that provide non-obvious information about lifetime semantics, RDMA behavior,
race conditions, or other implementation details that aren't immediately
apparent from the code itself.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/client.cc`:
- Around line 1150-1178: When the first read determines there is only one part
to upload (part_count == 1), the code should fall back to using
BaseClient::PutObject instead of initiating a multipart upload. Add a check
before the CreateMultipartUploadArgs instantiation to test if part_count has a
value and equals 1; if true, execute the BaseClient::PutObject call with the
buffered data and skip the CreateMultipartUpload flow entirely. This ensures
single-part uploads use the optimized path rather than the multipart upload
process.
- Around line 1098-1105: The parallel upload path is ignoring the return value
of the progressfunc callback, which means caller-requested cancellation (when
progressfunc returns false) is not being honored. In the code block where
args.progressfunc(actual_args) is called (around line 1098), capture the return
value instead of discarding it, then check if it returns false and abort the
parallel drain operation if so. Apply the same fix to the second location
mentioned at lines 1230-1237. Reference how the sequential path handles
progressfunc return value cancellation and implement the same logic in both
parallel code sections.
- Around line 1037-1049: The RDMA registration logic currently checks if any
buffer was registered using !rdma_regs.empty(), but this is insufficient because
a buffer could fail registration while others succeed. Instead of making the
RDMA decision at the pool level, track registration status per individual buffer
by creating a data structure (such as a bool vector aligned with buf_pool
indices) that records which specific buffers were successfully registered by
cuMemObjGetDescriptor(). When determining whether to use the RDMA path for any
operation, check the registration status of the specific buffer being used
rather than checking if the rdma_regs vector is non-empty. This ensures
unregistered buffers never attempt the RDMA path even if other buffers in the
pool are registered.

---

Nitpick comments:
In `@src/client.cc`:
- Line 1026: Remove redundant comments that only restate what the code
immediately below them does. In src/client.cc, delete the obvious comments at
lines 1026, 1161, 1181, 1222, and 1253 that simply describe the code statement
(like "Allocate pool of page-aligned buffers"). Keep only comments that provide
non-obvious information about lifetime semantics, RDMA behavior, race
conditions, or other implementation details that aren't immediately apparent
from the code itself.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 23067705-c006-4284-b2e2-0cb5836c1752

📥 Commits

Reviewing files that changed from the base of the PR and between fb3447e and 5264d08.

📒 Files selected for processing (2)
  • src/client.cc
  • tests/tests.cc
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/tests.cc

@jiuker

jiuker commented Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

@jiuker Sure, I'll review the PR!

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/client.cc (1)

1233-1234: 🩺 Stability & Availability | 🔴 Critical

Fix race condition in concurrent credential fetching on shared provider.

Multiple concurrent UploadPart threads invoke provider->Fetch() without synchronization—both through the HTTP path (ToHttpRequestBuildHeadersprovider->Fetch() at request.cc:350) and the RDMA path (rdmaPutprovider->Fetch() at rdma.h:132). ChainedProvider::Fetch() mutates creds_ and provider_ members without locks (providers.cc:78–95), creating data races when concurrent threads call Fetch() simultaneously. This can corrupt credentials, cause unsigned requests, or produce invalid signatures.

Protect the provider's credential fetching with a mutex, or ensure credentials are fetched once per request in a thread-safe manner before dispatching to std::async.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/client.cc` around lines 1233 - 1234, Multiple concurrent UploadPart
threads spawned via std::async are invoking provider->Fetch() without
synchronization, causing data races when ChainedProvider::Fetch() mutates creds_
and provider_ members. Fix this by fetching credentials in a thread-safe manner
before dispatching the std::async call in the lambda, or by adding mutex
protection around the provider's Fetch() method calls. Ensure that either
credentials are obtained once per request prior to creating the async task, or
that concurrent access to the provider's internal state is serialized with a
lock.
♻️ Duplicate comments (2)
src/client.cc (2)

1021-1035: 🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift

Clamp max_inflight_parts before allocating the buffer pool.

max_inflight is taken directly from the user-controlled args.max_inflight_parts (validated only against 0) and one full part buffer is allocated per slot at Lines 1028-1035. A small upload paired with an accidentally huge value can exhaust memory before any useful work begins. Clamp to the known part_count and to a documented maximum before allocating.

🛡️ Proposed guard
   if (args.max_inflight_parts > 1) {
     unsigned int max_inflight = args.max_inflight_parts;
+    if (args.part_count.has_value() &&
+        max_inflight > static_cast<unsigned int>(*args.part_count)) {
+      max_inflight = static_cast<unsigned int>(*args.part_count);
+    }
+    if (max_inflight > static_cast<unsigned int>(utils::kMaxMultipartCount)) {
+      max_inflight = static_cast<unsigned int>(utils::kMaxMultipartCount);
+    }
     size_t alloc_size =
         (args.part_count > 0) ? args.part_size : (args.part_size + 1);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/client.cc` around lines 1021 - 1035, The max_inflight variable derived
from user-controlled args.max_inflight_parts is used directly to allocate
buffers without proper constraints, which can lead to memory exhaustion. Before
the buffer allocation loop that calls AlignedAlloc, add validation to clamp the
max_inflight value to a reasonable maximum and to args.part_count (if part_count
is greater than zero) to prevent excessive memory allocation from untrusted
input.

1233-1234: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Convert std::async/future.get() exceptions into PutObjectResponse errors.

std::async(std::launch::async, …) can throw std::system_error when a thread cannot be started, and future.get() (Lines 1090 and 1244) rethrows any exception from UploadPart. These would escape as thrown exceptions and bypass the function's error-response contract; wrap the dispatch and the get() calls so failures become first_err/error responses.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/client.cc` around lines 1233 - 1234, The std::async call that creates
ip.future and the subsequent future.get() calls at lines 1090 and 1244 can throw
exceptions (std::system_error from thread creation, or exceptions rethrown from
UploadPart) that escape as uncaught exceptions rather than being converted to
error responses. Wrap the std::async dispatch in a try-catch block to handle
std::system_error and set first_err appropriately, and similarly wrap both
future.get() calls in try-catch blocks to catch and convert any exceptions from
UploadPart into error responses instead of letting them propagate.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/client.cc`:
- Around line 1233-1234: Multiple concurrent UploadPart threads spawned via
std::async are invoking provider->Fetch() without synchronization, causing data
races when ChainedProvider::Fetch() mutates creds_ and provider_ members. Fix
this by fetching credentials in a thread-safe manner before dispatching the
std::async call in the lambda, or by adding mutex protection around the
provider's Fetch() method calls. Ensure that either credentials are obtained
once per request prior to creating the async task, or that concurrent access to
the provider's internal state is serialized with a lock.

---

Duplicate comments:
In `@src/client.cc`:
- Around line 1021-1035: The max_inflight variable derived from user-controlled
args.max_inflight_parts is used directly to allocate buffers without proper
constraints, which can lead to memory exhaustion. Before the buffer allocation
loop that calls AlignedAlloc, add validation to clamp the max_inflight value to
a reasonable maximum and to args.part_count (if part_count is greater than zero)
to prevent excessive memory allocation from untrusted input.
- Around line 1233-1234: The std::async call that creates ip.future and the
subsequent future.get() calls at lines 1090 and 1244 can throw exceptions
(std::system_error from thread creation, or exceptions rethrown from UploadPart)
that escape as uncaught exceptions rather than being converted to error
responses. Wrap the std::async dispatch in a try-catch block to handle
std::system_error and set first_err appropriately, and similarly wrap both
future.get() calls in try-catch blocks to catch and convert any exceptions from
UploadPart into error responses instead of letting them propagate.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 09ee2e6e-234d-42b6-bbe0-12bf3f7aba8d

📥 Commits

Reviewing files that changed from the base of the PR and between fb3447e and e685a61.

📒 Files selected for processing (2)
  • src/client.cc
  • tests/tests.cc
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/tests.cc

@jiuker

jiuker commented Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

@jiuker Sure, I'll review the PR!

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/client.cc (1)

1090-1090: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Use std::optional for the unset upload-speed state.

upload_speed = -1 models “not observed yet”; prefer std::optional<double> here, or remove the field in this branch if concurrent uploads do not report speed.

As per coding guidelines, **/*.{h,cc,cpp,hpp} should “Use std::optional<T> for values that may be uninitialized.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/client.cc` at line 1090, The upload_speed variable is currently
initialized to -1 to represent an unobserved state, which relies on magic values
and doesn't clearly express intent. Replace the double upload_speed variable
declaration with std::optional<double> upload_speed, then update all code that
checks or uses upload_speed to work with the optional type (checking if it has a
value before dereferencing, and initializing it without a value when unset).
This aligns with the coding guidelines requiring std::optional for potentially
uninitialized values.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/client.cc`:
- Around line 1290-1301: The code unconditionally appends a Part to the parts
vector after the try-catch block in the final drain, without validating that the
UploadPartResponse is successful. After the try-catch block that handles
exceptions and sets first_err, add a check to verify the response is valid
before calling parts.push_back and report_progress, similar to how the capacity
drain handles invalid responses. If the response indicates failure, set
first_err and skip appending the part to prevent a failed upload from being
incorrectly reported as completed progress.

---

Nitpick comments:
In `@src/client.cc`:
- Line 1090: The upload_speed variable is currently initialized to -1 to
represent an unobserved state, which relies on magic values and doesn't clearly
express intent. Replace the double upload_speed variable declaration with
std::optional<double> upload_speed, then update all code that checks or uses
upload_speed to work with the optional type (checking if it has a value before
dereferencing, and initializing it without a value when unset). This aligns with
the coding guidelines requiring std::optional for potentially uninitialized
values.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: caf53dec-4df1-45fd-9a96-8dded270a04c

📥 Commits

Reviewing files that changed from the base of the PR and between e685a61 and 6111b69.

📒 Files selected for processing (3)
  • include/miniocpp/providers.h
  • src/client.cc
  • src/providers.cc

Comment thread src/client.cc
@jiuker

jiuker commented Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

@jiuker Sure, I'll review the PR!

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf: parallel multipart UploadPart

2 participants