Skip to content

feat(rest): vend storage credentials into per-table FileIO#729

Open
huan233usc wants to merge 1 commit into
apache:mainfrom
huan233usc:feat/rest-storage-credential-vending
Open

feat(rest): vend storage credentials into per-table FileIO#729
huan233usc wants to merge 1 commit into
apache:mainfrom
huan233usc:feat/rest-storage-credential-vending

Conversation

@huan233usc

Copy link
Copy Markdown

What & why

The REST catalog already declares the .../namespaces/{ns}/tables/{tbl}/credentials
endpoint path, but LoadTableResult discards the storage-credentials the server
returns. As a result, catalogs that vend per-table storage credentials cannot be used
end to end — the client has no credentials to read/write the table's data files.

This wires up vended storage credentials and, in doing so, resolves the existing
FIXME: support per-table FileIO creation in RestCatalog::LoadTable.

What changed

  • StorageCredential type (prefix + config) and a parsed
    LoadTableResult.storage_credentials list, with JSON round-trip support in
    catalog/rest/json_serde.cc.
  • MakeTableFileIO (catalog/rest/rest_file_io.cc): builds a table-scoped FileIO by
    layering, in increasing precedence, the catalog configuration → the table's config
    the most-specific (longest-prefix) vended credential, following the REST spec guidance
    to "choose the most specific prefix". When the table supplies no overrides, the shared
    catalog FileIO instance is reused. The existing MakeCatalogFileIO is untouched, so
    there is no behavior change for non-vending servers.
  • MatchStorageCredential: longest-prefix selection helper.
  • Wired through LoadTable, CreateTable, RegisterTable, and StageCreateTable.
    (UpdateTable is intentionally left alone — its CommitTableResponse does not carry
    storage credentials in the spec.)

Known limitations / follow-ups

Surfaced during self-review; all are pre-existing structural gaps that this PR narrows
but does not fully close. Happy to file issues / follow up on these:

  1. Commit path: UpdateTable still returns a Table bound to the shared catalog
    FileIO (the spec's CommitTableResponse carries no credentials), so a table obtained
    from a commit loses vended credentials that the equivalent LoadTable would have.
  2. Multi-prefix tables: one credential is selected against the table's base location
    and baked into a single FileIO. Tables whose data/metadata span multiple credential
    prefixes (e.g. split write.data.path / write.metadata.path) would need per-path
    credential routing (Java routes per-prefix inside S3FileIO).
  3. Credential expiry/refresh: vended credentials are typically short-lived STS
    tokens; there is no refresh hook yet. The .../credentials endpoint
    (Endpoint::TableCredentials()) exists and is the natural follow-up for refresh.
  4. No FileIO caching: a new FileIO is constructed per load when the response carries
    table config or credentials; a cache keyed by merged properties would avoid repeated
    S3 client construction.
  5. The client does not yet send X-Iceberg-Access-Delegation: vended-credentials by
    default (Java does); servers that gate vending on that header require it to be set
    via the header. catalog property for now.

Testing

  • Full REST unit suite passing (rest_catalog_test, 284 tests).
  • New cases:
    • serde round-trip and deserialize of storage-credentials (incl. multiple
      credentials);
    • MatchStorageCredential longest-prefix / no-match;
    • MakeTableFileIO reuses the catalog FileIO when there are no overrides, and applies
      the most-specific vended credential over both the catalog and table config.
  • Manually verified against a credential-vending REST catalog: LoadTable parses the
    vended credentials and MakeTableFileIO selects and applies them when constructing
    the per-table FileIO.

This pull request and its description were written by Isaac.

The REST catalog declared the `.../credentials` endpoint path but
`LoadTableResult` discarded the `storage-credentials` the server returns,
so credential-vending catalogs could not access table data end to end.

- Add a `StorageCredential` type and parse the `storage-credentials` list
  on `LoadTableResult` (load/create/register/stage responses), with JSON
  round-trip support.
- Add `MakeTableFileIO`, which builds a table-scoped FileIO by layering the
  most-specific (longest-prefix) vended credential and the table's `config`
  on top of the catalog configuration, reusing the shared catalog FileIO
  when there are no overrides. This also resolves the per-table FileIO
  FIXME in `RestCatalog::LoadTable`.
- Wire it through LoadTable / CreateTable / RegisterTable / StageCreateTable.
- Unit tests for serde and for credential selection / FileIO layering.
@huan233usc huan233usc force-pushed the feat/rest-storage-credential-vending branch from 586c27c to 668aaa4 Compare June 11, 2026 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant