fix(test): unblock root pixi run test workflow #1978

Merged
rparolin merged 6 commits into NVIDIA:main from rparolin:fix-pixi-run-test
May 5, 2026

@rparolin rparolin commented Apr 25, 2026

Collaborator

Summary

pixi run test was failing end-to-end. Two stacked root causes:

  1. List-form pixi cmd arrays didn't expand $PIXI_ENVIRONMENT_NAME. The root test-* tasks dropped into each sub-package without forwarding the active environment, so the inner pixi run picked the sub-package's default environment. In cuda_bindings/pixi.toml and cuda_core/pixi.toml, default has no cuda-version pin — the conda solver resolved a CUDA-12.x toolkit, and the build failed compiling Cython output that references CUatomicOperation_enum (a CUDA-13.x-only symbol). Fix: wrap each test-* task in bash -c '…' so the shell expands $PIXI_ENVIRONMENT_NAME and forwards it explicitly via -e to the inner pixi run.

  2. Cython couldn't find cross-package .pxd files under pixi-build's editable install. pixi-build installs cuda-bindings via a PEP 660 finder hook; runtime imports honor it but Cython's filesystem .pxd resolver does not. Fix: replace the cythonize CLI invocation in both cuda_core/tests/cython/build_tests.sh and cuda_bindings/tests/cython/build_tests.sh with a small Python wrapper (build_tests.py) that calls Cython.Build.cythonize() with an explicit include_path resolved at runtime from cuda.bindings.__file__. This avoids platform-specific PYTHONPATH separator handling and surfaces missing-import failures as Python exceptions instead of silent fallbacks.
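The second root cause can be illustrated with a toy model. This is a minimal sketch, not pixi-build's or Cython's actual code: `RedirectFinder`, `found_on_search_path`, and the module name `demo_editable_pkg` are hypothetical names invented for the illustration. It shows how an import hook can make a module importable while a resolver that only walks directories still cannot see the neighbouring `.pxd` file until its directory is passed explicitly.

```python
import importlib
import importlib.abc
import importlib.util
import sys
import tempfile
from pathlib import Path


class RedirectFinder(importlib.abc.MetaPathFinder):
    """Toy stand-in for an editable-install hook: maps a module name to a file."""

    def __init__(self, mapping):
        self.mapping = mapping  # module name -> path of its source file

    def find_spec(self, fullname, path=None, target=None):
        location = self.mapping.get(fullname)
        if location is None:
            return None
        return importlib.util.spec_from_file_location(fullname, location)


def found_on_search_path(relative_name, search_dirs):
    """Mimic a resolver that only checks directories and ignores import hooks."""
    return any((Path(d) / relative_name).is_file() for d in search_dirs)


with tempfile.TemporaryDirectory() as src:
    # The "source tree" lives outside sys.path, as with an editable install.
    module_file = Path(src) / "demo_editable_pkg.py"
    module_file.write_text("VALUE = 42\n")
    (Path(src) / "demo_editable_pkg.pxd").write_text("cdef int value\n")

    finder = RedirectFinder({"demo_editable_pkg": str(module_file)})
    sys.meta_path.insert(0, finder)
    try:
        # The runtime import goes through the hook and succeeds.
        module = importlib.import_module("demo_editable_pkg")
        import_ok = module.VALUE == 42
        # A directory-only lookup over sys.path does not find the .pxd file.
        pxd_found = found_on_search_path("demo_editable_pkg.pxd", sys.path)
        # Adding the source directory explicitly (the include_path idea) does.
        pxd_found_with_include = found_on_search_path(
            "demo_editable_pkg.pxd", [*sys.path, src]
        )
    finally:
        sys.meta_path.remove(finder)
        sys.modules.pop("demo_editable_pkg", None)

print(import_ok, pxd_found, pxd_found_with_include)
```

The last lookup corresponds to the fix described above: the directory is handed to the resolver directly instead of relying on the hook.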

Files changed

  • pixi.toml: three test-* tasks switched to bash -c '…' form forwarding -e "$PIXI_ENVIRONMENT_NAME"; comment block expanded with rationale and Linux-only scope note.
  • cuda_core/tests/cython/build_tests.sh: adds set -euo pipefail; invokes new build_tests.py instead of the cythonize CLI; still owns CPLUS_INCLUDE_PATH / CL setup.
  • cuda_core/tests/cython/build_tests.py: new; resolves the cuda namespace package's parent dir at runtime and runs cythonize() + setup(... build_ext --inplace) with an explicit include_path.
  • cuda_bindings/tests/cython/build_tests.sh: adds set -euo pipefail; switches to the new wrapper; preserves CPLUS_INCLUDE_PATH / CL setup.
  • cuda_bindings/tests/cython/build_tests.py: new; mirror of the cuda_core wrapper, with nthreads=1 to preserve the previous -j 1 deterministic-build behavior.

🤖 Generated with Claude Code

`pixi run test` failed at the cuda_bindings build stage because list-form
pixi `cmd` arrays didn't expand `$PIXI_ENVIRONMENT_NAME` reliably, so the
inner per-package `pixi run` calls picked the cuda_bindings default
environment (no cuda-version pin). The conda solver then resolved
cuda-version=12.9 and the build failed with a missing `CUatomicOperation_enum`
(a CUDA-13.x-only symbol).

Wrap the three test-* tasks in `bash -c '...'` so the shell expands
`$PIXI_ENVIRONMENT_NAME` and forward it explicitly via `-e` to each inner
pixi run.
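
The difference between an argument list and a `bash -c` string can be reproduced outside pixi. The sketch below uses Python's `subprocess` as an analogy only; it does not model pixi's task runner, and the environment name `cu13` is an assumed value chosen for the example. It needs `bash` on the PATH, in line with the Linux-only scope noted in pixi.toml.

```python
import os
import subprocess
import sys

# Assumed environment name, used only for this demonstration.
env = {**os.environ, "PIXI_ENVIRONMENT_NAME": "cu13"}

# Argument-list form: no shell is involved, so the child process receives
# the text "$PIXI_ENVIRONMENT_NAME" as a literal argument.
literal = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", "$PIXI_ENVIRONMENT_NAME"],
    env=env,
    capture_output=True,
    text=True,
    check=True,
).stdout.strip()

# bash -c form: bash expands the variable before the command runs.
expanded = subprocess.run(
    ["bash", "-c", 'printf "%s" "$PIXI_ENVIRONMENT_NAME"'],
    env=env,
    capture_output=True,
    text=True,
    check=True,
).stdout.strip()

print(literal)   # $PIXI_ENVIRONMENT_NAME
print(expanded)  # cu13
```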

Once the bindings build was unblocked, cuda_core's cython test build hit a
second issue: `cythonize` cannot resolve `cimport cuda.bindings.*` against
pixi-build's editable install, which exposes the cuda namespace package via a
finder hook that Cython's filesystem .pxd resolver does not consult. Replace
the `cythonize` CLI invocation with a small Python wrapper that calls
`Cython.Build.cythonize()` with an explicit `include_path` resolved from the
imported `cuda.bindings` package.
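
A wrapper along these lines could look roughly as follows. This is a sketch under assumptions, not the contents of the merged build_tests.py: the helper names `namespace_parent` and `build` and the `*.pyx` pattern are hypothetical, and the layout `<parent>/cuda/bindings/__init__.py` is assumed. `Cython.Build.cythonize()` and its `include_path` and `nthreads` keywords are real Cython API.

```python
import importlib
from pathlib import Path


def namespace_parent(package_file):
    """Return the directory that contains the top-level `cuda` package.

    For <parent>/cuda/bindings/__init__.py this returns <parent>.
    """
    return Path(package_file).parents[2]


def build(pattern="*.pyx", nthreads=0):
    """Cythonize and build in place, pointing Cython at cuda.bindings' .pxd files."""
    # Imported lazily so the path helper above works without these packages.
    from Cython.Build import cythonize
    from setuptools import setup

    bindings = importlib.import_module("cuda.bindings")
    include_dir = str(namespace_parent(bindings.__file__))
    # include_path tells Cython where to search for cimported .pxd files.
    extensions = cythonize(pattern, include_path=[include_dir], nthreads=nthreads)
    setup(ext_modules=extensions, script_args=["build_ext", "--inplace"])
```

Calling `build()` from the test directory would then replace the `cythonize` CLI call; if `cuda.bindings` is missing, the import raises instead of the build silently continuing without the include path.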

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rparolin rparolin added this to the cuda.core v1.0.0 milestone Apr 25, 2026
@rparolin rparolin added the bug (Something isn't working), CI/CD (CI/CD infrastructure), and cuda.core (Everything related to the cuda.core module) labels Apr 25, 2026
@rparolin rparolin requested a review from cpcloud April 25, 2026 08:13
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rparolin rparolin self-assigned this Apr 25, 2026
Drop cuda_core/tests/cython/build_tests.py in favor of a small PYTHONPATH
shim in build_tests.sh. Same outcome (Cython's .pxd resolver finds
cuda.bindings via the package's parent directory), three lines instead of
a separate setuptools entry point.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@leofang leofang added the P1 (Medium priority - Should do) label May 1, 2026
Replace the PYTHONPATH shim in cuda_core/tests/cython/build_tests.sh
with a small Python driver (build_tests.py) that calls
Cython.Build.cythonize() with an explicit include_path resolved at
runtime from cuda.bindings.__file__. Avoids platform-specific
PYTHONPATH separator handling and surfaces missing-import failures
as Python exceptions instead of silent fallbacks.

Apply the same wrapper pattern to cuda_bindings/tests/cython/ for
symmetry; both shell scripts gain `set -eo pipefail` and `${VAR:-}`
defaults so the previously optional CPLUS_INCLUDE_PATH / CL env vars
keep working under stricter error mode.

Expand the pixi.toml comment block to document why each test-* task
is wrapped in `bash -c '...'` and note Linux-only scope.

Verified end-to-end: `pixi run test` passes 4,314 tests
(974 pathfinder + 384 bindings + 2,956 core), 208 skipped, 2 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the cuda.bindings (Everything related to the cuda.bindings module) label May 1, 2026
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rparolin rparolin enabled auto-merge (squash) May 1, 2026 21:23
@rparolin rparolin requested a review from isVoid May 1, 2026 23:05
rparolin commented May 1, 2026

Collaborator Author

Code review

No issues found. Checked for bugs and AGENTS.md compliance.

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

@rparolin rparolin removed the cuda.bindings (Everything related to the cuda.bindings module) and cuda.core (Everything related to the cuda.core module) labels May 2, 2026

@isVoid isVoid left a comment

Contributor

Thanks. Both fixes make sense to me.

@rparolin rparolin merged commit 7bd6397 into NVIDIA:main May 5, 2026
182 of 184 checks passed
github-actions Bot commented May 5, 2026

Doc Preview CI
Preview removed because the pull request was closed or merged.

rparolin added a commit that referenced this pull request Jun 9, 2026
* Fix cuda-version pin lag breaking local pixi run test

The bindings were regenerated against CUDA 13.3.0 (cc50515), adding NVRTC
symbols (NVRTC_ERROR_BUSY, nvrtcBundledHeadersInfo, nvrtcGetBundledHeadersInfo),
but the pixi cuda-version pins stayed at 13.2 in cuda_bindings/pixi.toml and
cuda_core/pixi.toml. `pixi run test` then built 13.3-referencing Cython code
against a 13.2 nvrtc.h and failed with "'nvrtcBundledHeadersInfo' was not
declared in this scope". CI was unaffected because it builds wheels from
ci/versions.yml (13.3.0) rather than via pixi run test.

Bump the cuda-version pins (build-variants + feature.cu13) from 13.2.* to
13.3.* in both packages so the local toolkit matches the regenerated sources
and ci/versions.yml. Re-solved pixi.lock files accordingly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Place cython-test .so in tests/cython regardless of cwd

tests/cython/build_tests.py runs `build_ext --inplace`, which writes the
compiled .so relative to the current working directory. pixi runs the
build-cython-tests task from the project root, so the .so landed in the
package root instead of tests/cython/, where pytest imports it by bare module
name. The test only passed previously because a correctly-placed .so from an
earlier build persisted (gitignored); a clean checkout fails with
ModuleNotFoundError.

chdir to the script directory before build_ext --inplace so the .so lands next
to its .pyx in both cuda_bindings and cuda_core (kept aligned per #1978).
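
The chdir step can be sketched as a small context manager. This is an illustrative sketch, not the code in build_tests.py: `working_directory` and `script_directory` are hypothetical helper names, and the temporary directory only stands in for tests/cython/.

```python
import contextlib
import os
import tempfile
from pathlib import Path


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the current working directory, then restore it."""
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield Path(path)
    finally:
        os.chdir(previous)


def script_directory(script_file):
    """Directory containing the given script, independent of the caller's cwd."""
    return Path(script_file).resolve().parent


# Demonstration with a temporary directory standing in for tests/cython/.
start = Path.cwd()
with tempfile.TemporaryDirectory() as tmp:
    fake_script = Path(tmp) / "build_tests.py"
    fake_script.write_text("# placeholder\n")
    with working_directory(script_directory(fake_script)):
        inside = Path.cwd()
    after = Path.cwd()
    expected_inside = Path(tmp).resolve()
```

In a build script, the `setup(... build_ext --inplace)` call would sit inside `with working_directory(script_directory(__file__)):` so the output lands next to the sources whatever directory the task was started from.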

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
rparolin added a commit that referenced this pull request Jun 9, 2026
…2185)

* Fix cuda-version pin lag breaking local pixi run test

The bindings were regenerated against CUDA 13.3.0 (cc50515), adding NVRTC
symbols (NVRTC_ERROR_BUSY, nvrtcBundledHeadersInfo, nvrtcGetBundledHeadersInfo),
but the pixi cuda-version pins stayed at 13.2 in cuda_bindings/pixi.toml and
cuda_core/pixi.toml. `pixi run test` then built 13.3-referencing Cython code
against a 13.2 nvrtc.h and failed with "'nvrtcBundledHeadersInfo' was not
declared in this scope". CI was unaffected because it builds wheels from
ci/versions.yml (13.3.0) rather than via pixi run test.

Bump the cuda-version pins (build-variants + feature.cu13) from 13.2.* to
13.3.* in both packages so the local toolkit matches the regenerated sources
and ci/versions.yml. Re-solved pixi.lock files accordingly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Place cython-test .so in tests/cython regardless of cwd

tests/cython/build_tests.py runs `build_ext --inplace`, which writes the
compiled .so relative to the current working directory. pixi runs the
build-cython-tests task from the project root, so the .so landed in the
package root instead of tests/cython/, where pytest imports it by bare module
name. The test only passed previously because a correctly-placed .so from an
earlier build persisted (gitignored); a clean checkout fails with
ModuleNotFoundError.

chdir to the script directory before build_ext --inplace so the .so lands next
to its .pyx in both cuda_bindings and cuda_core (kept aligned per #1978).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ci: validate pixi run test source-build (PR smoke + nightly GPU)

Main CI tests prebuilt wheels and never exercises the pixi source build, so
that developer path rots silently on CUDA-pin / generated-source / conda-forge
/ cython-build drift (#2182, #2183).

Add a workflow that runs the pixi source build:
- build-smoke (PRs touching the at-risk files): CPU-only. Source-builds
  bindings + core, imports them, builds the cython test extensions and checks
  placement. Catches the compile / ABI / .so-placement regressions without a GPU.
- full-test (nightly + manual): GPU runner, full `pixi run test`.

Shared pixi install factored into a composite action with an explicit,
asserted version pin.

Relates to #2183 (validate the source-build path over time); the regressions
this guards against are #2182, fixed by #2180.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ci: declare linux-amd64-gpu-l4-latest-1 self-hosted runner label for actionlint

actionlint validates static runner labels against its known set; the new
full-test job uses a literal GPU label (existing GPU jobs dodge this by
building the label from a matrix expression). Declare it so pre-commit's
actionlint hook passes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ci: checkout full history + tags so setuptools-scm derives the real version

A shallow checkout has no tags, so the source-built packages get
setuptools-scm's 0.1.dev1 fallback. cuda.core's import-time guard then
rejects cuda.bindings ("12.x or 13.x must be installed"). Use fetch-depth: 0
in both jobs so the build resolves the real 13.x version.
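
Why the fallback version trips the guard can be shown with a simplified check. This sketch is hypothetical and is not cuda.core's actual guard: `major_version`, `check_bindings_version`, and `SUPPORTED_MAJORS` are invented names, and only the "12.x or 13.x" rule comes from the message quoted above.

```python
SUPPORTED_MAJORS = (12, 13)  # from the quoted "12.x or 13.x" requirement


def major_version(version):
    """Return the leading integer of a version string such as '13.3.0'."""
    return int(version.split(".", 1)[0])


def check_bindings_version(version):
    """Raise ImportError unless the major version is one of the supported ones."""
    major = major_version(version)
    if major not in SUPPORTED_MAJORS:
        raise ImportError(
            f"cuda.bindings 12.x or 13.x must be installed (found {version})"
        )
    return major


print(check_bindings_version("13.3.0"))  # 13

try:
    check_bindings_version("0.1.dev1")  # the fallback seen on a tagless checkout
except ImportError as exc:
    print(exc)
```

With the full history and tags fetched, the derived version starts with 13, so a check of this shape passes.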

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ci: use pinned prefix-dev/setup-pixi instead of curl|bash installer

Addresses review (@mdboom): the composite action shelled out to
`curl -fsSL https://pixi.sh/install.sh | bash`, an unverified installer
(the codecov.io supply-chain failure mode). Replace it with
prefix-dev/setup-pixi pinned to a commit SHA (v0.9.6) — its install logic
is auditable and pinned — and delete the composite action file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ci: switch workflow to pinned prefix-dev/setup-pixi (fixup)

The prior commit only removed the composite action file; this commits the
workflow change that actually uses prefix-dev/setup-pixi@<sha> in both jobs
(and drops the now-unneeded curl from the container apt install). Without
this the workflow referenced the deleted ./.github/actions/setup-pixi.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Labels

bug (Something isn't working), CI/CD (CI/CD infrastructure), P1 (Medium priority - Should do)

3 participants