Feature or enhancement
Proposal:
The RecursionError-raising tests in test_copy expose a performance bottleneck on free-threading builds: comparing two lists or dicts recursively acquires the same PyCriticalSection2 at every level of the recursion. The full suite takes ~50ms on GIL-enabled builds and ~2.5s on free-threading builds.
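For reference, the pattern can be reproduced outside the test suite. The following is a minimal sketch, assuming the affected tests compare a self-referential list with its deep copy (as the reflexive tests in test_copy do); the helper name `time_reflexive_compare` is made up for illustration and is not part of the test suite.

```python
import copy
import operator
import time


def time_reflexive_compare(op=operator.lt):
    """Compare a self-referential list with its deep copy and time it.

    Hypothetical helper: it mimics the shape of the RecursionError-raising
    tests, it is not copied from test_copy.
    """
    x = []
    x.append(x)            # x now contains itself
    y = copy.deepcopy(x)   # y is a distinct list that contains itself

    start = time.perf_counter()
    try:
        op(y, x)           # recurses element by element until the limit is hit
    except RecursionError:
        raised = True
    else:
        raised = False
    return raised, time.perf_counter() - start


raised, elapsed = time_reflexive_compare()
print(f"RecursionError raised: {raised}, elapsed: {elapsed * 1e3:.2f} ms")
```

Running this on a GIL-enabled build and on a `--disable-gil` build should give a rough idea of the per-comparison cost, although absolute numbers will vary by machine.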
On main with plain ./configure:
$ ./python.exe -m test -w test_copy
Using random seed: 948858056
0:00:00 load avg: 6.78 mem: 26.9 MiB Run 1 test sequentially in a single process
0:00:00 load avg: 6.78 mem: 26.9 MiB [1/1] test_copy
0:00:00 load avg: 6.78 mem: 43.4 MiB [1/1] test_copy passed
== Tests result: SUCCESS ==
1 test OK.
Total duration: 47 ms
Total tests: run=83
Total test files: run=1/1
Result: SUCCESS
On main with ./configure --disable-gil:
$ ./python.exe -m test -w test_copy
Using random seed: 739047776
0:00:00 load avg: 6.07 mem: 32.1 MiB Run 1 test sequentially in a single process
0:00:00 load avg: 6.07 mem: 32.1 MiB [1/1] test_copy
0:00:02 load avg: 6.07 mem: 48.4 MiB [1/1] test_copy passed
== Tests result: SUCCESS ==
1 test OK.
Total duration: 2.7 sec
Total tests: run=83
Total test files: run=1/1
Result: SUCCESS
We should implement, for recursive two-mutex critical sections (PyCriticalSection2), the same optimization that is already implemented for the single-mutex PyCriticalSection. A quick code change brought test_copy back down to ~50ms.
This became more evident in #149784 where test_copy takes ~1m30s.
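To illustrate the idea, here is a toy model in Python. It assumes the optimization amounts to skipping the lock when the innermost active critical section on the current thread already holds both requested mutexes. That is my reading of the single-mutex fast path, not a copy of the CPython code. `ToyThread`, `begin2`, `end2` and the acquisition counter are all hypothetical names, and counting acquisitions only stands in for the real cost of going through the slow path.

```python
class ToyThread:
    """Toy stand-in for per-thread critical-section state (not CPython's)."""

    def __init__(self):
        self.active = []        # active sections, innermost last
        self.acquisitions = 0   # how many mutex acquisitions were performed

    def begin2(self, m1, m2, fast_path=True):
        wanted = frozenset((m1, m2))
        # Assumed fast path: the innermost section already holds both
        # mutexes, so this nested section does not need to lock anything.
        if fast_path and self.active and wanted <= self.active[-1]:
            return None
        self.acquisitions += len(wanted)
        self.active.append(wanted)
        return wanted

    def end2(self, token):
        # A skipped section (token is None) has nothing to release.
        if token is not None:
            popped = self.active.pop()
            assert popped is token


def nested_compare(thread, depth, fast_path):
    """Enter the same two-mutex section `depth` times, like a recursive compare."""
    token = thread.begin2("x", "y", fast_path)
    try:
        if depth > 1:
            nested_compare(thread, depth - 1, fast_path)
    finally:
        thread.end2(token)


def count_acquisitions(depth, fast_path):
    thread = ToyThread()
    nested_compare(thread, depth, fast_path)
    return thread.acquisitions


print(count_acquisitions(100, fast_path=False))  # 200
print(count_acquisitions(100, fast_path=True))   # 2
```

In this model the work without the fast path grows with the recursion depth, while with it only the outermost section locks anything. That matches the shape of the slowdown described above.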
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response