Bug Description
NutShell returns a minstret value that omits the immediately preceding retired instruction. In the shortest self-contained reproducer, a warm-up read of minstret is followed by two visible reads. Architecturally, those two visible reads should differ by one because the first visible CSR read retires normally. Spike returns s0=1, s1=2; NutShell returns s0=1, s1=1.
csrr zero, minstret
csrr s0, minstret
csrr s1, minstret
This avoids depending on reset value or a software write to minstret. It isolates next-instruction visibility of the retired-instruction count.
Additional control experiments show that the issue is broader than a preceding csrr minstret. Replacing the immediately preceding instruction with nop or csrr mcycle still produces the same off-by-one mismatch (difftest reports right = 6, wrong = 5), and inserting an extra retired instruction shifts the mismatch to right = 7, wrong = 6. The observed behavior is therefore that csrr minstret can expose a count that lags one retired instruction behind the architectural value, regardless of which instruction immediately precedes it.
RISC-V Specification Requirement
Zicsr states that a CSR read observes the value before that instruction executes, and CSR state changes are observed in program order by subsequent instructions. minstret counts retired instructions, so a csrr minstret must observe all architecturally older retired instructions while excluding only its own retirement. Therefore the value read by csrr minstret must already include the retirement of the immediately preceding instruction, whether that preceding instruction is another CSR read or an unrelated retired instruction.
Reference: https://docs.riscv.org/reference/isa/v20260120/unpriv/zicsr.html#_csr_access_ordering
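The requirement above can be illustrated with a small Python model. This is only a sketch of the expected architectural behavior, not NutShell or Spike code. It assumes every instruction in the sequence retires, that nothing writes minstret, and that counting is not inhibited. The function name expected_minstret_reads and the start parameter are hypothetical names introduced for this illustration.

```python
def expected_minstret_reads(program, start=0):
    """Return the value each 'csrr minstret' in `program` should observe.

    Assumptions (illustrative only): every instruction retires, nothing
    writes minstret, and counting is not inhibited.
    `start` is the counter value before the first instruction.
    """
    retired = start
    reads = []
    for insn in program:
        if insn == "csrr minstret":
            # The read sees all older retirements but not its own.
            reads.append(retired)
        retired += 1  # the instruction itself retires afterwards
    return reads


# Minimal reproducer: warm-up read, then the two visible reads (s0, s1).
_, s0, s1 = expected_minstret_reads(["csrr minstret"] * 3)
print(s0, s1)

# The delta does not depend on the starting value of the counter.
_, a, b = expected_minstret_reads(["csrr minstret"] * 3, start=100)
print(b - a)

# Control variant: an unrelated instruction between the two visible reads.
_, c, d = expected_minstret_reads(
    ["csrr minstret", "csrr minstret", "nop", "csrr minstret"]
)
print(d - c)
```

Under these assumptions the two visible reads always differ by exactly one, whatever the starting count, and an intervening nop raises the delta to two. A delta of zero, as NutShell reports, is not reachable in this model.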
Steps to Reproduce
Run the supplied poc/program.elf under difftest. Compare the two visible reads at PCs 0x80000004 and 0x80000008. This is the minimal reproducer; control experiments with nop and csrr mcycle before csrr minstret show the same one-instruction lag.
Core source sequence (warm-up read, two visible reads, and delta check):
csrr zero, minstret
csrr s0, minstret
csrr s1, minstret
sub t0, s1, s0
li t1, 1
bne t0, t1, fail_bad_delta
Expected Result
(s1 - s0) == 1; in the supplied reference trace, s0=1, s1=2. More generally, a csrr minstret should include the retirement of the immediately preceding instruction.
Actual Result
NutShell returns s0=1, s1=1 in the minimal reproducer:
[01] csrr s0, minstret ... data 1
[02] csrr s1, minstret ... data 1
REF s0=1 s1=2
s1 different ... right=2, wrong=1
Control experiments show the same lag with other preceding instructions:
- preceding nop, then csrr minstret: right = 6, wrong = 5
- preceding csrr mcycle, then csrr minstret: right = 6, wrong = 5
- one additional retired instruction before the read: right = 7, wrong = 6
NS-4.zip