
csrr minstret exposes a one-instruction-stale retired count #263

Description

@jf-cc727

Bug Description

NutShell returns a minstret value that omits the immediately preceding retired instruction. In the shortest self-contained reproducer, a warm-up read of minstret is followed by two visible reads. Architecturally, the two visible reads must differ by one, because the first visible CSR read has retired by the time the second one executes. Spike returns s0=1, s1=2; NutShell returns s0=1, s1=1.

csrr zero, minstret
csrr s0, minstret
csrr s1, minstret

The sequence does not depend on the reset value of minstret or on a software write to it. It isolates whether a retirement is visible to the very next instruction that reads the retired-instruction count.

Additional control experiments show that the issue is not specific to a preceding csrr minstret. Replacing the immediately preceding instruction with nop or csrr mcycle still produces the same off-by-one mismatch in the difftest report (right = 6 from the reference, wrong = 5 from NutShell), and inserting an extra retired instruction shifts the mismatch to right = 7, wrong = 6. The observed behavior is therefore that csrr minstret exposes a count that lags one retired instruction behind the architectural value in every case tested.

RISC-V Specification Requirement

Zicsr states that a CSR read observes the value before that instruction executes, and CSR state changes are observed in program order by subsequent instructions. minstret counts retired instructions, so a csrr minstret must observe all architecturally older retired instructions while excluding only its own retirement. Therefore the value read by csrr minstret must already include the retirement of the immediately preceding instruction, whether that preceding instruction is another CSR read or an unrelated retired instruction.

Reference: https://docs.riscv.org/reference/isa/v20260120/unpriv/zicsr.html#_csr_access_ordering
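
The ordering rule above can be illustrated with a small behavioral model. This is only a sketch of the architectural expectation, not Spike or NutShell code: the `run` function, the tuple encoding of instructions, and the starting count of 0 (chosen to match the reference trace) are all assumptions made for the example.

```python
# Illustrative model of the architectural rule; not Spike or NutShell code.
def run(program, minstret=0):
    """Execute (mnemonic, rd, csr) tuples and return the registers written.

    A csrr of minstret observes the count *before* its own retirement,
    which already includes every older retired instruction.
    The starting count of 0 is an assumption taken from the reference trace.
    """
    regs = {}
    for op, rd, csr in program:
        if op == "csrr" and csr == "minstret" and rd != "zero":
            regs[rd] = minstret  # value before this instruction retires
        minstret += 1            # each instruction in this model retires once
    return regs


repro = [
    ("csrr", "zero", "minstret"),  # warm-up read
    ("csrr", "s0", "minstret"),    # first visible read
    ("csrr", "s1", "minstret"),    # second visible read
]

print(run(repro))  # {'s0': 1, 's1': 2}
```

Under this model the difference s1 - s0 is 1 for any starting count, which is why the reproducer does not depend on the reset value.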

Steps to Reproduce

Run the supplied poc/program.elf under difftest. Compare the two visible reads at PCs 0x80000004 and 0x80000008. This is the minimal reproducer; control experiments with nop and csrr mcycle before csrr minstret show the same one-instruction lag.

Core source sequence (warm-up read, two visible reads, and delta check):

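csrr zero, minstret           # warm-up read, result discarded
csrr s0, minstret             # first visible read
csrr s1, minstret             # second visible read
sub  t0, s1, s0               # t0 = s1 - s0
li   t1, 1                    # expected delta
bne  t0, t1, fail_bad_delta   # fail unless the delta is exactly 1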

Expected Result

(s1 - s0) == 1; in the supplied reference trace, s0=1, s1=2. More generally, a csrr minstret should include the retirement of the immediately preceding instruction.
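
As a rough illustration of the pass condition, the delta check from the source sequence can be mirrored in a few lines. The helper name `delta_ok` and the default `XLEN = 64` are assumptions for this sketch; an RV32 configuration would use 32.

```python
XLEN = 64  # assumption: RV64 configuration; use 32 for RV32


def delta_ok(s0, s1, xlen=XLEN):
    """Mirror `sub t0, s1, s0; li t1, 1; bne t0, t1, fail_bad_delta`."""
    t0 = (s1 - s0) % (1 << xlen)  # sub wraps modulo 2**XLEN
    return t0 == 1


print(delta_ok(1, 2))  # True  (reference values from this report)
print(delta_ok(1, 1))  # False (NutShell values from this report)
```

Because the subtraction wraps modulo 2**XLEN, the check also passes when the counter rolls over between the two reads.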

Actual Result

NutShell returns s0=1, s1=1 in the minimal reproducer:

[01] csrr s0, minstret ... data 1
[02] csrr s1, minstret ... data 1
REF s0=1 s1=2
s1 different ... right=2, wrong=1

Control experiments show the same lag with other preceding instructions:

  • preceding nop then csrr minstret: right = 6, wrong = 5
  • preceding csrr mcycle then csrr minstret: right = 6, wrong = 5
  • one additional retired instruction before the read: right = 7, wrong = 6
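
The reported pairs can be tabulated to confirm that the lag is the same in every experiment. This is simple arithmetic on the numbers quoted in this report, taking right as the reference value and wrong as the NutShell value, as in the difftest line "right=2, wrong=1"; the dictionary keys are informal labels, not difftest output.

```python
# (right, wrong) pairs copied from the report; labels are informal.
mismatches = {
    "minimal reproducer (s1)": (2, 1),
    "nop before csrr minstret": (6, 5),
    "csrr mcycle before csrr minstret": (6, 5),
    "one extra retired instruction": (7, 6),
}

# Lag = reference value minus the value NutShell returned.
lags = {name: right - wrong for name, (right, wrong) in mismatches.items()}

for name, lag in lags.items():
    print(f"{name}: lag = {lag}")
```

Every entry yields a lag of exactly one retired instruction, independent of which instruction precedes the read.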

NS-4.zip
