fix(error-reporting): silence ResolutionError to prevent crash reports#1115
fix(error-reporting): silence ResolutionError to prevent crash reports#1115sentry[bot] wants to merge 3 commits into
Conversation
|
BYK
left a comment
There was a problem hiding this comment.
We want to keep these as users don't randomly type bogus event ids.
There was a problem hiding this comment.
CI is failing because the tests weren't updated to match the new behavior. Two test failures:
-
classifySilenced > does NOT silence ResolutionError(line 250) — thetest.eachblock explicitly assertsclassifySilenced(new ResolutionError(...))returnsnull. Need to moveResolutionErrorout of that block and add a dedicated test asserting it returns"user_input_error". -
reportCliError integration > captures ResolutionError(line 450) — assertscaptureExceptionis called forResolutionError, but silenced errors skip capture. This test should instead assertcaptureExceptionis NOT called and the silenced metric IS emitted with reason"user_input_error".
Also, the module-level JSDoc (line 6) lists only OutputError, AuthError, and ApiError as silenced — should mention ResolutionError now too.
|
@jared-outpost fix all these issues |
Codecov Results 📊✅ Patch coverage is 100.00%. Project has 5048 uncovered lines. Coverage diff@@ Coverage Diff @@
## main #PR +/-##
==========================================
+ Coverage 81.35% 81.36% +0.01%
==========================================
Files 392 392 —
Lines 27070 27075 +5
Branches 17566 17575 +9
==========================================
+ Hits 22019 22027 +8
- Misses 5051 5048 -3
- Partials 1832 1830 -2Generated by Codecov Action |
MathurAditya724
left a comment
There was a problem hiding this comment.
I reviewed this after the test/docs update. I agree with the existing maintainer objection, and this still should not merge as-is.
Blocking finding:
src/lib/error-reporting.ts:61-63silences everyResolutionErrorunconditionally. That is too broad and directly contradicts the earlier review: these are not all random user typos.ResolutionErroris used for event, issue, replay, log, and trace lookup failures that we still want to capture for product/CLI observability.
Additional impact:
src/lib/telemetry.ts:233-238usesclassifySilencedto decide whether to mark the session crashed. After this PR, everyResolutionErrorstops producing a Sentry error and also stops counting as a crashed session. That release-health behavior change is broader than the PR description and is not justified here.- The existing
ResolutionErrorgrouping/context logic insrc/lib/error-reporting.tsbecomes mostly dead for normal command errors, and the replacementcli.error.silencedmetric only recordserror_class=ResolutionErrorplus a generic reason, losing resource/detail context.
If there are specific expected resolution failures that should be silenced, the class needs a narrower signal or subtype/reason. Silencing the entire class is conceptually wrong.
Silencing the entire ResolutionError class was too broad — it suppressed project/issue/replay/log/trace lookup failures that we still want captured for observability, and stopped them counting as crashed sessions. Add an opt-in `expected` flag (default false) and only silence instances that set it. Mark the event-not-found sites in event/view.ts as expected (the originally intended case). Preserve resource_kind on the silenced metric so sub-grouping context isn't lost.
MathurAditya724
left a comment
There was a problem hiding this comment.
Re-reviewed the latest head (3b1abea). The broad class-wide ResolutionError silencing was fixed, but I still do not think this is ready because the new opt-in call sites appear to preserve the behavior BYK objected to.
Remaining blocker:
src/commands/event/view.ts:501-509andsrc/commands/event/view.ts:703-710mark event-not-foundResolutionErrors as{ expected: true }, so those event ID misses are still not captured and still do not mark the session crashed. BYK's review said: “We want to keep these as users don't randomly type bogus event ids.” That objection was specifically about missing/bogus event IDs, so narrowing the class-wide silence to these event lookup sites does not resolve the maintainer concern.
The opt-in mechanism itself is a much safer shape than silencing every ResolutionError, and the tests now correctly preserve capture for plain ResolutionErrors. But before this can merge, the PR needs a maintainer decision on whether event-not-found should actually be silenced. If BYK's stance stands, remove { expected: true } from these event lookup call sites and keep them captured.
|
Thanks @MathurAditya724 — agreed the opt-in shape is the safer mechanism, and you're right that this comes down to a maintainer call on the two event-not-found sites. The reason I marked those But I can see the opposite reading ("keep these captured"), and I don't want to guess on a contested interpretation that flips the PR's behavior. @BYK could you confirm which you meant for the event-not-found case (
I'll make the change as soon as there's a call. |
The
ResolutionErroris thrown when a user provides an event ID that cannot be found, which is an expected user input error. Currently, these errors are not explicitly silenced insrc/lib/error-reporting.ts, leading to them being reported to Sentry as unhandled crash reports. This fix updates theclassifySilencedfunction to recognizeResolutionErrorand classify it as auser_input_error, preventing these expected errors from generating unnecessary Sentry issues.Fixes CLI-RP