Narrow regex subject to non-decimal-int-string in the non-matching branch of preg_match #5804

Open
phpstan-bot wants to merge 1 commit into phpstan:2.2.x from phpstan-bot:create-pull-request/patch-7fc2q4x

@phpstan-bot (Collaborator)

Summary

When a string is tested against an anchored decimal-integer regex such as /^-?[0-9]+$/, PHPStan already narrows the subject to decimal-int-string in the matching branch (added in the decimal-int-string regex work). However, the non-matching branch was left as plain string. This change narrows the subject to non-decimal-int-string in that branch, completing the dichotomy string = decimal-int-string | non-decimal-int-string.
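
As a minimal sketch of the behavior described above: the function name `classifySubject` is hypothetical, and the type annotations in the comments restate this PR's description rather than verified PHPStan output.

```php
<?php

// Hypothetical helper for illustration only. The comments restate the
// narrowing described in this PR; they are not verified PHPStan output.
function classifySubject(string $s): string
{
    if (preg_match('/^-?[0-9]+$/', $s)) {
        // Matching branch: $s is narrowed to decimal-int-string
        // (behavior that existed before this change).
        return 'decimal-int-string';
    }

    // Non-matching branch: previously plain string; with this change
    // $s is narrowed to non-decimal-int-string.
    return 'non-decimal-int-string';
}
```

The returned labels simply mirror the type each branch is expected to carry, which makes the runtime split easy to compare against the static one.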

Changes

  • src/Type/Php/PregMatchTypeSpecifyingExtension.php:
    • The subject-narrowing block now runs for both the truthy and falsey contexts (previously only truthy).
    • Added negateSubjectType() which, for the falsey branch, maps the matched subject type to its representable complement. Only decimal-int-string → non-decimal-int-string is mapped; everything else returns null (no narrowing).
    • In the falsey branch the complement is specified with a true context so the subject is set to the complement; the truthy branch keeps its original behavior.
  • tests/PHPStan/Analyser/nsrt/bug-14766.php: regression test.
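
The mapping rule of negateSubjectType() can be modelled roughly as follows. This is a toy sketch, not the actual implementation: the real helper's signature is not shown in this description, so types are stood in for by their textual names, and `negateSubjectTypeSketch` is a hypothetical name.

```php
<?php

// Toy model only: types are represented by their textual descriptions.
// The real helper presumably works on PHPStan type objects instead.
function negateSubjectTypeSketch(string $matchedType): ?string
{
    // The only refinement whose complement within string is a single type.
    if ($matchedType === 'decimal-int-string') {
        return 'non-decimal-int-string';
    }

    // Anything else has no representable complement: signal "no narrowing".
    return null;
}
```

Returning null for every other input keeps the falsey branch conservative: a subject is only narrowed when the complement is known to be exact.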

Root cause

The gap lies in the branch direction of preg_match subject narrowing (truthy vs. falsey): the extension handled only the truthy branch, so the falsey branch never narrowed the subject. In general, the complement of an accessory string refinement within string is not representable as a single type (e.g. when the matching branch narrows to non-empty-string, the non-matching branch is '' | (non-empty strings failing the pattern)). The one exception is decimal-int-string and non-decimal-int-string, which partition all strings, so each is the other's complement. That single representable pair is what gets narrowed; all other refinements are deliberately left as string in the falsey branch.
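
The non-representable case can be observed at runtime. The sketch below assumes the anchored pattern /^\S+$/ from the negative controls; `failingSubjects` and the sample strings are hypothetical and chosen only for illustration.

```php
<?php

// Returns the candidates that do NOT match the pattern, i.e. the runtime
// values that can reach the falsey branch of preg_match().
function failingSubjects(string $pattern, array $candidates): array
{
    return array_values(array_filter(
        $candidates,
        static fn (string $s): bool => preg_match($pattern, $s) !== 1
    ));
}

// Both '' and the non-empty 'a b' fail the pattern, so the falsey branch
// cannot be described as "empty string" or as any single string refinement.
$failing = failingSubjects('/^\S+$/', ['', 'abc', 'a b']);
```

Because the failing set mixes the empty string with non-empty strings, leaving the subject as plain string is the only sound choice for this pattern.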

Test

tests/PHPStan/Analyser/nsrt/bug-14766.php covers:

  • the reported case (/^-?[0-9]+$/ → else branch is non-decimal-int-string);
  • the negated condition !preg_match(...) and an early-return form;
  • a combined subject + $matches case (subject narrows to non-decimal-int-string, $matches to array{});
  • two negative controls confirming non-empty-string (unanchored /[0-9]/, anchored /^\S+$/) subjects are not narrowed in the else branch, since their complement is not representable.
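
The negated, early-return, and $matches cases might look roughly like this. It is a sketch rather than the contents of the regression test; `parseDecimal` is a hypothetical name, and the type comments restate the expectations listed above.

```php
<?php

// Hypothetical function; the type comments restate the PR's test
// description and are not verified PHPStan output.
function parseDecimal(string $s): ?int
{
    if (!preg_match('/^-?[0-9]+$/', $s, $matches)) {
        // Falsey branch reached through negation: $s is expected to be
        // non-decimal-int-string and $matches to be array{}.
        return null;
    }

    // Past the early return, $s is expected to be decimal-int-string.
    return (int) $matches[0];
}
```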

Probed analogous constructs that were already correct and intentionally left unchanged: preg_match_all (shares the same subject-narrowing path), and TypeCombinator::remove/StringType::tryRemove (which, consistent with the existing non-empty handling, do not simplify accessory-intersection complements).

Fixes phpstan/phpstan#14766

…branch of `preg_match`

- PregMatchTypeSpecifyingExtension now narrows the subject expression in the falsey branch too, not only the truthy branch.
- When the matching branch narrows the subject to `decimal-int-string` (anchored digit patterns like `/^-?[0-9]+$/`), the non-matching branch is narrowed to its complement `non-decimal-int-string` via a new negateSubjectType() helper.
- Only the decimal-int-string ↔ non-decimal-int-string pair has a complement representable within `string`, so other subject refinements (non-empty-string, non-falsy-string, …) are left untouched in the falsey branch, matching the existing conservative behavior.

Development

Successfully merging this pull request may close these issues.

Support non-decimal-int-string in regex matching
