fix: make GitHub PR-description ticket selection deterministic #2422

Open
dellch wants to merge 2 commits into The-PR-Agent:main from dellch:fix/deterministic-ticket-selection

@dellch dellch commented Jun 2, 2026

Fixes #2421

Summary

extract_ticket_links_from_pr_description() in pr_agent/tools/ticket_pr_compliance_check.py accumulates GitHub issue URLs in a set, then enforces the 3-ticket cap by slicing a list built from that set:

github_tickets = set()
...
if len(github_tickets) > 3:
    github_tickets = set(list(github_tickets)[:3])
...
return list(github_tickets)

Because a set has no defined iteration order (and Python randomizes string hashing per process via PYTHONHASHSEED), list(github_tickets)[:3] selects an arbitrary 3 of the referenced issues, and the selection can vary between runs. When a PR description references more than 3 issues, which issues get fetched for review context is therefore nondeterministic.
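The seed dependence described above can be observed with a small stand-alone script. This is only a sketch: the URLs are made-up placeholders, and `capped_subset_for_seed` and `distinct_subsets` are hypothetical helper names, not part of pr_agent. Each child interpreter is started with a fixed PYTHONHASHSEED and prints the three URLs that the set-then-slice pattern would keep.

```python
import os
import subprocess
import sys

# Code run in each child interpreter. The issue URLs are placeholders.
SNIPPET = (
    "urls = [f'https://github.com/acme/widgets/issues/{n}' for n in range(1, 7)];"
    "print(sorted(list(set(urls))[:3]))"
)


def capped_subset_for_seed(seed: int) -> str:
    """Return the printed 3-URL subset chosen by list(set(...))[:3] under one hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def distinct_subsets(seeds) -> set:
    """Collect the distinct subsets observed across the given hash seeds."""
    return {capped_subset_for_seed(seed) for seed in seeds}


if __name__ == "__main__":
    for subset in sorted(distinct_subsets(range(12))):
        print(subset)
```

If the script prints more than one line, the capped selection depends on the hash seed, which is the behaviour this PR removes.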

Fix

Track URLs in first-seen order while de-duplicating (a seen set plus an ordered list), then apply the cap by slicing the ordered list — mirroring what find_jira_tickets() in the same file already does. Selection is now stable and predictable. Behaviour is unchanged for 3 or fewer issues.
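A minimal sketch of the seen-set-plus-ordered-list pattern described above. The function name `dedupe_first_seen` and the `limit` parameter are illustrative assumptions; the PR itself wraps this logic in an `_add()` helper inside `extract_ticket_links_from_pr_description()`.

```python
MAX_TICKETS = 3  # assumed to match the 3-ticket cap described above


def dedupe_first_seen(urls, limit=MAX_TICKETS):
    """Return the first `limit` unique URLs, in the order they were first seen."""
    seen = set()   # fast membership checks
    ordered = []   # preserves first-seen order
    for url in urls:
        if url not in seen:
            seen.add(url)
            ordered.append(url)
    # Slicing an ordered list always keeps the same leading elements.
    return ordered[:limit]


print(dedupe_first_seen(["a", "b", "a", "c", "d"]))  # ['a', 'b', 'c']
```

The set is kept only for de-duplication; ordering and the cap both come from the list, so the result no longer depends on hash order.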

Tests

Structured as two commits — failing tests first, then the fix:

  1. test: adds TestExtractTicketLinksFromPrDescription with two tests.
  2. fix: makes them pass.

test_cap_selects_deterministic_first_seen_subset is the reliable regression guard: with more than 3 issues, the old set-based code returns an arbitrary subset that does not equal the first-seen subset. I verified it fails against the previous implementation on every hash seed PYTHONHASHSEED 0–11, and passes with the fix on all of them. (test_preserves_first_seen_order documents the intended ordering; its docstring notes it is not a reliable seed-independent guard on its own, which is why the cap test exists.)
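The shape of the two tests can be sketched against a stand-in extractor. Everything here is an assumption for illustration: `extract_stand_in`, `make_description`, `url`, the regular expression, and the placeholder repository are not taken from the PR, and the stand-in de-duplicates with `dict.fromkeys` (which preserves insertion order) rather than the PR's `_add()` helper.

```python
import re

MAX_TICKETS = 3  # assumed to mirror the function's hardcoded cap
ISSUE_URL = re.compile(r"https://github\.com/[\w.-]+/[\w.-]+/issues/\d+")


def url(number):
    """Build a placeholder issue URL."""
    return f"https://github.com/acme/widgets/issues/{number}"


def make_description(numbers):
    """Build a fake PR description that references the given issues in order."""
    return "\n".join(f"Fixes {url(n)}" for n in numbers)


def extract_stand_in(description):
    """Stand-in extractor: ordered de-duplication, then cap."""
    return list(dict.fromkeys(ISSUE_URL.findall(description)))[:MAX_TICKETS]


def test_preserves_first_seen_order():
    description = make_description([9, 2, 9, 5])
    assert extract_stand_in(description) == [url(9), url(2), url(5)]


def test_cap_selects_deterministic_first_seen_subset():
    # More issues than the cap: only the first three seen should survive.
    description = make_description([8, 3, 6, 1, 7])
    assert extract_stand_in(description) == [url(8), url(3), url(6)]
```

The cap test compares against an exact expected list, so an implementation that picks any other 3 of the 5 issues fails it regardless of how those 3 are ordered.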

Notes

This was originally surfaced by the Qodo review bot on a separate Jira-support PR, but the function predates that work and is unrelated to Jira, so it is raised here on its own.

Test plan

  • pytest tests/unittest/test_extract_issue_from_branch.py
  • New tests fail on the pre-fix code (verified across PYTHONHASHSEED 0–11) and pass with the fix

Co-Authored-By: Claude

…tion

extract_ticket_links_from_pr_description() accumulates issue URLs in a set and
caps via list(set)[:3], so when a PR description references more than 3 issues
the selected subset is arbitrary and varies between runs.

These tests assert first-seen ordering and a deterministic capped subset. The
cap test fails against the current set-based implementation on every hash seed
(verified across PYTHONHASHSEED 0-11); the fix follows in the next commit.

Co-Authored-By: Claude
@github-actions github-actions Bot added the bug label Jun 2, 2026
Review Summary by Qodo

Make GitHub PR-description ticket selection deterministic

🐞 Bug fix 🧪 Tests

Walkthroughs

Description
• Fix nondeterministic ticket selection in PR description parsing
  - Replace set-based accumulation with ordered list tracking
  - Preserve first-seen order while de-duplicating URLs
  - Cap selection now returns deterministic first 3 tickets
• Add comprehensive test suite for ticket extraction
  - Test first-seen ordering preservation
  - Test deterministic capped subset selection (reliable regression guard)
Diagram
flowchart LR
  A["PR Description"] -->|extract URLs| B["Ordered List + Seen Set"]
  B -->|deduplicate| C["Preserve First-Seen Order"]
  C -->|cap at 3| D["Deterministic Subset"]
  E["Tests"] -->|verify| D


File Changes

1. pr_agent/tools/ticket_pr_compliance_check.py 🐞 Bug fix +15/-6

Replace set-based ticket tracking with ordered list

• Replace set() accumulation with ordered list and seen set for deterministic ordering
• Add _add() helper function to track URLs in first-seen order while de-duplicating
• Change cap logic from set(list(github_tickets)[:3]) to github_tickets[:3]
• Return ordered list directly instead of converting from set


2. tests/unittest/test_extract_issue_from_branch.py 🧪 Tests +40/-1

Add tests for deterministic PR description ticket selection

• Import extract_ticket_links_from_pr_description function for testing
• Add MAX_TICKETS constant (3) matching function's hardcoded cap
• Add TestExtractTicketLinksFromPrDescription test class with two tests
• Test test_preserves_first_seen_order validates ordering and deduplication
• Test test_cap_selects_deterministic_first_seen_subset is reliable regression guard


qodo-free-for-open-source-projects Bot commented Jun 2, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 🔗 Cross-repo conflicts (0)

Remediation recommended

1. Single-quoted f-strings added ✓ Resolved 📘 Rule violation ⚙ Maintainability
Description
The updated GitHub ticket URL construction uses single-quoted f-strings, which violates the
repository’s Python string-quoting convention. This can cause avoidable lint/format churn and
inconsistent style across the codebase.
Code

pr_agent/tools/ticket_pr_compliance_check.py[R61-65]

Evidence
PR Compliance ID 11 requires following repository Python formatting conventions, including
preferring double quotes. The modified lines construct URLs using single-quoted f-strings (e.g.,
f'{base_url_html.strip("/")}/...').

AGENTS.md: Follow Repository Python Formatting and Import Ordering (Ruff/isort, 120-char lines, double quotes)
pr_agent/tools/ticket_pr_compliance_check.py[61-65]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
New/modified URL f-strings use single quotes, but the repository style requires preferring double quotes for Python string literals.
## Issue Context
These lines were modified in this PR, so they should be brought into compliance with the repo’s Ruff/style expectations to avoid lint failures and style drift.
## Fix Focus Areas
- pr_agent/tools/ticket_pr_compliance_check.py[61-65]


Previous review results

Review updated until commit 03979aa

Results up to commit 8e08857


🐞 Bugs (0) 📘 Rule violations (1) 📎 Requirement gaps (0) 🔗 Cross-repo conflicts (0)


Track issue URLs in first-seen order while de-duplicating (a seen set plus an
ordered list), then apply the 3-ticket cap by slicing the ordered list, instead
of accumulating in a set and slicing list(set)[:3]. Selection is now stable and
predictable across runs. Behaviour is unchanged for 3 or fewer issues.

Makes the tests added in the previous commit pass.

Co-Authored-By: Claude
@dellch dellch force-pushed the fix/deterministic-ticket-selection branch from 8e08857 to 03979aa Compare June 2, 2026 22:43
qodo-free-for-open-source-projects Bot commented Jun 2, 2026

Code review by qodo was updated up to the latest commit 03979aa

dellch commented Jun 2, 2026

Good catch — fixed in 03979aa.

To clarify the origin: those two single-quoted f-strings are verbatim from the existing code on main (the diff only wraps them in the new _add(...) helper, it did not introduce the quoting). But since this PR is already modifying those exact lines, and the rest of this file uses double-quoted f-strings (11 double vs the 2 single ones here — including the identical URL construction a few lines down), I normalized them to double quotes to match the local convention and remove the inconsistency:

_add(f"{base_url_html.strip('/')}/{owner}/{repo}/issues/{issue_number}")

(inner quotes flipped to single so they do not clash with the outer double quotes). Tests still pass.
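For reference, a small sketch of the quoting point. The values of `base_url_html`, `owner`, `repo`, and `issue_number` are placeholders chosen for illustration. Before Python 3.12, an f-string cannot reuse its own outer quote character inside a replacement field, so a double-quoted f-string needs single quotes around the inner `"/"`; `chr(47)` is an equivalent spelling that avoids nested quotes entirely.

```python
# Placeholder values, for illustration only.
base_url_html = "https://github.com/"
owner, repo, issue_number = "acme", "widgets", 7

# Outer double quotes, inner single quotes.
with_single_inner = f"{base_url_html.strip('/')}/{owner}/{repo}/issues/{issue_number}"

# chr(47) is "/", so no inner quotes are needed at all.
with_chr = f"{base_url_html.strip(chr(47))}/{owner}/{repo}/issues/{issue_number}"

print(with_single_inner)  # https://github.com/acme/widgets/issues/7
print(with_single_inner == with_chr)  # True
```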

Development

Successfully merging this pull request may close these issues.

Nondeterministic ticket selection when a PR description references more than 3 issues
