Vendor resolvelib 1.1.0 #13001

Open · notatallshaw wants to merge 7 commits into main from vendor-resolvelib-1.1.0

Conversation

notatallshaw (Member) commented Oct 11, 2024

Fixes #12768
Fixes #12317
Fixes #12305
Fixes #12497
Fixes #12754 (comment)

Draft PR vendoring the resolvelib 1.1.0 beta to see if it passes tests. I will mark it as ready and ask for review/merge once the final release is out.

notatallshaw (Member, Author) commented Oct 18, 2024

So I checked how resolution performed using https://github.com/notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks, and I have good news and bad news.

Firstly, as expected, due to the correctness fixes, #12768 and #12317 now resolve instead of showing ResolutionImpossible.

Some unexpected good news: this also resolves #12305.

Some slightly bad news is that this turns #12990 from a Build Failure into a ResolutionTooDeep, but that can be fixed if #13017 lands.

Some worse news is that the requirements in astral-sh/uv#1398 go from resolving (after some time) to ResolutionTooDeep, and I don't have an immediate optimization to fix this. It will require deeper investigation, but I don't think that should stop pip from vendoring resolvelib, as it fixes correctness errors and other cases that previously produced ResolutionTooDeep.

notatallshaw (Member, Author) commented:

Ready for review / merge once 24.3 is closed.

notatallshaw (Member, Author) commented:

It turns out I accidentally broke depth by changing some resolvelib behavior: pdm-project/pdm#3235 (comment)

I am going to investigate further to see if this is causing the performance regression I see in astral-sh/uv#1398. If so, I'll make a PR on resolvelib; if removing depth doesn't make any difference to performance, it might be worth dropping it as a preference like PDM has.

notatallshaw (Member, Author) commented Nov 10, 2024

Okay, I've tried both fixing depth and removing depth, and I ran the scenarios in https://github.com/notatallshaw/Pip-Resolution-Scenarios-and-Benchmarks to see what the difference is (the left-hand side is fixing depth, the right-hand side is removing it):

uv run compare.py --github-repo-1 "notatallshaw/pip" --git-commit-1 3334e33066c30bef609d54b4d888030ad5dbef61 --github-repo-2 "notatallshaw/pip" --git-commit-2  03d9fd86ab28e9b61066d0e5b1395ddc842b29f6
Reading inline script metadata from `compare.py`
Installed 8 packages in 10ms
Difference for scenario scenarios/problematic.toml - autogluon:
    	Number of requirements processed: 197 -> 177
    	Number of packages processed: 261 -> 241

Difference for scenario scenarios/problematic.toml - TTS:
    	Success: False -> True.
    	Failure Reason: Resolution Too Deep -> None.
    	Number of requirements processed: 143 -> 156
    	Number of packages processed: 357 -> 222

Difference for scenario scenarios/problematic.toml - boto3-urllib3-transient:
    	Number of requirements processed: 429 -> 141
    	Number of packages processed: 811 -> 2675  (Both ended up with ResolutionTooDeep)

Difference for scenario scenarios/problematic.toml - apache-airflow-beam:
    	Number of packages processed: 312 -> 161

Difference for scenario scenarios/problematic.toml - kedro-test:
    	Number of requirements processed: 335 -> 337
    	Number of packages processed: 522 -> 565

Difference for scenario scenarios/problematic.toml - sphinx:
    	Number of requirements processed: 71 -> 79
    	Number of packages processed: 166 -> 199

Difference for scenario scenarios/problematic.toml - backtracks-to-old-scipy:
    	Number of requirements processed: 74 -> 86
    	Number of packages processed: 111 -> 129 (Both end in build failure)

Difference for scenario scenarios/problematic.toml - django-stubs:
    	Number of packages processed: 36 -> 25

Difference for scenario scenarios/big-packages.toml - apache-airflow-all:
    	Not the same install files.
    	Number of requirements processed: 601 -> 597
    	Number of packages processed: 733 -> 696

Difference for scenario scenarios/big-packages.toml - acryl-datahub-all:
    	Number of requirements processed: 347 -> 334
    	Number of packages processed: 711 -> 504

There are some interesting takeaways:

  1. Using depth as a preference causes the TTS scenario to result in ResolutionTooDeep; vendoring resolvelib 1.1.0 makes this scenario resolve because it breaks the depth preference.
  2. Some scenarios resolve faster with depth removed as a preference and some are faster with it kept; setting aside the ResolutionTooDeep cases, removing it seems better in general.
  3. In cases where both hit ResolutionTooDeep (e.g. boto3-urllib3-transient), the number of packages processed (and hence the wall clock time) can end up being much larger with depth removed, because the resolver goes deep on one package (in this case botocore, because it has thousands of releases).
  4. Of the resolutions that succeeded, only the apache-airflow-all scenario produced a different installation result. Without going into too much detail, that is because Apache Airflow doesn't put lower bounds on their provider packages (apache/airflow#39100), which makes their resolution search space massive.

My conclusions from this are:

  1. On the whole, removing depth seems to be a net positive, probably because resolvelib now backjumps and skips unnecessary backtracking state, so it's worth removing it like PDM already has (a simplified sketch of a depth-free preference follows below)
  2. It may be worth reducing the maximum number of backtracking steps before hitting ResolutionTooDeep
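
For context, here is a simplified sketch of what a depth-free preference can look like in a resolvelib provider. This is not pip's actual PipProvider; the class, the remaining preference criteria, and the `.name` attribute on requirements are illustrative assumptions, and the other provider methods are omitted.

```python
from resolvelib.providers import AbstractProvider


class SketchProvider(AbstractProvider):
    """Illustrative only; not pip's PipProvider (other methods omitted)."""

    def get_preference(
        self, identifier, resolutions, candidates, information, backtrack_causes
    ):
        # resolvelib picks the identifier with the *smallest* preference next.
        # Prefer requirements involved in the current conflict, then those with
        # fewer remaining candidates. There is deliberately no "depth" term, so
        # ordering no longer depends on how far a requirement sits from the
        # user-requested roots.
        involved_in_conflict = any(
            identifier == getattr(cause.requirement, "name", cause.requirement)
            for cause in backtrack_causes
        )
        remaining = sum(1 for _ in candidates[identifier])
        return (not involved_in_conflict, remaining)
```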

notatallshaw force-pushed the vendor-resolvelib-1.1.0 branch from d309e2c to 3a157bb on November 10, 2024
notatallshaw marked this pull request as ready for review on November 10, 2024
notatallshaw (Member, Author) commented Nov 10, 2024

This is ready for review now.

I have removed depth as a preference, which unfortunately means removing the only unit tests that exercised the PipProvider (though it is covered extensively by the functional tests). I will add additional unit tests for get_preference, and friends, in a follow-up PR (which will replace #13017) once this lands.

ichard26 requested a review from uranusjr on December 9, 2024
notatallshaw (Member, Author) commented Dec 14, 2024

I'm marking this back to draft while we discuss performance issues with resolvelib 1.1; see the discussion here: sarugaku/resolvelib#180

At some point we will have to decide whether to vendor this version of resolvelib (which includes an important correctness fix and some performance improvements, but also some big performance regressions), or to wait for a resolvelib 1.2 that might regain some of that performance in very deep backtracking situations (e.g. boto3 / urllib3).

I would very much like input from other maintainers on that, both here and over on the resolvelib performance issue.

ichard26 mentioned this pull request on December 16, 2024
ichard26 (Member) commented:

@notatallshaw I'm generally in favour of correctness improvements, as AFAIU the user usually has workarounds to deal with excessive backtracking / ResolutionImpossible, i.e. restricting the version ranges pip considers. In contrast, issues like #12317 cause pip to have straight-up broken behaviour: a valid solution exists (and within the backtrack limit), but pip skips over it. The last time I had to debug this with a colleague, it was quite annoying and frustrating.

Also, there are additional optimizations in the works, right? I see that you have a draft optimistic backjumping PR on resolvelib's side, and then there's #12317 (prefer direct conflicts on backtrack) on pip's end.

How common are the known scenarios where this PR is a net negative? As long as they're not common, I'd say the small amount of fallout doesn't outweigh the correctness improvements.

cc @pfmoore @sbidoul

sbidoul added the C: dependency resolution label on January 12, 2025
pfmoore (Member) commented Jan 12, 2025

I agree with @ichard26 - we should prioritise correctness over performance. Claiming there is no solution when there is one is a bug, and we should prioritise fixing that even if it requires a slower approach.

Pip's focus is on standards compliance and correctness, with performance being secondary (although clearly important as long as the first two criteria don't suffer). If users need performance and are willing to compromise on the other factors, then they are likely to be choosing uv anyway (and even if they prefer to use pip in general, "use uv" is a reasonable workaround for cases where pip's performance is an issue and strict correctness doesn't matter).

notatallshaw (Member, Author) commented Jan 12, 2025

Thanks to you both for your feedback. The problem here is that the performance regression means users will see a ResolutionTooDeep: there are definitely real cases involving boto3 and urllib3 as dependencies where this will happen with resolvelib 1.1.0 but did not happen with the previous version. So if this is merged we are likely to see additional users complain.

If other maintainers are comfortable with that, knowing we are achieving correctness as best we can at the cost of some users hitting resolution problems for now, then I am going to mark this as ready for review. If we can merge soon, I can create another PR to add a few simple performance improvements with the new API before pip 25.0.

I took a look at implementing a fallback inside resolvelib, but found it would cause at least one known ResolutionTooDeep that resolvelib 1.1.0 doesn't produce 🙁. So I'm still on the fence about that.

In a few weeks I'm going to take a deeper look at whether we can have our cake and eat it too, i.e. introduce this optimization while remaining formally correct. My conclusion may end up being that the resolvelib API needs to change in non-trivial ways, or that we need to consider writing a new resolver, one that learns the lessons from resolvelib, uv, and Poetry.

notatallshaw marked this pull request as ready for review on January 12, 2025
ichard26 (Member) left a review comment:

Here's to another day where I am annoyed that botocore has thousands of releases... even if it keeps us honest.

@@ -0,0 +1 @@
Upgrade resolvelib to 1.1.0
A reviewer (Member) commented on the news entry:

We should probably call out that this improves correctness, but may introduce worse performance in certain situations. I'm being hand-wavy with my wording as I assume you can come up with something better.

notatallshaw (Member, Author) replied Jan 12, 2025:

Updated the news item. Separately, I think I will make ResolutionTooDeep a diagnostic exception and point users to a specific pip issue where I can collect scenarios where this happens.
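
As an aside, a minimal sketch of what such a diagnostic ResolutionTooDeep exception could look like is below. This is not pip's implementation; the class name and wording are made up, and the tracking-issue URL is a placeholder, not a real link.

```python
class ResolutionTooDeepDiagnostic(Exception):
    """Hypothetical user-facing ResolutionTooDeep error (illustrative only)."""

    def __init__(self, round_count: int) -> None:
        self.round_count = round_count
        super().__init__(
            f"pip gave up after {round_count} rounds of dependency resolution.\n"
            "hint: Add version constraints for the packages involved to shrink "
            "the search space.\n"
            "note: Please report the requirements that triggered this at "
            "https://github.com/pypa/pip/issues/<tracking-issue>  (placeholder)"
        )
```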

pfmoore (Member) commented Jan 12, 2025

> Here's to another day where I am annoyed that botocore has thousands of releases... even if it keeps us honest.

I'll be blunt: I think that "thousands of releases" should stress a backtracking resolver (or for that matter any form of resolver). If users are writing requirements that state that any of many thousands of botocore releases are equally valid, then I'd question those requirements. Obviously, no-one is actually going to bother trying to add lower bounds to every botocore requirement out there - and it only takes one unbounded requirement, selected at the wrong point in the process, to blow up. But if we're optimising for the use case of "please pick any one of these 3,000 candidate versions" then something is seriously skewed[1].

Maybe what we should have is some sort of limit in the finder: by default, only consider the latest 100[2] versions of a project. Then, people using a pathological project like botocore would have to explicitly opt in with --max-candidates=2000. This would cause some users pain, I'm sure, but it would put the focus on the actual cause of the problem, which is projects that publish unreasonable numbers of releases.

Footnotes

  1. And IMO, what's skewed is botocore's release policy, not our resolver 🙁

  2. I picked that number out of the air - if I were doing this for real, I'd do some analysis before choosing the default.
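
To make the idea above concrete, here is a hypothetical sketch of capping how many versions the finder hands to the resolver. The --max-candidates flag does not exist in pip; the helper name and the candidate objects (with a sortable `.version` attribute) are illustrative assumptions.

```python
from typing import Iterable, List


def cap_candidates(candidates: Iterable, max_candidates: int = 100) -> List:
    """Keep only the newest `max_candidates` versions (hypothetical helper).

    Assumes each candidate has a sortable `.version` attribute.
    """
    newest_first = sorted(candidates, key=lambda c: c.version, reverse=True)
    return newest_first[:max_candidates]
```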

notatallshaw force-pushed the vendor-resolvelib-1.1.0 branch from bd61563 to 96ef7e2 on January 12, 2025
notatallshaw force-pushed the vendor-resolvelib-1.1.0 branch from 96ef7e2 to aaa3ce7 on January 12, 2025
notatallshaw (Member, Author) commented Jan 12, 2025

> I'll be blunt: I think that "thousands of releases" should stress a backtracking resolver (or for that matter any form of resolver). If users are writing requirements that state that any of many thousands of botocore releases are equally valid, then I'd question those requirements. Obviously, no-one is actually going to bother trying to add lower bounds to every botocore requirement out there - and it only takes one unbounded requirement, selected at the wrong point in the process, to blow up. But if we're optimising for the use case of "please pick any one of these 3,000 candidate versions" then something is seriously skewed

Setting aside the problem of thousands of releases, these complex dependency graphs naturally arise from how Python packaging handles dependencies. Python package dependency resolution is an NP-hard problem; other languages (e.g. Rust and NodeJS) sidestep this by allowing multiple versions of the same library in a solution. Even without boto3 there are many other examples where resolution is very hard; boto3 just sticks out more obviously because it can cause a huge number of downloads.

> Maybe what we should have is some sort of limit in the finder: by default, only consider the latest 100 versions of a project. Then, people using a pathological project like botocore would have to explicitly opt in with --max-candidates=2000. This would cause some users pain, I'm sure, but it would put the focus on the actual cause of the problem, which is projects that publish unreasonable numbers of releases.

I'm dubious, but would be happy to discuss elsewhere; I think this would break resolutions in a non-obvious way. Let's say you have some package foo that has versions 1 to 10,000, and our limit is 1,000, so only foo 9,001 through 10,000 are considered. If the user depends on A and B, the latest version of A depends on foo > 5000, and the latest version of B depends on foo < 8000, then the user would get a ResolutionImpossible, but only after backtracking through every version of B, even though a real solution (foo between 5,001 and 7,999) exists. And if an old version of B didn't put an upper bound on foo but was functionally incompatible, the user would instead find a solution, but one in which B was functionally broken. (A toy illustration follows below.)

And the problem is that the search space can grow exponentially along dimensions other than the number of candidates, e.g. the number of requirements. So this doesn't work when the number of requirements is what's causing the issue; in that case you might need a very small max-candidates value (e.g. 10) to avoid a ResolutionTooDeep, and you might just get a ResolutionImpossible instead and have to do significant analysis to understand why.
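
A toy illustration of the scenario above (hypothetical package foo with versions 1 to 10,000 and a cap of 1,000 candidates):

```python
foo_versions = set(range(1, 10_001))                     # foo has versions 1..10000
capped = set(sorted(foo_versions, reverse=True)[:1000])  # the cap keeps only 9001..10000

satisfies_a = {v for v in foo_versions if v > 5000}      # latest A needs foo > 5000
satisfies_b = {v for v in foo_versions if v < 8000}      # latest B needs foo < 8000

print(bool(satisfies_a & satisfies_b))           # True: versions 5001..7999 satisfy both
print(bool(satisfies_a & satisfies_b & capped))  # False: the cap hides every valid version
```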

pfmoore (Member) commented Jan 12, 2025

> Setting aside the problem of thousands of releases, these complex dependency graphs naturally arise from how Python packaging handles dependencies.

Agreed 100%. I'm not trying to minimise the complexity of the problem here, but I think that packages like botocore with thousands of releases are (and should be treated as) very much outliers. I don't think we should compromise correctness, or the performance of simpler cases, just to accommodate such projects.

> I'm dubious, but would be happy to discuss elsewhere; I think this would break resolutions in a non-obvious way.

It's quite possible it would. I haven't done any sort of analysis at this point. I'm happy to drop the idea.
