Infrastructure for Azure pipelines underdimensioned #33980
Comments
We currently trigger 5*8=40 jobs daily, and 3*8=24 of those trigger every 3 hours, while we only have 20 parallel jobs. We don't show the Edge Canary results on wpt.fyi by default, so reduce them to once a week to reduce load. Helps with web-platform-tests#33980.
We have a maximum of 20 parallel jobs on Azure Pipelines, but after #33755 + #33861 we can trigger up to 40 jobs at the same time, each of which is expected to take ~2 hours. I've sent #34015 so that only 16 jobs get triggered every 3 hours, but we will still have a backlog every day, and once a week the same 40 jobs as we trigger currently, leading to delays. @mustjab do you think there's anything we could do about the quota? Or are there other ways to solve this?
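As a rough sanity check of the numbers above, here is a minimal sketch of how long each burst of jobs would take to drain. It assumes every job takes exactly 2 hours and that jobs run in full waves of at most 20; real job durations vary, so treat the output as an estimate only. The function name `drain_hours` is hypothetical and used just for illustration.

```python
import math


def drain_hours(jobs: int, parallel: int = 20, job_hours: float = 2.0) -> float:
    """Estimate hours until a burst of `jobs` finishes.

    Assumption: every job takes `job_hours` and at most `parallel`
    jobs run at once, so the burst completes in whole waves.
    """
    waves = math.ceil(jobs / parallel)
    return waves * job_hours


# 40 = all jobs at once, 24 = old 3-hourly burst, 16 = 3-hourly burst after #34015
for jobs in (40, 24, 16):
    print(f"{jobs} jobs -> about {drain_hours(jobs)} hours to drain")
```

Under these assumptions a burst of 24 jobs needs two waves (about 4 hours), which is longer than the 3-hour trigger interval, so the queue never fully empties. A burst of 16 jobs fits in one wave and finishes before the next trigger.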
I think we can stop Edge Dev runs and just do Canary runs for now. Also, for the weekly run, can we schedule it to run on the weekend, when we have fewer runs? @foolip Do you remember who you worked with to increase the parallel job limit before? I can also try to reach out to them and see if we can increase that a bit more.
@mustjab I don't know for certain, but I think it was @thejohnjansen who asked someone on the Azure Pipelines team to increase the limit internally. The mechanism for doing that wasn't visible to me; I could only see the increased parallelism take effect. Regarding Edge Canary, note that because of web-platform-tests/wpt.fyi#1635 we don't show those runs on wpt.fyi. However, with that issue fixed we could start using the Edge Canary runs instead. In any event, I think we should run either Edge Canary or Edge Dev, not both.
Let's stop Edge Canary runs and keep only Edge Dev channel runs. We can switch these runs to daily instead of every 3 hours to help reduce the load. Does that work?
@mustjab #34015 was merged, which will run Chrome Canary only once a day. We could remove it entirely, if you like. However, running Edge Dev less frequently wouldn't be great because it would mean that the wpt.fyi front page gets new aligned runs less often. And it would take longer to recover from any infra issue on any browser. Also note that peak usage does not decrease at all unless we find a mechanism to spread out runs over time, since currently epochs/three_hourly and epochs/daily are both updated at the same time once a day. Getting backlogged once a day is better than it happening every 3 hours, of course, but it would still affect wpt contributors.
Thanks for merging that. Let's see if that helps with the load. If we still see issues, we can stop these runs until we figure out a way to increase the limit. For the Edge Dev channel, is there a different cadence that we can use other than every 3 hours? Maybe every 6 hours, to reduce the load? That should still keep the wpt.fyi front page results fresh enough.
I fully agree with you. This happened to me on May 10th, 2023 (see #39947) and presumably also in #34926.
I don't know precisely how this is provisioned, but it seems like the infrastructure that runs "Azure Pipelines" in the continuous integration tests is underdimensioned. Once it runs, it's pretty fast, but it can stay queued for extended periods of time.
For instance, #33940 was blocked for about 2 hours waiting for the Azure Pipelines to be run. In the grand scheme of things, 2h may not be that much, but it's very different from 10 minutes, and changes a task that you can do in one sitting into something you have to handle in multiple work sessions, which is unwelcome overhead. If possible, it'd be nice to reduce that delay.
Thanks!