Reduce build and test coverage to cope with Azure limits #1994
On Wed, Jul 05, 2023 at 12:10:00PM -0700, Guillaume Tucker wrote:
As we're moving to the new Azure subscription we're currently limited in terms of build and network bandwidth capacity. As such, reduce builds by disabling allmodconfig and reducing standard tree coverage to the minimal variants. Also reduce test coverage to only have a couple of devices running each test plan to minimise the downloads from storage.
Can we put a CDN in front of storage to mitigate the issue with
downloads from storage? Azure has:
https://azure.microsoft.com/en-gb/products/cdn/
(Probably a good idea in general anyway, from both a load-on-storage and
a cost perspective.)
Azure CDN prices are almost the same as egress prices; they just offer better geographical proximity to users: In my opinion we have the following options to reduce costs: 2) A caching proxy on this Hetzner server, so that any file from storage.* is served only once. That might still reduce bandwidth a lot. Maybe there are more options, I am not sure...
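To make the caching proxy idea above concrete, here is a minimal sketch of a read-through cache that pulls each file from the origin at most once and counts the bytes on each side. Everything in it is an assumption for illustration: `ReadThroughCache`, `fake_origin` and the example path are hypothetical names, the cache lives in memory, and a real proxy would need on-disk storage, eviction and HTTP handling.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Fetch each path from the origin at most once, then serve the local copy."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_bytes = 0  # bytes pulled from the origin (billable egress)
        self.served_bytes = 0  # bytes handed out to clients such as labs

    def get(self, path: str) -> bytes:
        if path not in self._store:
            # Cache miss: this is the only time the origin is contacted.
            data = self._fetch(path)
            self._store[path] = data
            self.origin_bytes += len(data)
        data = self._store[path]
        self.served_bytes += len(data)
        return data


def fake_origin(path: str) -> bytes:
    # Hypothetical stand-in for a download from storage.*
    return b"x" * 1000


if __name__ == "__main__":
    cache = ReadThroughCache(fake_origin)
    for _ in range(5):  # e.g. five devices fetching the same kernel image
        cache.get("example/bzImage")
    print(cache.origin_bytes, cache.served_bytes)
```

The gap between `served_bytes` and `origin_bytes` is the egress such a proxy would save, which only helps when several devices download the same artifacts.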
I really think this is an interim solution to avoid hitting some limits inadvertently, and tbh the kernel builds are probably costing us much more than the binary downloads. But we first need to bring the costs down to the bare minimum to ensure continuity of the main services, and there's no doubt we'll find a solution in the coming weeks to bring things back to normal. Making efficient use of the available resources is important in any case: aside from how much things cost, it will lead to better performance overall. Let's see how things go with the changes in this PR with the full linux-next build on staging this weekend. We'll probably already be able to find a balance next week between the full set we had and this one, even without changing how the infrastructure is set up. Then if things go well we'll end up with a more efficient config as well as a sustainable long-term amount of resources.
Actually, Azure (like other cloud services) is not feasible, even for the transfer volumes we have, unless you are a very rich corporation or a startup with a lot of funding. I believe optimizing storage and egress is required for the long term.
Ah, that's a shame with the cache service. For AWS, CloudFront in front of an S3 bucket is actually super cost effective: I'm not paying any bandwidth costs for the binaries I serve up to my lab for my CI (uploads to S3 are free, transfers from S3 to CloudFront are free, and even when I get out of the free tier on the CDN it's their cheapest bandwidth IIRC).
I am not sure of the exact numbers for our egress, because for example we have transfers of sources to GKS nodes (classified as egress), serving kernels to labs, etc. But looking just at the numbers on the production instance's eth0, it might be quite significant traffic: 5-8TB/week.
Vultr:
(And they have a pretty significant free tier.) But even with an external server generating content, things might be complicated and egress charges will be significant. For example, our build K8S clusters will upload binaries elsewhere, and this is still egress.
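For a rough sense of scale, the 5-8TB/week figure can be converted into a monthly volume and a cost. This is only a sketch: the per-GB price is a caller-supplied parameter, `EXAMPLE_PRICE_PER_GB` is a made-up placeholder rather than a quote from Azure or any other provider, and the function names are hypothetical.

```python
WEEKS_PER_MONTH = 52 / 12  # average number of weeks in a month
GB_PER_TB = 1000           # decimal units, as providers usually bill


def monthly_egress_tb(weekly_tb: float) -> float:
    """Convert a weekly egress volume in TB to an average monthly volume."""
    return weekly_tb * WEEKS_PER_MONTH


def monthly_egress_cost(weekly_tb: float, price_per_gb: float) -> float:
    """Estimate the monthly cost for a flat per-GB egress price."""
    return monthly_egress_tb(weekly_tb) * GB_PER_TB * price_per_gb


if __name__ == "__main__":
    EXAMPLE_PRICE_PER_GB = 0.05  # placeholder only, NOT a real provider price
    for weekly_tb in (5, 8):  # the range observed on eth0
        volume = monthly_egress_tb(weekly_tb)
        cost = monthly_egress_cost(weekly_tb, EXAMPLE_PRICE_PER_GB)
        print(f"{weekly_tb} TB/week = {volume:.1f} TB/month, cost {cost:.2f}")
```

Real pricing is usually tiered and often includes a free allowance, so a flat rate like this only gives an order of magnitude.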
This PR is now producing the kind of discussions we need for kernelci/kernelci-api#9 ;)
Right, a CDN is definitely not an ideal solution once you get too far over the free tier - I was just thinking that they're really simple and non-invasive to enable so if the pricing worked out with Azure it might've helped mitigate things with little effort. It seems like there's not enough of a free/cheap tier with them to be relevant for us sadly :(
While we're transitioning the Azure resources to a new subscription, we need to drastically reduce the build load in order to keep the costs under control. This is meant to be a temporary measure, although some trees might need to stay on minimal variants permanently as a general optimisation effort.

Signed-off-by: Guillaume Tucker <[email protected]>
While we're transitioning to the new Azure subscription, reduce the test coverage to the bare minimum to minimise bandwidth usage in downloads. This is meant to be a short-term interim measure to keep the costs under control until we have a new sustainable solution.

Signed-off-by: Guillaume Tucker <[email protected]>
5116230 to d90acbc
There were still a couple of allmodconfig builds left as an oversight; that's fixed now. Otherwise the linux-next results from the staging weekend run are available here:
Nobody has replied to the email thread or mentioned any blocking issue with this PR so it looks like it's all ready to go for today's production update.