#3122 Consider changing maximum size for network install images (also affects Workstation)
Closed: Accepted a year ago by zbyszek. Opened a year ago by adamwill.

For F40 currently, the network install images (Everything and Server - the two are always the same size) and Workstation live image are oversize (the current size target is 700M, the size of a CD).

As described here, a lot of the growth is caused by linux-firmware getting bigger. This seems like it is going to be an ongoing issue: graphics card and wireless adapter firmwares seem to be following a pattern where there are a lot of fairly large images and they just keep adding more. I think we can expect linux-firmware to just keep increasing in size as time goes on, without us really being able to do much more to cut bits of it off the images; we need wireless and graphics card firmwares for the installer / live environments. Tagging @pbrobinson to confirm or deny this.

From my investigation it seems like we kind of duplicate kernel, driver and firmware files on the network installer images. The ISOs contain an images/install.img and also an images/pxeboot/initrd.img, and both of these contain lib/modules, lib/firmware and kernel files. I don't know if it's possible to de-duplicate these somehow, but if it is, that could provide some temporary relief for a while. Maybe @bcl would know something about that.
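
One way to put a number on that overlap would be to unpack both images into directories and compare file contents by hash. The sketch below assumes the unpacking has already been done; the function names and directory arguments are hypothetical, and it only reports how many bytes appear in both trees, not whether removing them would be safe.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def index_tree(root, subdirs=("lib/modules", "lib/firmware")):
    """Map content digest -> size for regular files under the given subdirs."""
    index = {}
    for sub in subdirs:
        for dirpath, _dirs, files in os.walk(os.path.join(root, sub)):
            for name in files:
                path = os.path.join(dirpath, name)
                # Skip symlinks and special files; only real file contents count.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                index[file_digest(path)] = os.path.getsize(path)
    return index


def duplicated_bytes(tree_a, tree_b):
    """Total size of file contents present in both trees."""
    a, b = index_tree(tree_a), index_tree(tree_b)
    return sum(size for digest, size in a.items() if digest in b)
```

Calling `duplicated_bytes("/tmp/install_img", "/tmp/initrd_img")` (both paths are placeholders for wherever the two images were extracted) would give an upper bound on what de-duplication could save.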

But beyond that, we may just need to bump these sizes and recognize we'll have to keep increasing the size targets in the future, and be clear the blame lies with the vendors who keep shipping these firmware files (Intel, NVIDIA, AMD principally). We could, I guess, look at building firmware-less images, but then we have more deliverables to test...

Filing with FESCo as it seems like the appropriate entity for the Everything netinst size at least, and Server netinst just goes along with Everything. CCing @catanzaro and @kalev for the Workstation angle.


It looks like the kernel and initramfs under /images/pxeboot are the ones that actually get booted on boot of the network installer image. I don't know if that means we could actually get rid of the kernel, kernel modules and/or firmware files from within the install.img image (which I believe is the actual installer environment we ultimately boot into - the initramfs environment mounts that image and does a switch root, I think).

Forgot to note, another angle on this is that Colin wants to add podman to the installer environment, which adds about another 20M to the network install image sizes.

CCing @catanzaro and @kalev for the Workstation angle.

The purpose of the Workstation image size limits is to make sure the image size is increasing for some good reason rather than just incidentally or accidentally. It's OK to increase the size limit after the increase has been investigated. Firmware is a good enough reason to increase.

You could only get rid of those if they're not loaded after initial boot. I'm fairly sure some of them are, so trying to play whack-a-file with them is just going to make things harder. The files in the initramfs are usually a subset of the ones in the rootfs, but we also try hard to clean those up with removekmod in the runtime-cleanup.tmpl.

As much as I want to keep things minimal, I think we have to admit that there's only so much we can do and allow larger images -- but as slowly as possible :)

I think we should increase the sizes. Do you have some specific numbers in mind?

How long is a piece of string? :) 1G is the next kinda 'round number' and would let the problem slide for a while. The trade-off is always that we generally won't notice images getting bigger till they start exceeding the next limit, so if we make it 1G, we'll get another 300M of 'bloat' before we take another look.

For WS we can bump it to whatever @catanzaro suggests, I guess - two things need updating, https://pagure.io/fedora-pgm/pgm_docs/blob/main/f/releases/modules/ROOT/pages/f40/blocking.adoc (the human-readable values) and https://fedorapeople.org/groups/qa/metadata/relvalsizes.json (the data file relval uses to enforce them). Once we have a decision I can update the latter and send a PR for the former.

Could I suggest an alternative? Rather than deciding on new fixed amounts, perhaps we can just set an "allowed percentage of growth" since the last release, then we only have to bring it up to FESCo if we're outpacing that amount?

Do we have historical data we can use to get a reasonable growth rate to work from?
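
As a rough illustration of the percentage idea, a check could compare the new image size against the previous release's size. This is only a sketch: the function name is made up, and the 5% threshold in the usage note is a placeholder, not a value anyone has proposed.

```python
def exceeds_allowed_growth(previous_size, current_size, allowed_percent):
    """Return True if current_size grew by more than allowed_percent over previous_size."""
    if previous_size <= 0:
        raise ValueError("previous_size must be positive")
    growth_percent = (current_size - previous_size) * 100 / previous_size
    return growth_percent > allowed_percent
```

With a hypothetical previous size of 700000000 bytes and a 5% allowance, a new image of 720000000 bytes would pass (about 2.9% growth), while one of 770000000 bytes (10% growth) would need to come to FESCo.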

Metadata Update from @sgallagh:
- Issue tagged with: stalled

a year ago

1G is the next kinda 'round number' and would let the problem slide for a while.

+1

set an "allowed percentage of growth" since the last release, then we only have to bring it up to FESCo if we're outpacing that amount?

I don't think we have the tooling to do this automatically, so somebody would need to keep updating the value. I think it's easier to set it to some fixed value. We'll have to revisit the topic again in a year or two, but that's not so bad.

Agree with that. In general this can't be predicted because technologies are changing all the time.

Thank you for discussing this. We are blocked by this on https://github.com/rhinstaller/anaconda/pull/5285

Could I suggest an alternative? Rather than deciding on new fixed amounts, perhaps we can just set an "allowed percentage of growth" since the last release, then we only have to bring it up to FESCo if we're outpacing that amount?

Do we have historical data we can use to get a reasonable growth rate to work from?

as the maintainer of the tool that checks this, I was under the impression that there would be no math.

(that's a silly gag but also a semi-serious point: that sounds like it would be more complex to handle than just 'it's a flat number that we bump sometimes')

fedfind should be able to get the historical data you desire, but it seems that for some reason even after this got merged the imagelist for archive does not have file sizes so it only has data for releases that are in PDC. Here's that:

Release: 24 Size: 459276288
Release: 25 Size: 500170752
Release: 26 Size: 505413632
Release: 27 Size: 532676608
Release: 28 Size: 611319808
Release: 29 Size: 621805568
Release: 32 Size: 709885952
Release: 33 Size: 719323136
Release: 34 Size: 678428672
Release: 35 Size: 677380096
Release: 36 Size: 702545920
Release: 37 Size: 697413632
Release: 38 Size: 718284800
Release: 39 Size: 719116288

I think for 30 and 31 the metadata wasn't uploaded to PDC for some reason so fedfind doesn't have the size.

that data isn't super useful I guess because we've been making heroic efforts to keep it under 700M since 32.

OK, bit of fiddling later, here we go:

Release: 1 Size:    4818944
Release: 2 Size:    4880384
Release: 3 Size:    5982208
Release: 4 Size:    6825984
Release: 5 Size:    7237632
Release: 6 Size:    8544256
Release: 7 Size:    8161280
Release: 8 Size:    9617408
Release: 9 Size:  120438784
Release: 10 Size: 135981056
Release: 11 Size: 164648960
Release: 12 Size: 179306496
Release: 13 Size: 218103808
Release: 14 Size: 230686720
Release: 15 Size: 208404480
Release: 16 Size: 283115520
Release: 17 Size: 169869312
Release: 18 Size: 308281344
Release: 19 Size: 332398592
Release: 20 Size: 336592896
Release: 21 Size: 444596224
Release: 22 Size: 469762048
Release: 23 Size: 435159040
Release: 24 Size: 459276288
Release: 25 Size: 500170752
Release: 26 Size: 505413632
Release: 27 Size: 532676608
Release: 28 Size: 611319808
Release: 29 Size: 621805568
Release: 30 Size: 627048448
Release: 31 Size: 681574400
Release: 32 Size: 709885952
Release: 33 Size: 719323136
Release: 34 Size: 678428672
Release: 35 Size: 677380096
Release: 36 Size: 702545920
Release: 37 Size: 697413632
Release: 38 Size: 718284800
Release: 39 Size: 719116288

source:

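import fedfind.release
import requests

# Print the size of the x86_64 network install ISO for each release.
for relnum in range(1, 40):
    try:
        rel = fedfind.release.get_release(relnum)
        # Take the first matching boot ISO (Everything or Server subvariant).
        img = [
            img for img in rel.all_images
            if img["subvariant"] in ("Everything", "Server")
            and img["type"] == "boot"
            and img["format"] == "iso"
            and img["arch"] == "x86_64"
        ][0]
        size = img.get("size")
        if not size:
            # No size in the metadata: fall back to the Content-Length header.
            url = img["direct_url"]
            size = requests.head(url).headers.get("Content-Length")
        if size:
            print(f"Release: {rel.release} Size: {size}")
    except Exception:
        # Skip releases with no metadata or no matching image.
        pass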

the big jump between 8 and 9 is I think related to the change in the old 'stage1/stage2' design - that's when we dropped it, or included stage2 on the ISO, or something.
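
For anyone who wants to turn those numbers into a growth rate, here is a small sketch that computes the release-to-release percentage change. It uses the sizes for releases 31 to 39 copied from the list above; the function name is just for illustration.

```python
# Sizes in bytes for releases 31 to 39, copied from the list above.
SIZES = {
    31: 681574400,
    32: 709885952,
    33: 719323136,
    34: 678428672,
    35: 677380096,
    36: 702545920,
    37: 697413632,
    38: 718284800,
    39: 719116288,
}


def growth_percentages(sizes):
    """Return {release: percent change from the previous listed release}."""
    releases = sorted(sizes)
    return {
        cur: (sizes[cur] - sizes[prev]) * 100 / sizes[prev]
        for prev, cur in zip(releases, releases[1:])
    }


if __name__ == "__main__":
    for release, pct in growth_percentages(SIZES).items():
        print(f"Release: {release} Growth: {pct:+.2f}%")
```

Several of these releases shrank rather than grew (33 to 34, for example), which fits the point that sizes since 32 reflect efforts to stay under the limit more than natural growth.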

fedfind should be able to get the historical data you desire, but it seems that for some reason even after this got merged the imagelist for archive does not have file sizes so it only has data for releases that are in PDC. Here's that:

Infra has a separate copy of the update script: ./files/scripts/update-fullfiletimelist

That likely got out of sync.

so can we make a decision on this ahead of the beta cycle for F40? right now these bugs are still blocking Beta.

If folks just want a recommendation to vote on, I'd say bump the netinst max size to 1G for now. It's simple.

@catanzaro , can I take https://pagure.io/fesco/issue/3122#comment-888126 as permission/approval to bump the Workstation max size to 2.3G for now?

Thanks!

I've filed https://pagure.io/fedora-pgm/pgm_docs/pull-request/64 , which would bump netinsts to 1G and WS live to 2.3G , just to be efficient (if that proposal is approved in this ticket we can just merge the PR immediately). I would then have to update the sizes relval actually uses to run the check to make the bugs go away.

+1, I guess. I still kind of prefer going the route of "Growth over N% since last release is a blocker", but I'm not going to die on that hill.

we can always do that as a follow-up, it just seems like a bigger more complex change and I'd like to get these off the blocker list for F40 (assuming we all think that's the sensible thing to do).

@catanzaro , can I take https://pagure.io/fesco/issue/3122#comment-888126 as permission/approval to bump the Workstation max size to 2.3G for now?

Yes.

We are at (+2, 0, 0). Let's add this to the next agenda and decide something.

Metadata Update from @zbyszek:
- Issue untagged with: stalled
- Issue tagged with: meeting

a year ago

So, sorry, there's one more case to consider. I went back through the bugs and realized one isn't the same as the others and we're kinda overlooking it: https://bugzilla.redhat.com/show_bug.cgi?id=2247611

That's for the Workstation and Minimal aarch64 disk images. @lbrabec pointed out that the uncompressed Workstation disk image went over 16GB (that's power-of-ten gigabytes) last year. This is a significant threshold, because it's a common USB stick size, and it's the uncompressed image that gets written to disk.

Minimal is around 6.5GB (it's 6GiB), while the max size is specified as 4GB, but I think this is just some kind of mistake, because AFAICT the x86_64 image has never been under 4GB in size (it was introduced at 5GiB size in 2017).

So I think for Minimal we should just bump the specified max size (perhaps to 8GB, the next obvious USB stick size up from 4GB), but for Workstation just bumping the size might not be the best option. As I noted in the bug, we bumped it from 14GiB to 16GiB because of https://fedoraproject.org/wiki/Changes/BiggerESP , and that change mostly got reverted, but I'm worried bloat since then (especially in linux-firmware) might mean that if we drop it all the way back to 14GiB, it might still not build any more. I will ask releng to try once the FAS emergency is dealt with.

Unfortunately, AFAICT, oz only allows specifying image sizes in integer power-of-two GiB (or TiB). So it's possible we'll be stuck in a situation where making the image 16GB or 14.5GiB would work, but oz won't let us do it :/ I hope not, though.
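
To make the units concrete, here is the arithmetic as a short sketch. It relies only on the standard definitions (1 GB = 10^9 bytes, 1 GiB = 2^30 bytes); the helper names are made up for illustration.

```python
GB = 10**9   # power-of-ten gigabyte
GiB = 2**30  # power-of-two gibibyte


def gib_to_gb(gib):
    """Convert a size in GiB to GB."""
    return gib * GiB / GB


def gb_to_gib(gb):
    """Convert a size in GB to GiB."""
    return gb * GB / GiB


if __name__ == "__main__":
    for gib in (14, 16):
        print(f"{gib} GiB = {gib_to_gb(gib):.2f} GB")
    print(f"16 GB = {gb_to_gib(16):.2f} GiB")
```

A 16 GiB image is about 17.18 GB, so it does not fit on a 16 GB stick, while 14 GiB is about 15.03 GB and does. 16 GB itself is about 14.90 GiB, which is not a whole number of GiB.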

okay, we did wind up in exactly that situation, but I fixed oz to handle it. My recommendation for the Workstation disk image is therefore that we merge that oz fix, this accompanying koji fix, deploy those changes to the builders, and update pungi-fedora to specify the image size as "16GB". @kevin did a scratch build that confirms this works.

@adamwill and @kevin, thank you for going through the details and doing builds.
I think we should just do as Adam wrote above.

PROPOSAL: Approve suggestion from Adam Williamson, i.e. wait for https://github.com/clalancette/oz/pull/310 and https://pagure.io/koji/pull-request/3989 to be merged and deployed, and update pungi-fedora to specify the image size as "16GB".

@zbyszek are you also proposing to accept my other suggestions? for clarity, let me restate them as a numbered list. I'll omit the workstation x86_64 live one as we can take that as already approved by @catanzaro .

  1. Bump the max size for Fedora-Everything-netinst-x86_64-RELEASE_MILESTONE.iso and Fedora-Server-netinst-x86_64-RELEASE_MILESTONE.iso to 1 GB (power-of-ten)
  2. Bump the max size for Fedora-Minimal-aarch64-RELEASE_MILESTONE-sda.raw.xz to 8 GB (power-of-ten)
  3. Bump the max size for Fedora-Workstation-aarch64-RELEASE_MILESTONE-sda.raw.xz to 16 GB (power-of-ten)
  4. Clarify that the max sizes for the .raw disk images apply to the uncompressed images
  5. Recommend oz, ImageFactory and Koji maintainers merge patches (oz, ImageFactory, Koji) to allow build of a Workstation image at the specified size (if this doesn't happen fast enough, we can patch downstream)
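
Items 1 to 3 can be written down as data, as in the sketch below. The dictionary layout and the function are hypothetical; this is not the format of relvalsizes.json, nor how relval implements the check. Per item 4, the size passed in for a .raw.xz image should be that of the uncompressed image.

```python
GB = 10**9  # the proposed limits are power-of-ten gigabytes

# Hypothetical layout; keys follow the image names in the list above.
MAX_SIZES = {
    "Fedora-Everything-netinst-x86_64": 1 * GB,
    "Fedora-Server-netinst-x86_64": 1 * GB,
    "Fedora-Minimal-aarch64": 8 * GB,
    "Fedora-Workstation-aarch64": 16 * GB,
}


def is_oversize(image_prefix, size_bytes):
    """Return True if size_bytes exceeds the maximum for image_prefix."""
    return size_bytes > MAX_SIZES[image_prefix]
```

For example, the F39 netinst size from the earlier list (719116288 bytes) is within the proposed 1 GB limit, while an uncompressed Workstation aarch64 image of 16 GiB would exceed 16 GB.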

Thanks!

edit: I guess 3. only needs @catanzaro 's approval (on behalf of Workstation WG), also.

This was discussed during today's FESCo meeting.
AGREED: @adamwill's proposal above is APPROVED (+6, 0, 0)

Metadata Update from @zbyszek:
- Issue untagged with: meeting
- Issue close_status updated to: Accepted
- Issue status updated to: Closed (was: Open)

a year ago
