#11000 s390x builders not healthy, build ongoing for (new record!) 112 hours
Closed: Fixed with Explanation 2 years ago by kevin. Opened 2 years ago by catanzaro.

The build https://koji.fedoraproject.org/koji/taskinfo?taskID=94306095 is taking way too long:

  • The i686 build took 4 hours, but that's misleading because it didn't have to process any debuginfo.
  • The x86_64 and ppc64le builds each took 8-9 hours, which is fine.
  • The s390x build took 40 hours. I'm tempted to turn off debuginfo or use ExcludeArch here, but frankly I think we should turn off this architecture for the entire distro because it has zero value to the Fedora community. This architecture exists only for business reasons, and if the businesses do not provide adequate build infrastructure, we should turn it off.
  • The aarch64 build is still ongoing after more than 43 hours. It has been "extracting debuginfo" for at least 20 hours now. This architecture has value to the Fedora community, and it would be a real shame if I needed to turn off debuginfo or use ExcludeArch here.

Despite all that, I do notice that no builds have restarted due to OOM, which was a problem a few months ago, for which I am very grateful. I'd rather have slow reliable builds than fast unreliable builds.


As much as I also like to complain about s390x, Fedora Infrastructure has no power to turn off an architecture. Please bring up the problem with FESCo and your internal Red Hat management chain.

Going from my poor memory, when aarch64 and s390x have had long extraction times in the past, it has been due to problems with the tools themselves, and the Red Hat toolchain team needed to get involved to fix them. At that point this is also more of a release engineering issue than an infrastructure one.

It's basically maxed out on memory:

               total        used        free      shared  buff/cache   available                     
Mem:           36001       34947         466           0         587         703                     
Swap:           8191        8191           0 

CC: @kalev perhaps we need to tune the debuginfo / dwz stuff a bit more?

For now, I moved the build to a buildhw machine with a bunch more memory. It will have to restart, but it should finish...

> For now, I moved the build to a buildhw machine with a bunch more memory. It will have to restart, but it should finish...

Thanks. Can we either (a) permanently turn off all the builders with less memory, so this doesn't happen again, and just accept that there will be fewer builders, or (b) create the heavy builder channel that you didn't want, to stop jobs from being assigned to inappropriate builders? What was the disadvantage of using a heavy channel?

FWIW, I'm planning to drop one of the three builds next year (webkit2gtk4.0, the libsoup 2 build), so the RAM requirements should drop somewhat. But even so, it seems clear the current builders are not healthy enough.

Unfortunately moving the aarch64 build was probably a mistake: the new builder ran out of disk space. :/

Metadata Update from @phsmoura:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: medium-gain, medium-trouble, releng

2 years ago

Note WebKitGTK 2.39.2 was just released, but I haven't been able to build 2.39.1 yet.

I think what might help here is dropping the _dwz_max_die_limit override in the package (but not the x86_64-specific _dwz_max_die_limit_x86_64 override, that one seems fine). We'd get bigger debuginfo packages, but if it makes builds faster I think it's worth it. Plus, most debugging happens on x86_64 anyway, and we can keep the optimization there since those builders are much stronger.

At the same time, I think it would make sense to further reduce parallelism for debuginfo extraction (require 16 GB of RAM so that we only get one concurrent debuginfo extraction process on s390x). My theory is that running multiple debuginfo extraction processes causes the builders to page out to swap and that kills all performance.
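A sketch of what that proposed spec change might look like (hypothetical lines; the actual macro usage in the package may differ, and 16384 here is just 16 GB expressed in megabytes for %limit_build's -m option):

```spec
# Hypothetical sketch of the proposal: drop the generic _dwz_max_die_limit
# override (keeping the x86_64-specific one untouched), and allow only one
# concurrent find-debuginfo job per 16 GB of RAM.
%global _find_debuginfo_opts %limit_build -m 16384
```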

I can do a PR if something like that makes sense to you, @catanzaro ?

That all sounds fine. I'm skeptical that it will be enough to make builds reliable without infrastructure changes for these architectures (and webkitgtk is hardly the only heavy package, after all), but couldn't hurt to try. I need to update the package to 2.39.2 anyway, so I'll try those suggestions now.

So Kalev's suggested changes actually worked better than I expected. The s390x build went almost 3x faster: 14 hours vs. 40 hours. Additionally, the debuginfo package size, which I had expected to balloon, remains pretty reasonable on all architectures. I'm going to test what happens if I remove the _dwz_max_die_limit_x86_64 limit now, since it seems like that override was not actually needed on other architectures after all.

aarch64 is still slow and not yet finished, but I'll just have to wait longer.

> I'm going to test what happens if I remove the _dwz_max_die_limit_x86_64 limit now, since it seems like that override was not actually needed on other architectures after all.

The debuginfo is no longer optimized if I remove this, so I'll put it back.

BTW I'm going to increase the %limit_build from 16 GB to 32 GB, because with the increased DIE limit removed I see three dwz processes each using 20 GB of RAM, and it's going to need more than that with the limit lifted.
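For context, %limit_build (from redhat-rpm-config) caps the parallel job count by available memory. The idea behind it can be sketched roughly like this (this is not the actual macro implementation; the numbers are taken from this thread):

```python
# Sketch of memory-aware parallelism in the spirit of %limit_build -m:
# take the smaller of the CPU count and (total RAM / RAM needed per job),
# but always allow at least one job.

def limit_build(ncpus, mem_total_mb, mem_per_job_mb):
    """Return a safe parallel job count (at least 1)."""
    return max(1, min(ncpus, mem_total_mb // mem_per_job_mb))

# With the builder's ~36 GB of RAM (see the free output above) and a
# 32 GB-per-job requirement, only one debuginfo extraction runs at a time:
print(limit_build(6, 36001, 32768))  # -> 1
```

With the earlier 16 GB-per-job setting the same builder would run two extractions concurrently, which is consistent with the swapping theory above.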

> The debuginfo is no longer optimized if I remove this, so I'll put it back.

It seems the compression only reduces the debuginfo size by about 20% now. Based on the comment in the spec file, it used to be more like 80%.

Kevin decided to let us use the heavybuilder channel after all. He started a new build here:

https://koji.fedoraproject.org/koji/taskinfo?taskID=94763285

Unfortunately, the builder ran out of disk space.

This should be all fixed now. Kevin cleaned up the disk on the two heavybuilder aarch64 machines and that was the last blocker to get the builds going again. The latest build took only 15 hours and 14 minutes and ran without issues.

It might be worth carving out two heavier s390x builders in the future as these seem to be the slowest right now compared to other arches, but the current setup seems to be sufficient for now.

Thanks a lot, nirik!

Metadata Update from @kevin:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

2 years ago

Reopening. aarch64 seems fine now, but task https://koji.fedoraproject.org/koji/taskinfo?taskID=95415428 has been building on the s390x heavybuilder for over 42 hours, despite all of the above effort to reduce RAM required. I don't think there's anything more we can do on dev side other than disable debuginfo entirely, which is not a good solution.

Metadata Update from @catanzaro:
- Issue status updated to: Open (was: Closed)

2 years ago

That builder was completely unresponsive. I rebooted it (and of course that restarted the build). Let's see if it completes now...

The restarted build has been ongoing for 44 hours now. Can you please investigate to see what is wrong with this builder? We have extremely conservative resource limits set now (one job per 3 GB of RAM during the build, one job per 32 GB of RAM when processing debuginfo) so it's unlikely that further changes to the spec file will be useful.

Kevin has been and is on PTO until next year... so I am going to look at this. Currently the builder is again fairly unresponsive. I get a root login prompt but no response to password entry, which leads me to believe it is OOM or otherwise fairly dead.

I have rebooted the builder and installed tmux on the virt-server. I have consoled into the vm and am trying to watch what goes on in a journalctl -f on the box.

In /var/lib/mock there are two trees created at about the same time: f38-build-40026864-4928976 and f38-build-40068088-4928976. Currently /usr/libexec/gcc/s390x-redhat-linux/12/cc1plus is building various files on the box and the load average is 3. I will keep the screens open to see what I can track.

Tracked down the problem to dwz using all RAM+swap on the builder:

 62919 kojibui+  20   0   25.2g  12.5g      0 D   0.3  75.3  46:01.21 dwz 

As it used up all memory, networking and a dozen other applications seemed to die. The system only has 16 GB of real memory and 8 GB of swap, and needs about 1 GB to keep things going. So if we can keep dwz down to 22 GB, the build should work fine.
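As a quick sanity check of that budget (numbers taken from this comment):

```python
# RAM + swap, minus roughly 1 GB to keep the rest of the system alive,
# is all the memory dwz can consume before the builder locks up.
ram_gb, swap_gb, system_gb = 16, 8, 1
dwz_budget_gb = ram_gb + swap_gb - system_gb
print(dwz_budget_gb)  # -> 23, so capping dwz at ~22 GB leaves a little margin
```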

The system was highly unresponsive in this state until I killed the dwz process which then allowed systemd and other things to come back.

Basically the load average is now at 20+ and many other…

Would it be possible to split out two s390x builders that have more RAM and CPU and disk than the rest and add them to the heavybuilder channel (and remove the smaller builders from heavybuilder)? I think two should be sufficient because there are only two x86_64 heavybuilder builders and I think it makes sense to have the same number on s390x.

This may need to wait until next year when Kevin is back though.

It will need to wait until Kevin is back and it will need planning probably at a higher level since things are very resource constrained all over the place.

For x86_64 there are 16 hardware-only builders which have large amounts of memory and are useful for certain large builds.

For aarch64 there are 8 systems which have large amounts of memory/cpus BUT they are also loaner/no-support/prototype hardware which could be returned or die at any time.

For ppc64le there are no extra systems, with builds split between 4 hardware boxes and 40 VMs.

For s390x there are 2 z machines running 30 VMs with various levels of virtual systems on top, of which only 1 could be re-split into larger VMs. This is the most resource-constrained architecture and currently the one which holds up daily builds and composes for many different developers. Of the 30 VMs, one is dedicated to being a varnish cache for builds, and one or two are dedicated to composes. Removing 4-6 s390x builders to make 2 heavy builders for a subset of maybe 8 packages causes a larger backup on the remaining 23 builders AND also starts a bureaucracy headache: who gets to use the big builders?

Because every time any release engineering group creates a set of builders for 'bigger' or 'faster' builds, nearly every developer wants their packages in that group. They then start complaining that releng is playing favorites about which ones are allowed, and it ends up being a political mess. At this point, management usually steps in and starts a process where only packages approved by some outside group are in the 'fast' lane, and any fighting about what is there is done there.

[do note that the following is just a suggestion from someone who has seen this kind of stuff in many places over many years.. however I am just a volunteer here and have no power to enforce this to be the case.. I just don't want Kevin stuck with a crap show.]
So I suggest that if there is a proposal to 1) cut down the number of available builders for 'normal' packages and 2) create bigger builders for 'special' packages, the proposal should also cover who gets to decide that and how they decide it, and then get some outside group to agree that the proposal is good and will be followed.

I understand the trouble here, but 16 GB of RAM and 8 GB of swap is just not enough to be a Fedora builder. Not enough for the regular build channel, and certainly not nearly enough for the heavybuild channel. Our spec file documents this:

# Require 32 GB of RAM per vCPU for debuginfo processing. 16 GB is not enough.
%global _find_debuginfo_opts %limit_build -m 32768

If you don't want to make infrastructure changes until next year, my proposal for now to unblock this 112-hour ongoing build (I see the s390x builder restarted again two hours ago) is to disable debuginfo for s390x temporarily. Then the build will succeed for sure, but this is not an acceptable long-term solution: building packages without debuginfo is really, really bad. So we should only do this temporarily until holiday season is over. In January, we should either consolidate the s390x infrastructure or else change this to ExcludeArch: s390x, which would need to be added to absolutely everything that depends on webkitgtk, including gnome-shell and a bunch of other GNOME packages. This will be messy, but better than disabling debuginfo IMO. Are you OK with this, Kalev?

My opinion is that we should make the builders large enough to be reliable, and accept that there will be a queue to build if there's not enough builders. Builders with only 16 GB of RAM that cannot be resized should be outright rejected and removed from Fedora infrastructure without replacement, even if it causes longer queues to build stuff. And if there are no corporate sponsors willing to donate required infrastructure, then we should ask FESCo to consider turning this architecture off altogether.

Currently everyone is on break and there is one volunteer sysadmin (me). I am not going to be making any changes to the builders or koji at this time. I would suggest that turning off debuginfo for the time being is your best bet.

Disabling debuginfo (-g1 instead of -g) sounds like a good plan to me.

> Disabling debuginfo (-g1 instead of -g) sounds like a good plan to me.

-g0 will disable debuginfo.

-g1 gives you bad debuginfo, which is better than nothing. I see that's actually what we're using on i686, though, underneath a misleading comment that says it's removing debuginfo, so maybe that will suffice. :P I wonder what the history is behind that comment....

Yes, what I tried to say was to use -g1 instead of -g0 because with -g1 we'll still get some debuginfo, which is better than nothing.

I see you changed it from -g0 to -g1 and just forgot to update the comment. That's fine. We used -g1 for ages and I only switched it to -g0 about a year ago because the i686 builds kept hitting OOM with -g1. Clearly that's not happening anymore, so no reason not to go back to -g1, unless it starts to fail again in the future. I've updated the comment in the spec file to match that we're using -g1 again.
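As an illustration only (the real conditional in the spec may be structured differently), a per-arch switch to minimal debuginfo typically looks something like this; since GCC honors the last -g flag it sees, appending -g1 after the distro's default -g effectively downgrades the debuginfo level:

```spec
# Hypothetical sketch: use -g1 (line tables and function names only) on the
# memory-constrained architectures, and keep full -g everywhere else.
%ifarch s390x %{ix86}
%global optflags %{optflags} -g1
%endif
```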

Using -g1 for s390x (and i686), all builds finished after only 6 hours, which is impressive. So this workaround will suffice until after holidays.

I'm going to retitle the issue to indicate that aarch64 is healthy now, and only s390x remains problematic.

Hi @kevin, any plans to adjust the s390x builders?

yes. I hope to do that soon/before mass rebuild.

ok. This is now done.

Here's what I did:

I nuked buildvm-s390x-30, 29, 28, 27, 26

I used the memory/cpu/disk from 29 and 28 to double 26 and 27

26 and 27 now have 34 GB memory, 6 CPUs and 200 GB disk

I left the resources from 30 freed for the hypervisor (we were overcommitted on memory before).

Finally I removed all the builders except 26 and 27 from the s390x heavybuilder channel.

I have left both in the general 'default' channel too, so they will do other builds. It might be that your builds will have to wait for something else to finish. I would hope it's not a very long wait usually though.

I don't know if the added cpus will cause problems (by doing more threads and taking up too much memory). We can remove cpus or you can adjust the nprocs in the spec to only use a subset if needed.

I guess I will close this now and when you do your next builds, please let us know how it goes and reopen if you need anything more or this doesn't work.

Metadata Update from @kevin:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

2 years ago

I think this should be good. I'm OK with waiting for builds to start in exchange for them finishing reliably, and 34 GB should surely be enough to achieve that.

> I don't know if the added cpus will cause problems (by doing more threads and taking up too much memory). We can remove cpus or you can adjust the nprocs in the spec to only use a subset if needed.

It would have been a big problem before we had %limit_build to compute a safe parallelism level. But now we do, so no worries.

Thanks, Kevin! That all sounds perfect.

I think we should also keep an eye on the upcoming mass rebuild and see if s390x has become a bottleneck.
