#10553 Koji hates Chromium
Closed: Fixed 3 years ago by kevin. Opened 3 years ago by spot.

To be fair, I don't blame it, but it would be nice to be able to build Chromium again.

Currently, it fails while opening 29 buildroots without any obvious failure in the build logs, see:

https://koji.fedoraproject.org/koji/taskinfo?taskID=82556973
https://koji.fedoraproject.org/koji/taskinfo?taskID=82572340

It does seem specific to x86_64; the F35 aarch64 build succeeded. nirik suggested that perhaps the OOM killer is killing kojid. Can AWS buy some more memory for your koji servers? :D


So, I forced the f35 one onto a buildhw (hardware box) and it finished.

The others finally failed... I am going to resubmit one and watch its logs to see if I can see what's going on.

Metadata Update from @zlopez:
- Issue tagged with: koji, medium-gain, medium-trouble, ops

3 years ago

Looks like the EL7 build is having the same failures.

yeah, and it's puzzling. I did a tail -f of the build log on a builder... and it processed along, until it just stopped. No errors or anything in the log or in the builder logs. :(

Will keep digging.

Metadata Update from @kevin:
- Issue assigned to kevin
- Issue priority set to: Waiting on Assignee (was: Needs Review)

3 years ago

ok, it is being oom killed... by systemd-oomd, which I thought I had disabled. ;(

Feb 15 06:09:11 buildvm-x86-25.iad2.fedoraproject.org systemd-oomd[612]: Killed /system.slice/kojid.service due to memory used (15411052544) / total (15704350720) and swap used (10640809984) / total (11811151872) being more than 90.00%
Feb 15 06:09:12 buildvm-x86-25.iad2.fedoraproject.org systemd[1]: kojid.service: systemd-oomd killed 36 process(es) in this unit.
Feb 15 06:09:12 buildvm-x86-25.iad2.fedoraproject.org systemd[1]: kojid.service: Main process exited, code=killed, status=9/KILL

I disabled it on the builder that it's running on now, will see how it does in the morning...
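For anyone who wants to sanity-check the numbers in the systemd-oomd message above, here is a minimal Python sketch that pulls the byte counts out of the message text and converts them to percentages. The regular expression and the `oomd_usage` helper are illustrative assumptions written against this one message; they are not part of systemd or koji.

```python
import re

# Message text copied from the systemd-oomd journal entry quoted above.
LOG_MESSAGE = (
    "Killed /system.slice/kojid.service due to memory used (15411052544) / "
    "total (15704350720) and swap used (10640809984) / total (11811151872) "
    "being more than 90.00%"
)

# Hypothetical pattern, written to match the wording of this one message.
PATTERN = re.compile(
    r"memory used \((\d+)\) / total \((\d+)\) and "
    r"swap used \((\d+)\) / total \((\d+)\) being more than ([\d.]+)%"
)


def oomd_usage(message):
    """Return (memory %, swap %, limit %) parsed from an oomd kill message."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("message is not in the expected systemd-oomd form")
    mem_used, mem_total, swap_used, swap_total = (
        int(group) for group in match.groups()[:4]
    )
    limit_pct = float(match.group(5))
    mem_pct = 100 * mem_used / mem_total
    swap_pct = 100 * swap_used / swap_total
    return mem_pct, swap_pct, limit_pct


mem_pct, swap_pct, limit_pct = oomd_usage(LOG_MESSAGE)
print(f"memory {mem_pct:.1f}%, swap {swap_pct:.1f}%, limit {limit_pct:.1f}%")
```

On the quoted message this works out to about 98% of memory and just over 90% of swap, so both figures were past the 90% limit when kojid.service was killed.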

ok, that didn't help any.

I think the problem is one we often hit with these big projects: the buildvm-x86 VMs have 5 CPUs and 15GB of memory. So that's 3GB for each of 5 threads running at the same time. If one of those threads goes over 3GB of memory use, boom, OOM takes it out.
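The per-thread budget above is simple division, sketched here in Python. The 15GB and 5 CPU figures come from the comment; the 4GB peak per compile job is a made-up number for illustration only, and the helper names are hypothetical. The sketch ignores swap and whatever memory the OS and kojid themselves need.

```python
def memory_per_job_gb(total_memory_gb, cpus):
    """Memory available to each job if one job runs per CPU."""
    return total_memory_gb / cpus


def max_safe_jobs(total_memory_gb, cpus, peak_per_job_gb):
    """Largest job count that fits in memory, capped at the CPU count."""
    fits_in_memory = int(total_memory_gb // peak_per_job_gb)
    return max(1, min(cpus, fits_in_memory))


# buildvm-x86 figures from the comment above: 5 CPUs, 15GB of memory.
budget = memory_per_job_gb(15, 5)

# Hypothetical: if each compile job peaked at 4GB, only 3 would fit.
jobs = max_safe_jobs(15, 5, peak_per_job_gb=4.0)

print(f"{budget:.1f}GB per job; {jobs} jobs fit at a 4GB peak")
```

Under that assumed 4GB peak, running 5 jobs at once would need 20GB, which is more than the VM has, so either the job count has to drop or the build has to move to a box with more memory.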

I am not sure why it would suddenly be happening now, though. Are there any changes in chromium that would cause the compile to take a bunch more memory?

In any case, as a workaround, I removed the buildvms from the channel chromium uses; it's just the buildhw-x86 boxes in there now.
Can you re-submit and confirm that they all complete now?

I went ahead and resubmitted that epel7 one. The older failed ones looked like they might have been for other reasons, so I left them alone.

I see successful builds... so I think this is working around things for now. Please re-open or file a new ticket if it's not...

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago

