#12395 improved gitlab runners
Opened 2 months ago by walters. Modified 2 months ago

So we have https://gitlab.com/fedora
However, the free/stock gitlab runners are very underpowered.

I think I have admin privileges under my sub-namespace in https://gitlab.com/fedora/bootc/ and
my employer (Red Hat) sponsors significant investment in Fedora, so I could go and set up custom runner infrastructure, but that wouldn't be shared with anything else there.

An interesting thing about Gitlab (at least hosted) is it's quite opinionated that CI jobs are containers. A common pattern then is to have those containers have credentials to spawn remote resources (e.g. VMs in AWS or pods in a remote OpenShift cluster, etc.).

So there's lots of possibilities here. A simple possibility here is that we set up an OpenShift namespace automatically per namespace in gitlab, and provision credentials in the projects such that the executed containers can create pods.
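
To make the per-namespace idea a bit more concrete, here is a minimal sketch of what a CI job container might prepare before asking the cluster for a pod. Everything named here is an assumption for illustration: the `gitlab-ci-` namespace prefix, the `openshift_namespace_for` and `build_pod_manifest` helpers, the job id, and the image are all hypothetical. It only builds a standard Kubernetes Pod manifest; submitting it would still need credentials provisioned into the project.

```python
import json
import re


def openshift_namespace_for(gitlab_namespace: str, prefix: str = "gitlab-ci") -> str:
    """Map a GitLab namespace path to a DNS-label-safe namespace name.

    The naming scheme (prefix plus slugified path) is a hypothetical choice.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", gitlab_namespace.lower()).strip("-")
    # Kubernetes namespace names are limited to 63 characters.
    return f"{prefix}-{slug}"[:63].rstrip("-")


def build_pod_manifest(gitlab_namespace: str, job_id: str, image: str, command: list[str]) -> dict:
    """Build a plain Pod manifest targeting the namespace derived above."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"ci-job-{job_id}",
            "namespace": openshift_namespace_for(gitlab_namespace),
        },
        "spec": {
            "restartPolicy": "Never",
            "containers": [{"name": "job", "image": image, "command": command}],
        },
    }


if __name__ == "__main__":
    # Illustrative values only; the job id and image are made up.
    manifest = build_pod_manifest(
        "fedora/bootc", "12345", "quay.io/fedora/fedora:latest", ["sh", "-c", "echo hello"]
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could then be handed to whatever client the job has credentials for; that part is deliberately left out here.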

Another option is to do the same, but for a cloud like EC2 or GCP.


I know there's a lot going on and of course this heavily intersects with the Fedora git forge decision https://fedoramagazine.org/fedora-moves-towards-forgejo-a-unified-decision/

And I guess I just have basic questions like, will we decommission gitlab.com/fedora and move everything to forgejo? And what will CI look like there?

one is we reuse e.g. the existing Fedora OpenShift instance, and set up credentials such that the


All good questions! ;)

Note that https://gitlab.com/fedora is on a community plan: "fedora is currently using the [OSS] Ultimate SaaS Plan",
i.e. provided by gitlab.com to communities. I don't know offhand what that means for runners, though. We could definitely provide EC2 resources for this.
I'm not sure we want to run things directly in our OpenShift unless we can be sure the project/namespace separation is good enough to not impact all our trusted applications. Especially since this will possibly be unreviewed content (PRs and the like?).

CC: @sgallagh for thoughts

On whether we want to move things off gitlab to a forgejo instance, I think that's all still being discussed. The current thought was that we have an instance for src.fedoraproject.org that includes packages and only stuff that's used to build official images. But then have perhaps another one for community stuff.

CC: @humaton for thoughts

All good questions! ;)

Yeah, again to be clear I wasn't expecting some immediate action, more future guidance.
(Although it could turn out that someone already set up a nice gitlab runner system on gitlab.fedora elsewhere and we just needed to be wired into it too, but I guess that's not the case)

The real answer here is probably Konflux - since AFAIK RHEL (and CentOS) will continue to use gitlab we will have to maintain that Konflux-gitlab integration (which has its issues, but at least the machine and environment are under "our" control).

So there's lots of possibilities here. A simple possibility here is that we set up an OpenShift namespace automatically per namespace in gitlab, and provision credentials in the projects such that the executed containers can create pods.

That seems like a LOT of overhead to build something that sounds a lot like the GitLab Runner Operator for OpenShift. I could help get that set up on Fedora's OpenShift instance quite easily; I maintain the one we're using for the CentOS Stream pipelines.

Note however that the OpenShift-powered runners have some limitations because of the security model; there are a number of things (including, notably, Kaniko) that don't work because OpenShift locks down some kernel capabilities on its pods. For those cases, we've usually resorted to creating a "pet" EC2 instance and loading the Gitlab Runner software onto it manually.

Note that https://gitlab.com/fedora is on a community plan: "fedora is currently using the [OSS] Ultimate SaaS Plan",
i.e. provided by gitlab.com to communities. I don't know offhand what that means for runners, though. We could definitely provide EC2 resources for this.

We can connect whatever runners we want, up to 1000 unique ones per project or group (so, theoretically up to 2000 for any given project).

I'm not sure we want to run things directly in our OpenShift unless we can be sure the project/namespace separation is good enough to not impact all our trusted applications. Especially since this will possibly be unreviewed content (PRs and the like?).

Pipelines won't run on our runners unless a person with Developer or higher role in the Gitlab project has initiated them. Pipelines may run on the proposer's fork, but they will only have access to whatever runners the proposer has there (probably the Gitlab SaaS shared runners). We can do something similar to what we have in CentOS Stream and have a check for whether the pipeline is running in a fork and explain how the maintainer can choose to run it in the destination repo if and only if they've reviewed it for malware first. Something similar to this: https://gitlab.com/packit-as-a-service/fido-device-onboard/-/jobs/9075290069
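
As a rough illustration of that kind of fork check (not the actual CentOS Stream implementation, which I'm not reproducing here), a job could compare GitLab's predefined `CI_PROJECT_PATH` variable against the expected upstream path. The upstream path below and both helper names are hypothetical.

```python
# Hypothetical upstream project path; replace with the real destination repo.
UPSTREAM_PROJECT_PATH = "fedora/bootc/example"


def is_fork_pipeline(env: dict, upstream_path: str = UPSTREAM_PROJECT_PATH) -> bool:
    """Return True when the pipeline's project is not the upstream project.

    CI_PROJECT_PATH is a predefined GitLab CI variable holding the
    namespace/project path of the project the pipeline runs in.
    """
    return env.get("CI_PROJECT_PATH", "") != upstream_path


def fork_guard_message(env: dict, upstream_path: str = UPSTREAM_PROJECT_PATH):
    """Return guidance text for fork pipelines, or None for upstream ones."""
    if not is_fork_pipeline(env, upstream_path):
        return None
    return (
        f"This pipeline is running in {env.get('CI_PROJECT_PATH', '<unknown>')}, "
        f"not in {upstream_path}. A maintainer can re-run it in the destination "
        "repo, but only after reviewing the changes for malware."
    )
```

A job script would call `fork_guard_message(dict(os.environ))` early, print the text, and exit non-zero if it is not `None`.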

That seems like a LOT of overhead to build something that sounds a lot like the GitLab Runner Operator for OpenShift. I could help get that set up on Fedora's OpenShift instance quite easily; I maintain the one we're using for the CentOS Stream pipelines.

Anything that keeps us in sync with what CS is doing is a big :thumbsup: from me, especially if it's not hard. I would very much appreciate it!

there are a number of things (including, notably, Kaniko) that don't work because OpenShift locks down some kernel capabilities on its pods.

User namespaces are a dramatic improvement in security and capability, and make "nested builds" much, much saner. They also cover the simple "run this pod as root so I can dnf install" case, really.

Also, since we're talking about the Fedora OCP instances, at least one of which is on bare metal, it should be possible to wire things up such that these pods get /dev/kvm, which would be really nice for OS-level testing.
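
For what it's worth, a test job could detect whether it actually got /dev/kvm and fall back to emulation otherwise. This is only a sketch under the assumption that the job drives QEMU directly; the helper names are made up, while `-accel kvm` and `-accel tcg` are standard QEMU options.

```python
import os


def kvm_available(device: str = "/dev/kvm") -> bool:
    """Return True if the KVM device node exists and is readable and writable."""
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)


def qemu_accel_args(device: str = "/dev/kvm") -> list:
    """Pick QEMU accelerator arguments: KVM when usable, TCG emulation otherwise."""
    if kvm_available(device):
        return ["-accel", "kvm"]
    return ["-accel", "tcg"]
```

On a runner without the device exposed, `qemu_accel_args()` falls back to TCG, so the same job definition would still run, just slower.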

Metadata Update from @phsmoura:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: medium-gain, medium-trouble, ops

2 months ago

I just noticed https://gitlab.com/fedora/infrastructure/konflux/infra-deployments/-/issues/7#note_2333023867 is also related, so maybe we can just try to go all in on Konflux from our side.


Metadata
Boards 1
ops Status: Backlog