I have some mostly working code to add build ordering, but it's still rough and intertwined with the changes from PR#47, so I'll file this issue for now to get my ideas out and get some feedback.
A basic problem with trying to build order the Fedora package set is that it in general does not build order well - circular build dependencies are all over the place. However, it can be legitimate to ignore some BuildRequires when figuring out an order:
A directed graph is created with the nodes being the SRPMs being rebuilt and the edges indicating that building A requires a binary package built from B to be installed. Edges are annotated with a list of BuildRequires of A that result in B being installed. We'll use an example of: fedmod resolve-deps --build-order --json pango --module=platform --module=X11-base --module=fonts. The initial set of edges is:
fedmod resolve-deps --build-order --json pango --module=platform --module=X11-base --module=fonts
cairo: cairo pango harfbuzz libthai graphite2 libdatrie graphite2: cairo pango harfbuzz libthai graphite2 libdatrie harfbuzz: cairo graphite2 libdatrie: libthai: libdatrie pango: cairo harfbuzz graphite2 libthai libdatrie
The user can specify (via yaml file, likely) that certain BuildRequires should be ignored. If all the BuildRequire that contribute to an edge A=>B are ignored, then the edge is deleted.
If, after the specified ignores, cycles are found in the BuildRequires graph, then the cycles are displayed (possibly only the first N shortest cycles) and the modulemd generation errors out. In the example, if nothing is ignored, ordering errors out for cycles including:
graphite2 =[asciidoc]=> graphite2 cairo =[librsvg2-devel]=> cairo pango =[pkgconfig(harfbuzz) >= 1.2.3]=> graphite2 =[asciidoc]=> pango ...
(By graphite2 =[asciidoc]=> graphite2, I mean graphite2, via a BuildRequires on asciidoc, causes graphite2 to be installed in the buildroot.)
graphite2 =[asciidoc]=> graphite2
These cycles (and all other cycles for this set of packages) are fixed by specifying:
ignore-buildrequire: # librsvg2 is used only for a test program cairo: ['librsvg2-devel'] # graphite uses asciidoc for documentation generation. graphite2: ['asciidoc']
Which reduces the set of edges to:
cairo: graphite2: harfbuzz: cairo, graphite2 libdatrie: libthai: libdatrie pango: cairo, harfbuzz, graphite2, libthai, libdatrie
If no cycles are found, the graph is used to generate a build order, by the principle that each module is built as late as possible. The last batch is everything that nothing else depends upon, the next to last batch is everything that only the last batch depends upon, and so forth.
For the example, the following buildorder is generated
Batch 0: libdatrie, graphite2, cairo Batch 1: libthai, harfbuzz Batch 2: pango
The problem with buildorder: is small changes can renumber everything. Rather than specifying:
components: cairo: { ref: f27, buildorder: 0 } graphite2: { ref: f27, buildorder: 0 } harfbuzz: { ref: f27, buildorder: 1 } libdatrie: { ref: f27, buildorder: 0 } libthai: { ref: f27, buildorder: 1 } pango: { ref: f27, buildorder: 1 }
It would be better to specify something like:
components: cairo: { ref: f27 } graphite2: { ref: f27 } harfbuzz: { ref: f27, buildafter: [ cairo, graphite2 ] } libdatrie: { ref: f27 } libthai: { ref: f27, buildafter: [ libdatrie ] } pango: { ref: f27, buildafter: [ harfbuzz, libthai ] }
and let the module build service figure out the exact ordering and batches.
Tagging @ralph as I know the fm-orchestrator/MBS folks have also been thinking about this problem (e.g. in https://pagure.io/fm-orchestrator/issue/715 )
As additional potential sources of test cases, I'll point to https://github.com/sclorg/rhscl-rebuild-recipes and my own Python 3 bootstrapping recipe at https://github.com/ncoghlan/pyscl-devel/blob/master/rpmlb/python-recipe.yml
I think putting the buildorder algorithm into fedmod is the wrong approach. MBS is where all modules have to pass through, build ordering belongs there so that every module benefits from it, not only those that get created/updated with fedmod.
fedmod has a big piece of information that isn't available to MBS - the existing repodata for the global Fedora package set. The goal of rpm2module and (to be added) rpm2flatpak is to take existing packages from the global Fedora package set and modularize them.
Given that repodata, we can do an actual computation of build ordering.
MBS would have to fall back to some combination of: Trial and error - which doesn't always work Heuristic parsing of spec files to try and figure out dependencies * Extra mock builds to try and figure out dependencies (see https://pagure.io/fm-orchestrator/issue/715#comment-504462)
It seems like a really hard problem.
Log in to comment on this ticket.