#6877 Refine cleaning up packages with broken deps
Closed: Grooming 8 months ago by jnsamyak. Opened 7 years ago by till.

Currently the Final Freeze is the deadline to remove packages with broken deps in Fedora Branched (see https://fedorapeople.org/groups/schedule/f-26/f-26-releng-tasks.html).

In the past, some packages that are important for the release still had broken deps, or depended on packages with broken deps, on some architectures at that point. Retiring these packages at the Final Freeze causes problems for the actual release (examples for F26: java-1.8.0-openjdk, python-etcd). Simply skipping these packages during the cleanup is error-prone, since there is no canonical list of packages that are important for the release. Also, the promotion of former secondary archs to primary archs created more broken deps. These broken deps could be fixed trivially by excluding the packages from the respective archs, but there seems to be too little incentive for package maintainers to do this in time for the Final Freeze. I see several possible alternatives:

1) Do not clean up packages with broken deps anymore. This will not cause any direct problems, but broken packages will pile up in the distribution. It will also create a bad user experience when people try to install packages from the repos and the installation just fails.

2) Remove packages with broken deps at a separate milestone: This would allow more time to re-introduce removed packages after the cleanup when people discover that they miss them. A possible date is, for example, branching. The downside is that there might be new broken deps at final freeze again. To address this, there could be a second round at final freeze.

3) Remove packages with broken deps more regularly, for example by retiring a package whenever it has had broken deps for six weeks. This has the advantage that it is a continuous process, which keeps the set of affected packages small and distributes the work over a longer period of time. The disadvantages are that there might be less time to fix packages with broken deps, and it might be too much work after a mass rebuild breaks a lot of packages. Also, we do not yet keep track of how long packages have had broken deps. And this might still allow packages with broken deps to slip into the final release.
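A minimal sketch of the bookkeeping that option 3 would need, assuming we get a set of currently broken package names once per compose. The function names, the in-memory dict, and the package names `foo` and `bar` are hypothetical; no existing releng tool is implied.

```python
from datetime import date, timedelta

# Grace period from option 3: six weeks of continuously broken deps.
GRACE_PERIOD = timedelta(weeks=6)


def update_first_seen(first_seen, broken_today, today):
    """Return a new mapping: package name -> date it was first seen broken.

    Packages that are still broken keep their original date, newly broken
    packages get today's date, and fixed packages are dropped.
    """
    return {pkg: first_seen.get(pkg, today) for pkg in broken_today}


def retirement_candidates(first_seen, today, grace=GRACE_PERIOD):
    """Return the packages that have been broken for at least the grace period."""
    return sorted(
        pkg for pkg, since in first_seen.items() if today - since >= grace
    )


# Hypothetical usage: "foo" breaks on May 1, "bar" on June 5.
state = update_first_seen({}, {"foo"}, date(2017, 5, 1))
state = update_first_seen(state, {"foo", "bar"}, date(2017, 6, 5))
candidates = retirement_candidates(state, date(2017, 6, 12))
```

Because a fixed package is dropped from the mapping, a package that breaks again later starts a fresh six-week period. Whether that is the desired policy would still need to be decided.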

Other possibilities are to refine the set of packages that are cleaned up.

1) We could do this only for packages with broken deps in release blocking archs. This might contribute to people not caring enough about the other archs to actually fix their package there.

2) We could do this only for packages with broken deps in all archs. Similar disadvantage as 1).

3) We could do this only for packages that meet some criterion on the type of the package (e.g. only if it is not a leaf subpackage that has the broken deps). This might be hard to express in code for automation, though.

4) Do not do this for packages that are important for the release. This might also be hard to express in code.

Another option that came to my mind would be to change the cleanup method: instead of retiring the packages directly, orphan them and let the orphan cleanup process take care of them eventually. Then packagers other than the original maintainers get a chance to fix the package, and packagers could easily extend the grace period if they need more time to fix a broken package.

Also, we could retire only the actual packages with broken deps and then use the fallout reports to retire the additional packages with broken deps after a grace period, to allow packagers to remove optional dependencies from their packages. Depending on the grace period, it might take too long to clean up branches of packages with broken deps that nobody cares about anymore.

Additionally, if we continue cleaning up broken deps, it might be a good idea to also notify the owners of packages that depend on packages with broken deps more often about their package possibly being affected. For example, when python-etcd had a broken dependency, the maintainers of custodia and freeipa did not get as many notifications as the python-etcd maintainers. However, daily notifications for all depending packages might be too much when packages such as java get broken.
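As a rough sketch of how the set of depending packages to notify could be computed, assuming a package-level dependency map is available: the `requires` mapping and the package names below are hypothetical, and in practice the data would have to come from the repo metadata.

```python
from collections import deque


def affected_packages(requires, broken):
    """Return all packages that directly or transitively depend on a broken one.

    requires: dict mapping a package name to the set of packages it requires.
    broken:   set of package names that have broken deps themselves.
    """
    # Invert the map so we can walk from a package to its dependents.
    required_by = {}
    for pkg, deps in requires.items():
        for dep in deps:
            required_by.setdefault(dep, set()).add(pkg)

    affected = set()
    queue = deque(broken)
    while queue:
        current = queue.popleft()
        for dependent in required_by.get(current, ()):
            if dependent not in affected and dependent not in broken:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical dependency data: app-a -> lib-b -> lib-c (broken).
requires = {
    "app-a": {"lib-b"},
    "lib-b": {"lib-c"},
    "unrelated": {"lib-d"},
}
to_notify = affected_packages(requires, {"lib-c"})
```

The size of the returned set could also be used to throttle notifications, for example by sending a weekly summary instead of daily mails when a widely used package is broken.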

I am currently in favor of doing more regular cleanups to distribute the workload over the full development cycle, with additional cleanups at important milestones. I have not yet decided which cleanup procedure I would favor (direct retirement vs. orphaning first) or whether to directly clean up depending packages as well.


Metadata Update from @mohanboddu:
- Issue tagged with: meeting

7 years ago

My first preference is for option #2 — remove them from the branch when branching — but rather than having a second round, gate any changes which cause broken deps in the branch. I know it might take us a while to get to there, and I'm willing to live with option #1 (don't clean stuff up) until then.

My second choice would be periodic cleanup in Rawhide, without messing with the branch. This could actually also be combined with the above.

I would prefer removing packages with broken deps after branching from rawhide and not doing another round, since we will be busy with freezes later on in the release cycle.

2) Remove packages with broken deps at a separate milestone: This would allow more time to re-introduce removed packages after the cleanup when people discover that they miss them. A possible date is, for example, branching. The downside is that there might be new broken deps at final freeze again. To address this, there could be a second round at final freeze.

and do it on release blocking arches

1) We could do this only for packages with broken deps in release blocking archs. This might contribute to people not caring enough about the other archs to actually fix their package there.

and a notification to non-release blocking arches

2) We could do this only for packages with broken deps in all archs. Similar disadvantage as 1).

I'm mostly with @mattdm here, but I have another important point to add. The 'clean up' must be done such that the package no longer shows up in nightly composes. It's very important that nightly and candidate composes be as similar as ever we can manage; we do most of our testing on nightlies, now, and the contents of nightly composes are what people who actually have the Branched release installed are using. In this recent case, the 'clean up' was implemented by simply blocking the packages from the f26-compose tag, which had the effect of meaning they would not show up in candidate composes - that is, the production-type composes, with labels, that we request and that releng manually runs - but did still show up in the nightly composes. This meant that we had no opportunity to encounter any of the consequences of the 'clean up' until we happened to build a candidate compose, as nightly composes were not affected. This leads to confusion and reduces our ability to find and promptly fix problems.

Before we embark on changes to the cleanup process, could we spend some resources on making the notification e-mails more informative:
- the way that dependencies for dependent packages are displayed is wrong:
the e-mails say "package foo depends on package bar-123-1.fc26.x86_64", when they should say "package foo depends on bar, which is only provided by bar-123-1.fc26.x86_64"
- in the latest round most issues came up because dependencies were broken on some fringe architectures. It'd be great if the e-mails could be direct about that "bar-123-1.fc26.x86_64 requires barbar, but barbar is not available on arm64, s390x".
- the e-mails should also be clearer about which tags were used to compute the dependencies (e.g. f26 stable). Ideally, deps would be calculated using updates-testing, so that warnings about "broken" packages which are going to be fixed soon anyway are not emitted.

If the tooling was improved like that, we would avoid people wasting their time on the mailing list trying to figure out why their packages are being reported, and would also encourage people (proven packagers and others) to directly fix some of the issues. For example, for packages which clearly need an ExcludeArch, this is something that could be trivially done by any pp.

"Ideally, deps would be calculated using updates-testing, so that warnings about "broken" packages which are going to be fixed soon anyway are not emitted."

I disagree with this. Things from u-t are not guaranteed to reach stable at any particular time, or really at all...

Well, they almost always do, or if they don't, they are replaced by an update which has more, not fewer, fixes and is built against newer dependencies. It'd be OK to check the dependencies without u-t, but if the dependency is missing and is fixed by a package in u-t, the message should mention that.

If there's a package in u-t, then people should fire up dnf update and easy-karma, and if there's no such package, they should fire up fedpkg and the editor… Different courses of action, and if the messages gave a hint like that, it'd help to move things along faster.
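The kind of hint described above could be sketched roughly like this, assuming the notification tool already knows which dependencies are provided in stable and in updates-testing. The function name, the two sets, the message wording, and the dependency names are all hypothetical.

```python
def broken_dep_hint(dependency, stable_provides, testing_provides):
    """Return a hint for the maintainer, or None if the dep is not broken.

    stable_provides / testing_provides: sets of dependency names that are
    provided by packages in the stable and updates-testing repos.
    """
    if dependency in stable_provides:
        # Satisfied in stable: nothing to report.
        return None
    if dependency in testing_provides:
        return (
            f"{dependency} is only provided by an update in updates-testing; "
            "test the update and give karma"
        )
    return (
        f"{dependency} is not provided by any package; "
        "the package needs a fix or rebuild"
    )


# Hypothetical data for illustration only.
stable = {"libfoo.so.1"}
testing = {"libbar.so.2"}
hint_testing = broken_dep_hint("libbar.so.2", stable, testing)
hint_missing = broken_dep_hint("libbaz.so.3", stable, testing)
```

Dependencies are still checked against stable only; updates-testing is consulted just to choose the wording of the message.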

1) We could do this only for packages with broken deps in release blocking archs. This might contribute to people not caring enough about the other archs to actually fix their package there.

and a notification to non-release blocking arches

What kind of notification are you thinking of? Currently package maintainers get daily notifications (for each nightly compose afaiu) when their package has a broken dependency. Do you mean to notify the package maintainers of depending packages here?

Branched release installed are using. In this recent case, the 'clean up' was implemented by simply blocking the packages from the f26-compose tag, which had the effect of meaning they would not show up in candidate composes - that is, the production-type composes, with labels, that we request and that releng manually runs - but did still show up in the nightly composes.

This was an exception that was decided at a releng meeting. Unfortunately, nobody identified this as a big problem at that meeting. However, now that this problem is clear, I am sure it will not happen again. It had the advantage of allowing packages to be removed without making it too hard to add a fixed update. When the packages are retired, the process to add a fixed update requires a lot more steps.

Before we embark on changes to the cleanup process, could we spend some resources on making the notification e-mails more informative:
- the way that dependencies for dependent packages are displayed is wrong:
the e-mails say "package foo depends on package bar-123-1.fc26.x86_64", when they should say "package foo depends on bar, which is only provided by bar-123-1.fc26.x86_64"

Which e-mails do you mean here? Neither the compose report mails nor the retirement notification mails contain such sentences:

https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/3BRTWLKG4KATD2UIXFSYPIWAYN5TDQZS/
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/5HRI6V66F3HHG54VEMPFP4B2CDM3EFH6/

They both contain the raw/technical dependency afaiu:
asterisk-gui-2.0-13.20120518svn5220.fc26.noarch requires asterisk = 13.9.1-1.fc25.1
OmegaT-2.6.3-2.fc22.ppc64 requires hunspell <= 0:1.4.0

Also the individual notifications for packages about broken dependencies do not resolve the dependency to the NVRA.

  • in the latest round most issues came up because dependencies were broken on some fringe architectures. It'd be great if the e-mails could be direct about that "bar-123-1.fc26.x86_64 requires barbar, but barbar is not available on arm64, s390x".

Yes, I also hacked together a little script to parse the compose status reports for the archs on which a package is broken.
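This is not the script mentioned above, just a minimal sketch of the idea. It assumes every relevant report line has the form "name-version-release.arch requires dependency", as in the two examples quoted earlier; the real report layout may differ, and the third input line below is made up for illustration.

```python
from collections import defaultdict


def broken_archs(report_lines):
    """Map each package name to the sorted list of archs it is broken on."""
    archs = defaultdict(set)
    for line in report_lines:
        nvra, sep, _dependency = line.strip().partition(" requires ")
        if not sep:
            # Not a "X requires Y" line; skip headers and blank lines.
            continue
        # The arch is everything after the last dot of the NVRA.
        nvr, _, arch = nvra.rpartition(".")
        # Version and release are the last two dash-separated fields.
        name = nvr.rsplit("-", 2)[0]
        archs[name].add(arch)
    return {name: sorted(found) for name, found in archs.items()}


report = [
    "asterisk-gui-2.0-13.20120518svn5220.fc26.noarch requires asterisk = 13.9.1-1.fc25.1",
    "OmegaT-2.6.3-2.fc22.ppc64 requires hunspell <= 0:1.4.0",
    # Hypothetical extra line to show grouping across archs:
    "OmegaT-2.6.3-2.fc22.ppc64le requires hunspell <= 0:1.4.0",
]
summary = broken_archs(report)
```

Comparing the resulting arch list against the full list of archs for the release would show which packages are broken only on a few archs and are therefore candidates for an ExcludeArch.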

  • the e-mails should also be clearer about which tags were used to compute the dependencies (e.g. f26 stable). Ideally, deps would be calculated using updates-testing, so that warnings about "broken" packages which are going to be fixed soon anyway are not emitted.

When looking through the broken packages I found several updates fixing broken deps that were still in testing but could have been in stable for weeks now.

If the tooling was improved like that, we would avoid people wasting their time on the mailing list trying to figure out why their packages are being reported, and would also encourage people (proven packagers and others) to directly fix some of the issues. For example, for packages which clearly need an ExcludeArch, this is something that could be trivially done by any pp.

I also failed for fixing at least one package this way and then someone suggested that it should be retired because it is also dead upstream. This made me believe that it is better to just retire packages that could be fixed trivially but were not fixed by their maintainer. Nevertheless, I agree that the reports should be better.

Which e-mails do you mean here?

You're right. I was confused by the java-1.8.0-headless deps, which contain the version in the package name. So yeah, it seems that the e-mails provide all the right info.

This made me believe that it is better to just retire packages that could be fixed trivially but were not fixed by their maintainer

Hmmm. I understand the sentiment, but I think that's just not feasible from the point of view of the distro: the original mail from Jun 26 listed 1537+87=1624 packages to be retired. That's just too many.

Nevertheless, I agree that the reports should be better.

I wasn't trying to be overly negative here. I know it's a hard issue.

IMHO, provenpackagers should actually go and fix the breakage wherever possible. I remember I and 1 or 2 other provenpackagers used to do that, but not only did nobody else offer help or even thank us, but some people outright claimed that it was not helpful. So nobody does it anymore, and so the broken dependencies pile up.

We had a discussion about this ticket in our releng meeting today. There is no good way of solving this issue, but we came up with two options:

  1. Block the pkgs at branching and unblock them as necessary: pkg maintainers request an unblock, and releng reviews the request and unblocks the pkg. The advantage is that we will be more aware of what got blocked and what got unblocked. But it needs releng to handle the tickets, and we are not sure how many will show up per release cycle.

  2. Use bodhi with greenwave to block pkgs, and configure waiverdb to waive certain pkgs even if they have dep issues so that they won't be blocked. Other pkgs can be blocked and then unblocked by either releng or the pkg maintainer. We can configure it however we want, for example allowing only releng to unblock them when the pkg maintainer files a ticket.

So, we are not sure at this point which way we will go but we are open to other suggestions. I will send an email to devel list to get more opinions.

@mohanboddu confirms that both options are viable. He will send this to devel list now to get opinions.

From our grooming discussion on #fedora-releng channel on Apr 12 2019

proposal: make it blocked by broken deps work ticket, update to say we need that as a prereq

This is blocked by https://pagure.io/releng/issue/7931

From the same meeting:

[14:36:27] <+nirik> anyhow, also if we are going to do 1 for f31 we should note it in ticket and make noise about it to fesco/devel list...
[14:36:49] <+mboddu> Once we have a fix for the dnf issue
[14:36:54] <+nirik> right.
[14:37:02] <+mboddu> I will try the force option again next week
[14:37:02] <+nirik> so, lets wait for that first
[14:37:18] <+mboddu> And see if it gets fixed and if not, I will ping the dnf people again

Metadata Update from @syeghiay:
- Issue tagged with: waiting on external

5 years ago

So this issue depends on #7931 and that has a BZ attached to it. In that BZ there is a workaround for the underlying issue with repoclosure https://bugzilla.redhat.com/show_bug.cgi?id=1565257#c5

This is a really old ticket. Currently, we track FailsToInstall packages for every release. Technically that is not just broken deps, but it includes them.

The idea is to automate or extend the script to directly orphan packages that meet the criteria.
Or write a new toddler that will check the FTI tracker and blocking BZs to orphan packages that are broken.

Fully automating the bugzilla creation would be needed as well.

Hello folks, we discussed this in our releng meeting today, and this ticket is really old and some of its information is outdated! But in case we want to work on this, let's come up with a problem statement with updated info in a new ticket: how to improve this, a plan, and whether someone is willing to pick it up! Because keeping this older ticket open for so long is not helping!

Metadata Update from @jnsamyak:
- Issue close_status updated to: Grooming
- Issue status updated to: Closed (was: Open)

8 months ago
