#6950 Redirector service and pipermail URLs
Closed: Will Not/Can Not fix 6 years ago Opened 6 years ago by abitrolly.

  • Describe what you need us to do:

There are links to old pipermail archives on the internet. For example https://lists.fedoraproject.org/pipermail/packaging/ mentioned in https://pagure.io/packaging-committee/issue/768

The request is to add a redirector service that runs as a front-end proxy, handles URLs like that, and collects stats about usage and referrals.

  • If we cannot complete your request, what is the impact?

Having broken links and outdated info is not a good image for a project, and contributors may stay away from it.


Metadata Update from @mizdebsk:
- Issue tagged with: lists

6 years ago

So, when we migrated from mailman2 -> mailman3 we looked at redirecting things, but we decided not to for several reasons:

  • The URLs are vastly different. mailman2 has a number based on when the archiver saw the email, and mailman3/hyperkitty has a UUID generated when the message is archived. Mapping converted posts would be difficult.
  • The interface is very different as well. Some people linked to various parts of the old interface (like a particular month or thread view or the like) that have no direct equivalent in mailman3/hyperkitty.
  • We did decide to redirect the top level (i.e., if you go to /pipermail/ it redirects you to hyperkitty), but if we redirected further down we would have to somehow exclude those things people are linking to in the old archives.

Overall I don't think it's worth trying to redirect any more of this. If it's easy, it might be worth adding something to the old archives noting that they are old and providing a link to hyperkitty, but I'm not sure how much work that might be.

I'll add meeting here and we can discuss this in our next meeting...

Metadata Update from @kevin:
- Issue priority set to: Next Meeting (was: Needs Review)

6 years ago

Is it possible to uniquely identify emails in both archives? Using embedded ids or timestamps? Then it should be possible to run something like detect_message_id() on a pipermail URL and then query Hyperkitty for the new URL for this id.
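As a rough illustration of the first half of that idea, here is a minimal sketch of pulling the list name, month, and archiver number out of a pipermail URL. The names `PipermailRef` and `parse_pipermail_url` are hypothetical, and the path layout is assumed from the example URLs in this ticket. Note that this only yields the archiver's sequence number, not a Message-ID.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class PipermailRef(NamedTuple):
    list_name: str
    period: Optional[str]   # e.g. "2010-July"; None for the list's top page
    number: Optional[int]   # archiver sequence number; None for index pages


def parse_pipermail_url(url: str) -> Optional[PipermailRef]:
    """Split a pipermail URL into its parts, or return None if it is not one."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2 or parts[0] != "pipermail":
        return None
    period = parts[2] if len(parts) > 2 else None
    number = None
    if len(parts) > 3:
        # Message pages look like 002843.html; thread.html etc. are index views.
        match = re.fullmatch(r"(\d+)\.html", parts[3])
        if match:
            number = int(match.group(1))
    return PipermailRef(parts[1], period, number)
```

Month and thread views come back with `number` set to `None`, so a caller could treat those differently from links to individual messages.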

Months and threads pages may have top banners injected, but stats and referrals would be good.

The idea here is still a generic redirector service that can be run for different services, and the mailman upgrade config could be shared between different mailman installs.

I used to manage a number of mailman2 installations, but I've not done so since the transition to mailman3. Hopefully the following is still helpful (and not wildly inaccurate).

Individual messages in the generated pipermail archives (from mailman2) don't contain the message-id, which is needed to generate the archive URL in hyperkitty (from mailman3). The message-id could probably still be found in the mbox files the pipermail archives were generated from (assuming they have been kept). That still doesn't tell you the pipermail archive URL for the message. You would have to try and create a mapping based on the subject, sender, timestamp, and other data in each pipermail message entry, look that up in the raw mbox, extract the message-id and then you could map that to the hyperkitty archive URL.
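To make the mapping described above concrete, here is a minimal sketch using only the Python standard library. It rests on several assumptions that are not confirmed in this ticket: that the pipermail page shows the subject and the sender's display name, that hyperkitty's message hash is base32(sha1(Message-ID)) with the angle brackets stripped, and that message URLs follow the /archives/list/<list address>/message/<hash>/ layout. All function names are hypothetical.

```python
import base64
import hashlib
from email import message_from_string
from email.utils import parseaddr, unquote


def normalize(text):
    """Collapse whitespace and lowercase so small formatting differences match."""
    return " ".join((text or "").split()).lower()


def build_index(messages):
    """Map (subject, sender name) to the Message-IDs found in the raw mbox."""
    index = {}
    for msg in messages:
        message_id = msg.get("Message-ID")
        if not message_id:
            continue
        name, addr = parseaddr(msg.get("From", ""))
        key = (normalize(msg.get("Subject")), normalize(name or addr))
        index.setdefault(key, []).append(message_id)
    return index


def find_message_id(index, subject, sender):
    """Return a Message-ID only when exactly one message matches."""
    candidates = index.get((normalize(subject), normalize(sender)), [])
    return candidates[0] if len(candidates) == 1 else None


def message_id_hash(message_id):
    # Assumption: base32(sha1(Message-ID without angle brackets)).
    raw = unquote(message_id.strip()).encode("utf-8")
    return base64.b32encode(hashlib.sha1(raw).digest()).decode("ascii")


def hyperkitty_url(base, list_address, message_id):
    # Assumption: URL layout /archives/list/<list address>/message/<hash>/.
    return f"{base}/archives/list/{list_address}/message/{message_id_hash(message_id)}/"
```

With a real archive, the messages could come from iterating over `mailbox.mbox(path)`. Refusing to answer when more than one message shares a subject and sender is deliberate: long "Re:" threads would otherwise map to the wrong message, and a timestamp comparison would be needed to narrow them down.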

Finding the message based only on the data in the pipermail archive is bound to be inexact. That's the largest hurdle, in my opinion. If you can't get good data to start with, you can't build anything further on top of it. :)

It's not an impossible task, but it's a lot of effort for a questionable return on the investment.

It seems far better to simply replace any links which point to the generic pipermail archive for a list with the hyperkitty URL. Trying to fix links to the old archives gets you very little since the old archive links should still function.

As far as adding a note to the existing archives, there are templates which can do some of that, but I believe that all of those templates require the archives to be regenerated to apply any changes. Doing so will almost surely break many/most of the links to individual messages in the pipermail archives. (One of the many reasons pipermail was replaced was that message links were not stable.)

Maybe someone could run a search and replace on all the files in the now-static pipermail archive to add such header/footer data. That again carries with it some reasonable risk of causing problems if the search/replace isn't perfect -- for little gain.
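For what it's worth, a cautious version of that search and replace might look like the sketch below. It assumes each static page has a single opening body tag (in any letter case), and it uses a marker comment so running it twice does not add a second banner. The function name, marker, and banner wording are all made up for illustration.

```python
import html
import re

BANNER_MARK = "<!-- old-archive-banner -->"
_BODY_RE = re.compile(r"<body\b[^>]*>", re.IGNORECASE)


def inject_banner(page, new_url):
    """Insert a notice right after the opening body tag of an archive page."""
    if BANNER_MARK in page:
        return page          # already processed; keep the operation idempotent
    match = _BODY_RE.search(page)
    if match is None:
        return page          # unexpected layout; leave the file untouched
    safe_url = html.escape(new_url, quote=True)
    banner = (
        f"{BANNER_MARK}<p><b>This is an old archive.</b> "
        f'Current archives: <a href="{safe_url}">{safe_url}</a></p>'
    )
    return page[:match.end()] + "\n" + banner + page[match.end():]
```

Leaving unrecognized pages untouched, rather than guessing where the banner should go, limits the risk mentioned above, although any bulk edit of the archive would still deserve a backup first.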

Users browsing the pipermail archives can follow the links included up to the list info page, which includes a link to the current archives.

It should be possible to apply machine learning to extract data about message similarity. What fields in messages from pipermail and Hyperkitty match exactly? Maybe some timestamps?

The workflow for a particular pipermail redirector is basically: 1. extract fields from the pipermail page at the specified URL, 2. look up those fields to find the message in Hyperkitty, 3. offer the user the choice to go to the Hyperkitty page, 4. register the referer.
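A minimal sketch of how those four steps could hang together is below. The hard part (steps 1 and 2) is passed in as a `resolve` callable, because how to do that reliably is exactly what is in question in this ticket. `handle_request` and its parameters are hypothetical names, and the referral store is just an in-memory counter.

```python
from collections import Counter


def handle_request(old_url, referer, resolve, referrals):
    """Return the links to offer the user and record where the visit came from.

    resolve(old_url) stands in for steps 1 and 2: it should return the new
    Hyperkitty URL, or None when no confident match exists.
    """
    # Step 4: count the (old URL, referer) pair, even when no match is found.
    referrals[(old_url, referer or "(none)")] += 1

    # Steps 1 and 2: delegated to the caller-supplied resolver.
    new_url = resolve(old_url)

    # Step 3: offer a choice instead of redirecting blindly.
    choices = [("Old pipermail page", old_url)]
    if new_url:
        choices.insert(0, ("Hyperkitty page", new_url))
    return choices


referrals = Counter()
links = handle_request(
    "https://lists.fedoraproject.org/pipermail/packaging/",
    "https://pagure.io/packaging-committee/issue/768",
    lambda url: None,   # placeholder resolver that never finds a match
    referrals,
)
```

Offering both links keeps the old page reachable when the lookup is wrong or missing, which matters given how inexact the matching is likely to be.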

I can't speak for the Fedora Infrastructure team, but I'd say that writing an app which uses machine learning to manage these redirects is far out of scope for the project in terms of the time and effort required to create and run such an application. And it's of questionable benefit.

Of course, if you have written such an application and can help get it packaged for Fedora, perhaps it could be worth considering. :)

I looked through all of the fedora-websites repo. All of the references to the pipermail archives are commented out. The only way I can think of for someone to find that page is via a link from an archived email. Since, as @tmz mentioned, changing all of the links in the archives would be high-risk and time-consuming for little benefit, I suggest that we close this issue as "Can't Fix." @kevin @smooge @puiterwijk

For reference, these are the commented-out references I found:

fedoraproject.org/data/content/index.html:13: <!--               <div class="global-notice"><a href="http://lists.fedoraproject.org/pipermail/announce/2010-July/002843.html">${Markup(_('Not getting notified about software updates? &lt;em&gt;Read this notice &amp;rarr;&lt;/em&gt;'))}</a></div> -->
fedoraproject.org/mediawiki/Fedora.deps.php:6:// see http://mail.wikipedia.org/pipermail/wikitech-l/2006-January/033660.html
lib/python2.7/site-packages/setuptools-39.2.0.dist-info/METADATA:61:mailing list <http://mail.python.org/pipermail/distutils-sig/>`_.

Well, I was not advocating changing all the links, as I agree it would be very difficult.

I was just hoping we could change pages like:
https://lists.fedoraproject.org/pipermail/packaging/
to note that they are the old archive (at a glance, it looks like the list stopped accepting new posts in 2015). But I guess you all are right that no one would get there without following some old link.

Metadata Update from @kevin:
- Issue close_status updated to: Will Not/Can Not fix
- Issue status updated to: Closed (was: Open)

6 years ago
