#11401 Badges website is very slow / unresponsive
Closed: Fixed 2 years ago by gui1ty. Opened 2 years ago by gui1ty.

I was trying to look something up regarding Badges. The website is loading very slowly (>3 min). Sometimes it doesn't respond at all. I haven't seen any error messages yet, so I'm not sure whether one of the proxies or the site itself is misbehaving.

https://badges.fedoraproject.org/explore (Started loading before I wrote this. Still not loaded.)

I was connecting through proxy37. On proxy31, in a new tab it appeared to load somewhat faster. @darknao took a look, but for him everything worked well.

In particular looking up a user profile, https://badges.fedoraproject.org/user/naraiank, is very slow on proxy37.


Metadata Update from @phsmoura:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: Needs investigation, low-gain, low-trouble, ops

2 years ago

Can you take a look at it in the inspector?
(control-shift-i and look at the network tab)

Can you see some request(s) that take a long time?

What browser are you using? Do others show the same thing?

Can you take a look at it in the inspector?
(control-shift-i and look at the network tab)

Can you see some request(s) that take a long time?

See attachment. It seems some JavaScript file is not loading at all. It times out after more than 4 minutes. That's when the page finally loads. This attempt was using proxy30.

badges_load_failed.png
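
For anyone who wants to repeat this check without reading a screenshot: the network tab can export a HAR file, and the slow requests can be pulled out of it. This is a minimal sketch, assuming the standard HAR layout (`log.entries[].time` in milliseconds, `log.entries[].request.url`). The function name, the threshold, and the sample data are made up for illustration and are not taken from the actual capture.

```python
import json


def slow_requests(har_text, threshold_ms=10_000):
    """Return (url, elapsed_ms) pairs for HAR entries at or above threshold_ms."""
    har = json.loads(har_text)
    entries = har.get("log", {}).get("entries", [])
    slow = [
        (entry["request"]["url"], entry["time"])
        for entry in entries
        if entry.get("time", 0) >= threshold_ms
    ]
    # Slowest first, so the request that blocks the page ends up on top.
    return sorted(slow, key=lambda pair: pair[1], reverse=True)


# Illustrative sample shaped like a HAR export; the timings are invented.
SAMPLE_HAR = json.dumps({
    "log": {
        "entries": [
            {"request": {"url": "https://badges.fedoraproject.org/explore"},
             "time": 350},
            {"request": {"url": "https://beta.openbadges.org/issuer.js"},
             "time": 240000},
        ]
    }
})

for url, elapsed_ms in slow_requests(SAMPLE_HAR):
    print(f"{elapsed_ms / 1000:.0f}s  {url}")
```

With a real export, pass the file contents instead of `SAMPLE_HAR`.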

What browser are you using? Do others show the same thing?

Brave (v1.52.129) is my default browser. I also tried in Chromium (chromium-freeworld-111.0.5563.64). Same result.

I looked at the request headers for that JavaScript file:

badges_request_headers.png

Trying to open that URL directly results in:

beta.openbadges.org’s DNS address could not be found.

beta.openbadges.org indeed returns neither A nor AAAA records. So, I guess we need to fix that in the Badges frontend. I'm not sure what that snippet is used for, though. The site appears to load and display fine without it, once the timeout is reached.
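
The missing DNS records can be confirmed from a script as well. A minimal sketch using the standard library's `socket.getaddrinfo`, which raises `socket.gaierror` when a name cannot be resolved. The helper name `resolves` and its `resolver` parameter are my own, added only so the lookup can be swapped out when testing.

```python
import socket


def resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves to at least one address (A or AAAA)."""
    try:
        # Port 443 because the script is requested over HTTPS.
        return len(resolver(hostname, 443)) > 0
    except socket.gaierror:
        # Raised when the name has no usable address records.
        return False


if __name__ == "__main__":
    for host in ("beta.openbadges.org", "backpack.openbadges.org"):
        print(host, "resolves" if resolves(host) else "does NOT resolve")
```

The result depends on the DNS state at the time it is run, so no expected output is given here.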

Any update? This issue is making it a pain to look up things on badges or otherwise use the web interface.

@gui1ty, you are probably more acquainted with the existing badges system than me, so tell me: is the JS file mentioned here https://backpack.openbadges.org/issuer.js? If so, we'd want to store it locally in the repository (and update it regularly) rather than having the page request it from a third-party server, which might be causing delays.

@gui1ty, you are probably more acquainted with the existing badges system than me, so tell me: is the JS file mentioned here https://backpack.openbadges.org/issuer.js?

As far as I can tell, it's not required. At least once the page loads, after the request times out eventually, the page looks normal to me. So, I'd suggest removing that request. If it turns out we need it after all, we can hunt it down and capture it in the repository.

Does it happen in firefox also?

openbadges.org was the thing done by mozilla I think, we wanted to make our badges portable so they could be exported there if you wanted. It's not been around in a long time I don't think.

I can't seem to get it to load slowly here. ;(

If you hardcode, say, proxy10 (i.e., just put:

38.145.60.21 badges.fedoraproject.org

in /etc/hosts), does that load quickly/normally?

Does it happen in firefox also?

No. To my surprise it doesn't. The network console has no mention of the issuer.js at all.

In Brave it only happens when I open user profile or badge pages. The front page doesn't seem to request it and loads as quickly as usual.

If you hardcode, say, proxy10 (i.e., just put:

38.145.60.21 badges.fedoraproject.org

in /etc/hosts), does that load quickly/normally?

Well, since the request is from my browser to the external host, it bypasses the Fedora proxies, doesn't it?

I tried it anyway and Brave still loads profile and badge pages slowly waiting on a timeout of the connection to https://beta.openbadges.org/issuer.js

So, I tried 127.0.0.1 beta.openbadges.org in /etc/hosts. That made the request fail in less than a second and pages load quickly now in Brave.
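
To make the workaround concrete, here is a small sketch of how a hosts-style file maps names to addresses. It only illustrates the file format and is not how the system resolver is implemented. `parse_hosts` and the sample text are hypothetical. Presumably the request failed fast because nothing was listening on 127.0.0.1 for that port, so the connection was refused immediately instead of waiting on a DNS timeout.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to the first address listed for it."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Keep the first address seen for a given name.
            mapping.setdefault(name, address)
    return mapping


# Sample content only; not a copy of any real /etc/hosts.
HOSTS_SNIPPET = """\
127.0.0.1   localhost
# Workaround: point the dead host at localhost so the request fails fast
127.0.0.1   beta.openbadges.org
"""

print(parse_hosts(HOSTS_SNIPPET).get("beta.openbadges.org"))
```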

Long story short: I don't know why the issuer.js is requested in Brave, but not in Firefox, but removing it from the Tahrir code will definitely solve the problem.

Strange. ok, shall we just close this now? Or is there anything more we can do here?

There's probably nothing to do for infra. Let me discuss with @t0xic0der tomorrow, what the best way forward is. I found a workaround for myself. But this issue could easily affect others, who may just think the server is slow or not responding without reporting it.

Metadata Update from @t0xic0der:
- Issue assigned to t0xic0der

2 years ago

@kevin, I have assigned this ticket to myself. Let this ticket stay open for now until we are able to fix it - then we can close it as completed.

@gui1ty, you are probably more acquainted with the existing badges system than me, so tell me: is the JS file mentioned here https://backpack.openbadges.org/issuer.js?

I just realized that this is a different URL than the one that's causing (me) trouble. https://backpack.openbadges.org/issuer.js is still valid and works. Not sure where https://beta.openbadges.org/issuer.js comes from. It's not in the Tahrir code.

I looked into it again and requesting the same page on STG doesn't show a request for issuer.js at all, neither from beta nor from backpack. This makes my head spin...

The issuer.js call is defined here:
https://github.com/fedora-infra/tahrir/blob/0.9.2/tahrir/templates/master.mak#L31

The % if logged_in tells me you must be logged in first to see this request.
The python-tahrir package is 5 years old (as is the last release, 0.9.2) and most likely doesn't have the changes from the develop branch.
You'll need a new release from upstream first, and a package rebuilt to get this fixed.

The issuer.js call is defined here:
https://github.com/fedora-infra/tahrir/blob/0.9.2/tahrir/templates/master.mak#L31

This is not the broken JS URL. If you look at my screenshots above, you'll see that the request is made to beta, not backpack. What you are pointing at is the same thing @t0xic0der pointed to.

@darknao Thanks for being my rubber duck! 🦆

The broken URL is in 0.8.2, which is what we have deployed atm. I'll try a bugfix release.

I provided a PR: https://github.com/fedora-infra/tahrir/pull/464

Should be a simple enough fix. Of course, next comes the question of: "To release or not to release?". python-tahrir package could probably pull from GitHub directly.

The issuer.js call is defined here:
https://github.com/fedora-infra/tahrir/blob/0.9.2/tahrir/templates/master.mak#L31

This is not the broken JS URL. If you look at my screenshots above, you'll see that the request is made to beta, not backpack. What you are pointing at is the same thing @t0xic0der pointed to.

I do see the beta URL on this link though:

<script src="//beta.openbadges.org/issuer.js"></script>

Or did I misunderstand what you're looking for?

And that URL was fixed after 0.9.2 here:
https://github.com/fedora-infra/tahrir/commit/769325b10b5cef6da81ff680f9072862eedcd7d3

It seems I have been busy with too many different things at the same time, impairing my ability to notice small differences. Let's try to clear things up.

You are right, version 0.9.2 still contains the outdated URL. I was grepping in the development branch and that didn't have it. Then I looked at dist-git and saw that the stable version for el7, which is what is running on badges-web01, is 0.8.2 (version 0.9.0 is in testing). So, I assumed that 0.8.2 was running in production. That's the version I went looking for in GitHub. I saw it still had the outdated link and made a patch for that version.

I just looked on the machine itself and both prod and stg are running version 0.9.2. I have got no idea how that version ended up on the servers. But I guess there's some reason why dist-git is behind on what we have on the servers. @kevin might know.

Since we are already on version 0.9.2 in production, it's probably best to close my PR for 0.8.2 and open a new one for 0.9.2. We could also release from what is currently in the develop branch. But that requires looking through the changes carefully. We don't want to introduce any new bugs or regressions into an already brittle system.

Applying a one-line hotfix to 0.9.2 has my preference.
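
To illustrate how small such a hotfix would be, here is a sketch of the substitution in Python. The replacement URL is an assumption: it is the backpack URL mentioned earlier in this ticket as still working, not necessarily what the upstream commit or the final patch uses. `patch_template` is a hypothetical helper, not part of Tahrir.

```python
# The script source found in the 0.9.2 template (quoted earlier in this ticket).
OLD_SRC = "//beta.openbadges.org/issuer.js"
# Assumed replacement: the URL reported as still valid in this ticket.
NEW_SRC = "https://backpack.openbadges.org/issuer.js"


def patch_template(text):
    """Swap the dead issuer.js source for the assumed working one."""
    return text.replace(OLD_SRC, NEW_SRC)


before = '<script src="//beta.openbadges.org/issuer.js"></script>'
after = patch_template(before)
print(after)
```

In practice the same change would be carried as a patch against the template file rather than applied at runtime.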

We have dedicated fedora-infra repositories based on koji *-infra tags like, in this case, epel7-infra.
python-tahrir 0.9.2 was built against that tag here. This is why you'll find that version deployed on badges without it being available on the public epel7 repo.

Also, it may be possible to add that patch directly in the spec file and rebuild while waiting for a proper upstream release, if that ever happens.

Also, it may be possible to add that patch directly in the spec file and rebuild while waiting for a proper upstream release, if that ever happens.

That sounds like a good idea. :thumbsup:

Since badges is currently being rewritten from the ground up, I doubt very much that there will be another meaningful release. I do have access to the GitHub repo, but I'm not sure how the release process works and who I'd need to poke to push a new release to PyPI.

So, if that patch could be applied directly in the spec file, I think that's the approach with the lowest risk and the least amount of work.

@gui1ty Badges has just been updated with the mentioned patch.
Could you try again and see if you're still experiencing slow load times?

I removed my workaround and pages appear to be loading quickly now. The issuer.js file was loaded in 160ms. No longer failing. Looks good. Thank you @darknao!

Metadata Update from @gui1ty:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

2 years ago

Metadata Update from @t0xic0der:
- Issue assigned to darknao (was: t0xic0der)

2 years ago
