#11682 id-fedoraproject-org-throws-a-504-gateway-timeout
Closed: Fixed with Explanation a year ago by zlopez. Opened a year ago by ilikelinux.

NOTE

If your issue is for security or deals with sensitive info please
mark it as private using the checkbox below.

Describe what you would like us to do:

I was trying to connect to the official wiki yesterday. I asked on Discourse and was told it would be better to open a ticket here. Is there a way to check whether the system is having trouble, or is just reporting it the right thing to do?

https://discussion.fedoraproject.org/t/id-fedoraproject-org-throws-a-504-gateway-timeout/99219
You might also answer on Discourse so we have an answer there. Thanks.

When do you need this to be done by? (ASAP)

ASAP


Is this happening 100% of the time? or sporadic?

Is it:

  1. Go to wiki, click login button, get login page from id, enter info, click submit and get 504?

or

  1. Go to wiki, click login, get 504 from id right then?

or something else?

  1. Go to wiki, click login button, get login page from id, enter info, click submit and get 504?
    Yes. I wanted to make some changes to the lightdm wiki page and was asked to log in. I got redirected to id.fp.o and got the timeout.

Gateway Timeout:
The gateway did not receive a timely response from the upstream server or application.
23:15:15 UTC
Thursday, December 14, 2023

Metadata Update from @phsmoura:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: Needs investigation, low-gain, low-trouble, ops

a year ago

Metadata Update from @abompard:
- Issue assigned to abompard

a year ago

It should be OK now. In the logs I had:

pam_sss(ipsilon:auth): received for user abompard: 4 (System error)

I had to restart sssd on ipsilon01. No idea why it wasn't responding to pam_sss, but now I can log in.
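For anyone wanting to spot this failure in the logs faster next time, here is a minimal Python sketch that pulls out pam_sss lines with a non-zero return code. The regular expression is modeled only on the single log line quoted above, and the function name `find_pam_sss_errors` is made up for illustration; other pam_sss message formats are not covered.

```python
import re

# Pattern modeled on the one log line quoted in this ticket (an assumption);
# it will not match other pam_sss message formats.
PAM_SSS_RE = re.compile(
    r"pam_sss\((?P<service>[^:)]+):(?P<phase>[^)]+)\): "
    r"received for user (?P<user>\S+): (?P<code>\d+) \((?P<message>[^)]*)\)"
)


def find_pam_sss_errors(lines):
    """Return one dict per matching line whose PAM return code is non-zero."""
    errors = []
    for line in lines:
        match = PAM_SSS_RE.search(line)
        if match and int(match.group("code")) != 0:
            errors.append(match.groupdict())
    return errors


sample = [
    "pam_sss(ipsilon:auth): received for user abompard: 4 (System error)",
    "some unrelated log line",
]
for err in find_pam_sss_errors(sample):
    print(err["service"], err["phase"], err["user"], err["code"], err["message"])
```

Feeding it the lines of a journal or log file export would list which service and user hit the error, which may help tell an sssd problem apart from a bad password.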

Let me try that as well

Maybe it just needed some time; I was able to log in now.

Closing the ticket

Metadata Update from @zlopez:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

a year ago

Actually I'm unable to reach ipsilon02. It responds to ping but I can't SSH into it. That may be why there are still 502 errors sometimes, when haproxy sends to ipsilon02.
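To illustrate why a single bad backend would show up as sporadic rather than constant errors, here is a toy Python simulation. It assumes strict round-robin between the two hosts and no health check taking the dead one out of rotation; the real haproxy configuration is not shown in this ticket, so treat both as assumptions.

```python
from itertools import cycle


def simulate_round_robin(backends, healthy, requests):
    """Return (backend, ok) pairs for a number of simulated requests.

    Assumes strict round-robin with no health checks removing dead
    backends. This is a simplification for illustration only.
    """
    rotation = cycle(backends)
    picked = (next(rotation) for _ in range(requests))
    return [(backend, backend in healthy) for backend in picked]


results = simulate_round_robin(["ipsilon01", "ipsilon02"], {"ipsilon01"}, 10)
failures = sum(1 for _, ok in results if not ok)
print(f"{failures} of {len(results)} simulated requests failed")
```

Under those assumptions every other request fails, which would look like "retry and it works" from the user's side.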

Metadata Update from @abompard:
- Issue status updated to: Open (was: Closed)

a year ago

Load seems to be around 1 since this morning. Could it be power-cycled maybe?

Metadata Update from @abompard:
- Assignee reset

a year ago

I can't SSH to it either, but if it were offline we should get a Nagios alert.

Oh yeah it does respond to ping. Looks like we don't check for SSH access in Nagios though: https://nagios.fedoraproject.org/nagios/cgi-bin//status.cgi?host=ipsilon02.iad2.fedoraproject.org
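As a rough idea of what an SSH check would add over ping, here is a small Python sketch that connects to port 22 and waits for the SSH banner. It only illustrates the idea and is not how the Fedora Nagios instance is configured; the function name `check_ssh` and the defaults are assumptions. The return values follow the usual Nagios plugin convention (0 for OK, 2 for CRITICAL).

```python
import socket

# Nagios plugin exit-code convention: 0 = OK, 2 = CRITICAL.
OK, CRITICAL = 0, 2


def check_ssh(host, port=22, timeout=10.0):
    """Return (status, message) after trying to read the SSH banner."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            banner = sock.recv(256).decode("ascii", errors="replace").strip()
    except OSError as exc:
        # Covers refused connections, unreachable hosts and timeouts.
        return CRITICAL, f"SSH CRITICAL - {host}:{port} unreachable: {exc}"
    if banner.startswith("SSH-"):
        return OK, f"SSH OK - {banner}"
    return CRITICAL, f"SSH CRITICAL - unexpected banner: {banner!r}"
```

Reading the banner rather than only opening the connection matters here: a host that still answers ping, and possibly accepts TCP connections, but never sends a banner would be reported as CRITICAL once the timeout expires. A wrapper script would print the message and exit with the status.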

@darknao restarted ipsilon02 from vmhost-x86-06 and this should resolve the issue.

Metadata Update from @zlopez:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

a year ago

Hi, this is still happening. Whenever I try to log in to discussions.fedoraproject.org, it takes me several retries before I succeed. It is sporadic, but most of the time it fails with a 502. I just retry and eventually it gets through, but the general reliability problem described here still seems to be around, and in my case it fails more often than not.

I don't think this is "still happening" except this morning. We had a misbehaving proxy and auth server. Are you still seeing any issues?

@kevin This is still happening. It was happening yesterday and it is happening today as of this moment: it took me three tries to even log in to pagure.io to comment on this.

The first try was a 504 after a long timeout

screenshot-2024-11-12_08-14-30.png

Then I got an unexpected unauthorized

screenshot-2024-11-12_08-15-19.png

And the third retry let me in.

Do you have issues with anything other than pagure.io? pagure.io is sometimes under heavy load, so it could just be a coincidence that you tried it at that time.

I didn't have any issues with pagure.io once logged in. I only had issues signing in via id.fedoraproject.org to both discussions.f.o and pagure.io. It seemed to be the same issue in both cases: auth on id.fedoraproject.org timing out (504) in the middle of authentication, before redirecting back to pagure or discussions. See the 504 page screenshot, which shows the id.fedoraproject.org URL, not pagure.

But it's hit and miss. I had the issue yesterday when I first commented, and again today when I commented. I just logged in to pagure to comment and had no issues right now. The failures are intermittent, so I cannot reproduce them at will to investigate further.
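Since this cannot be reproduced at will, one option is to poll the endpoint for a while and tally what comes back. Below is a minimal Python sketch of such a probe. The function names, the default timeout and the URL in the usage note are assumptions, and it only fetches a single page rather than walking through the full login flow, so it may not trigger the same failure.

```python
import time
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=30.0):
    """Return (HTTP status or None on network failure, elapsed seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # 4xx/5xx responses, e.g. 502 or 504
    except OSError:
        status = None  # DNS failure, refused connection, timeout
    return status, time.monotonic() - start


def probe(url, attempts, delay=0.0, fetch=fetch_status):
    """Call `fetch` repeatedly and tally the status codes it returns."""
    tally, slowest = Counter(), 0.0
    for i in range(attempts):
        status, elapsed = fetch(url)
        tally[status] += 1
        slowest = max(slowest, elapsed)
        if delay and i < attempts - 1:
            time.sleep(delay)
    return tally, slowest
```

Something like `probe("https://id.fedoraproject.org/", attempts=20, delay=30)` would then give a rough count of 200s versus 502s and 504s plus the slowest response time, which could be attached to the ticket along with the time window.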

@zlopez another failure just now when trying to log in to bugzilla.redhat.com. Shall we reopen this ticket to investigate why this is so unreliable? Or should I create a new issue for this?

screenshot-2024-11-12_16-09-19.png

Or I'm fine if this is "intended" and there is no bandwidth to investigate it on your side. It seems it eventually works and only affects "Login by Fedora". I am not using the Fedora sites that often, and if this is just intermittent I can retry a few times and wait a bit if needed. Not the end of the world.

No, this is not intended. We just are not seeing other reports and are not able to reproduce it ourselves, so I'm not sure what's going on here.

I'll try digging some more.

Understood. Let me know if there is any debugging I can help with. Sometimes it does not fail but just takes a very long time, like now from my mobile (but same network).

