I was trying to connect to the official Wiki yesterday. I asked on Discourse and was told it would be better to file a ticket here. Is there a way to check whether the system is having trouble, or is reporting it the right thing to do?
https://discussion.fedoraproject.org/t/id-fedoraproject-org-throws-a-504-gateway-timeout/99219 Please also answer on Discourse so that there is an answer there too. Thanks.
ASAP
Is this happening 100% of the time? or sporadic?
Is it:
or
or something else?
Gateway Timeout: The gateway did not receive a timely response from the upstream server or application. 23:15:15 UTC Thursday, December 14, 2023
Metadata Update from @phsmoura: - Issue priority set to: Waiting on Assignee (was: Needs Review) - Issue tagged with: Needs investigation, low-gain, low-trouble, ops
Metadata Update from @abompard: - Issue assigned to abompard
I saw a similar issue here: https://pagure.io/fedora-infrastructure/issue/11671
Experienced the same when trying to login to https://communityblog.fedoraproject.org/
It should be OK now. In the logs I had:
pam_sss(ipsilon:auth): received for user abompard: 4 (System error)
I had to restart sssd on ipsilon01, no idea why it wasn't responding to pam_sss, but now I can login.
Let me try that as well
I'm still getting 504 when trying to login to https://communityblog.fedoraproject.org/
Maybe it just needed some time; I was able to log in now.
Closing the ticket
Metadata Update from @zlopez: - Issue close_status updated to: Fixed with Explanation - Issue status updated to: Closed (was: Open)
Actually, I'm unable to reach ipsilon02. It responds to ping but I can't SSH into it. That may be why there are still 502 errors sometimes, when haproxy sends requests to ipsilon02.
Metadata Update from @abompard: - Issue status updated to: Open (was: Closed)
Load seems to be around 1 since this morning. Could it be power-cycled maybe?
Metadata Update from @abompard: - Assignee reset
I can't SSH to it either, but if it were offline we should have gotten a Nagios alert.
Oh yeah it does respond to ping. Looks like we don't check for SSH access in Nagios though: https://nagios.fedoraproject.org/nagios/cgi-bin//status.cgi?host=ipsilon02.iad2.fedoraproject.org
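To illustrate the gap noted above (a host that answers ping but not SSH), here is a minimal sketch of a check that goes one step further than ping: it opens a TCP connection and waits for the SSH identification banner. This is not the Nagios configuration used by Fedora Infrastructure; the function name `ssh_banner_ok`, the default port, and the timeout are assumptions made for illustration.

```python
import socket


def ssh_banner_ok(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if the host accepts a TCP connection and sends an SSH banner.

    A host can answer ICMP ping while sshd (or the whole userland) is hung,
    so this checks that the service itself responds.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            banner = sock.recv(64)  # SSH servers send "SSH-2.0-..." first
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
    return banner.startswith(b"SSH-")


# Example call (hostname taken from the Nagios link above):
#   ssh_banner_ok("ipsilon02.iad2.fedoraproject.org")
```

A hung machine typically either never completes the connection or accepts it without sending a banner; both cases return False here after the timeout.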
@darknao restarted ipsilon02 from vmhost-x86-06 and this should resolve the issue.
Hi, this is still happening. Whenever I try to log in to discussions.fedoraproject.org, it takes me several retries before I succeed. It is sporadic, but most of the time it fails with a 502. I just retry and eventually it gets through, but the general reliability problem described here still seems to be around; at least in my case, it fails more often than not.
I don't think this is "still happening" except this morning. We had a misbehaving proxy and auth server. Are you still seeing any issues?
@kevin This is still happening. It was happening yesterday and it is happening today as of this moment: it took me three tries just to log in to pagure.io to comment on this.
The first try was a 504 after a long timeout
[Screenshot: screenshot-2024-11-12_08-14-30.png]
Then I got an unexpected unauthorized
[Screenshot: screenshot-2024-11-12_08-15-19.png]
And the third retry let me in.
Do you have issues with anything other than pagure.io? pagure.io is sometimes under heavy load, so it could be just a coincidence that you tried it at that time.
I didn't have any issues with pagure.io once logged in. I only had issues signing in via id.fedoraproject.org to both discussions.f.o and pagure.io. It seemed to be the same issue in both cases: auth on id.fedoraproject.org timing out (504) in the middle of authentication, before even redirecting to pagure or discussions. See the 504 page screenshot, which shows the id.fedoraproject.org URL, not pagure.
But it's hit and miss. I had that issue yesterday when I first commented, and I had it again today when I commented. I just logged into pagure to comment and had no issues right now. The failures are intermittent, so I cannot reproduce them at will to investigate further.
@zlopez another failure just now when trying to log in to bugzilla.redhat.com. Shall we reopen this ticket to investigate why this is so unreliable? Or should I create a new issue for this?
[Screenshot: screenshot-2024-11-12_16-09-19.png]
Or I'm fine if this is "intended" and there is no bandwidth to investigate it on your side. It seems it eventually works and only affects "Login by Fedora". I don't use the Fedora sites that often, and if this is just intermittent I can retry a few times and wait a bit if needed; it's not the end of the world.
No, this is not intended. We just aren't seeing other reports and aren't able to reproduce it ourselves, so I'm not sure what's going on here.
I'll try digging some more.
Understood. Let me know if there is any debugging I can help with. Sometimes it does not fail but just takes a very long time, like now from my mobile (on the same network).
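Since the failures are intermittent and hard to reproduce at will, one way to collect data is to probe the endpoint repeatedly and tally status codes and response times. The following is only a sketch under stated assumptions: the helper names (`fetch_status`, `probe`, `summarize`), the attempt count, and the delay are hypothetical, and probing the front page of id.fedoraproject.org does not exercise the full login flow that fails in the reports above.

```python
import time
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=30.0):
    """Fetch url once; return (status, seconds). status is an HTTP code or an error label."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # e.g. 502 or 504 returned by the proxy
    except (urllib.error.URLError, OSError) as exc:
        status = f"error: {exc.__class__.__name__}"  # DNS failure, timeout, reset
    return status, time.monotonic() - start


def probe(url, attempts=20, delay=5.0, fetch=fetch_status):
    """Call fetch(url) `attempts` times, pausing `delay` seconds between calls."""
    results = []
    for i in range(attempts):
        status, elapsed = fetch(url)
        results.append((status, elapsed))
        print(f"{time.strftime('%H:%M:%S')} {status} {elapsed:.1f}s")
        if i < attempts - 1:
            time.sleep(delay)
    return results


def summarize(results):
    """Return (Counter of statuses, slowest response time in seconds)."""
    counts = Counter(status for status, _ in results)
    slowest = max((elapsed for _, elapsed in results), default=0.0)
    return counts, slowest


# Example call (URL host taken from this ticket; the path is an assumption):
#   counts, slowest = summarize(probe("https://id.fedoraproject.org/"))
```

Timestamps in the printed lines make it possible to match failures against server-side logs, which may help when the problem cannot be reproduced on the infrastructure side.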