Today I'm getting 504 Gateway Timeout errors when trying to log in to different Fedora services. I also noticed plenty of http-accounts errors in Nagios.
asap
The URL that is getting the timeout is https://id.fedoraproject.org/login/pam
In the sssd log on ipsilon01 I found plenty of:
(2024-04-19 11:39:44): [pam] [cache_req_common_process_dp_reply] (0x3f7c0): [CID#5330] CR #36198: Could not get account info [1432158212]: SSSD is offline
But when I check the sssd service, it seems to be running.
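To see when sssd considered itself offline, the log entries above can be counted per minute. This is only a rough sketch: it assumes every relevant line starts with a `(YYYY-MM-DD HH:MM:SS):` timestamp and ends with `SSSD is offline`, as in the line quoted above, and the function name `offline_per_minute` is made up for illustration. The log path is passed on the command line because the exact file location on ipsilon01 is not stated in this ticket.

```python
import re
import sys
from collections import Counter

# Matches lines shaped like the one quoted in this ticket:
# (2024-04-19 11:39:44): [pam] [...] ...: SSSD is offline
# Assumption: the timestamp format is the same for all entries.
OFFLINE_RE = re.compile(
    r"^\((?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\): .*SSSD is offline\s*$"
)


def offline_per_minute(lines):
    """Count 'SSSD is offline' entries, keyed by 'YYYY-MM-DD HH:MM'."""
    counts = Counter()
    for line in lines:
        match = OFFLINE_RE.match(line)
        if match:
            counts[match.group("minute")] += 1
    return counts


# Only runs when a log path is given, e.g.: python sssd_offline.py <path-to-log>
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for minute, count in sorted(offline_per_minute(handle).items()):
            print(f"{minute}  {count}")
```

A gap in the output would suggest sssd was online during that period, which could then be compared against the times of the 504 responses.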
Found this in httpd error_log on ipsilon01 when tracking the transaction_id:
[Fri Apr 19 12:07:35.906225 2024] [wsgi:error] [pid 2370894:tid 2371055] [client 10.3.163.74:45548] Timeout when reading response headers from daemon process 'ipsilon': /usr/libexec/ipsilon/ipsilon, referer: https://ipa01.iad2.fedoraproject.org/login/gssapi/negotiate?ipsilon_transaction_id=18c74230-a5f3-4b08-8fec-7fc3bc047ab3
Metadata Update from @zlopez: - Issue untagged with: high-gain - Issue tagged with: low-gain
Also seeing this repeatedly. It's not letting me log in to Fedora services here, so this is a little urgent I guess.
Metadata Update from @zlopez: - Issue untagged with: low-gain - Issue tagged with: high-gain
There was an update of the access switches in the IAD2 datacenter, which finished at 13:18 UTC, but the issue persists, so this probably wasn't the root cause.
I also noticed that login requests sometimes finish correctly, but in most cases I'm getting a 504.
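To put a number on "in most cases", something like the following could be used to hit the login URL a few times and tally the response codes. It is a sketch under assumptions: a plain GET does not actually log in, so it only shows whether the front end answers or times out, and the helper names `fetch_status` and `tally`, the 60 second timeout, and the one second pause are arbitrary choices of mine.

```python
import sys
import time
import urllib.error
import urllib.request
from collections import Counter

# URL reported in this ticket as returning 504.
LOGIN_URL = "https://id.fedoraproject.org/login/pam"


def fetch_status(url, timeout=60.0):
    """Return the HTTP status of one GET, or 'error' on a network failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses (including 504) are raised as HTTPError.
        return exc.code
    except OSError:
        # Covers URLError, connection failures and client-side timeouts.
        return "error"


def tally(fetch, url, attempts, pause=0.0):
    """Call fetch(url) `attempts` times and count each distinct result."""
    results = Counter()
    for _ in range(attempts):
        results[fetch(url)] += 1
        time.sleep(pause)
    return results


# Only runs when an attempt count is given, e.g.: python probe.py 20
if __name__ == "__main__" and len(sys.argv) == 2:
    outcome = tally(fetch_status, LOGIN_URL, int(sys.argv[1]), pause=1.0)
    for status, count in outcome.most_common():
        print(f"{status}: {count}")
```

The pause keeps the probe from adding noticeable load to a service that is already struggling.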
This is causing my new-repository requests to be closed as invalid because the releng bot cannot find me in FAS: https://pagure.io/releng/fedora-scm-requests/issue/61861
Can anyone who was seeing this please check again now and see if anything like it is still happening?
If it is, please list the app you were trying to log in to/auth against and the time/date.
I can log in to copr now using FAS.
I think everything is back to normal now.
I am still not fully sure what the cause was; it seemed to be several issues at once. The database server for accounts was under heavy load, and the IPA cluster seemed to be in an odd state.
Please report any further issues...
Metadata Update from @kevin: - Issue close_status updated to: Fixed with Explanation - Issue status updated to: Closed (was: Open)