Zincati on Fedora CoreOS has started to behave very flakily.
It reports 503 errors like:
Jun 05 10:39:27 rs185 zincati[4105]: error: While pulling 72cf2f80ba1496d478e110d03e1199d9d21382840e96ffeddf4303eb040fbb55: Server returned HTTP 503
Digging deeper, the error comes from rpm-ostree:
Jun 05 10:39:21 rs185 rpm-ostree[3686]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/41/5200e439bfc968bb9a26b74e29b0bc7c87c119db9f9749a25172c82afee61a.filez>: Server returned HTTP 503
Jun 05 10:39:23 rs185 rpm-ostree[3686]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/99/601d82918ce51d6267d837cdaba5df5c5448cecf08cd0de61d601a7d27120a.filez>: Server returned HTTP 503
Jun 05 10:39:25 rs185 rpm-ostree[3686]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/db/e3e23f2e97dfac8f3a518e66249d58dea1dd3e102cdfeeb4d34367ec9e1020.filez>: Server returned HTTP 503
Jun 05 10:39:25 rs185 rpm-ostree[3686]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/56/dff8ea8710640a29f53b8de39f8bc11b0ef6559685601b2f8f5679d676f8d0.filez>: Server returned HTTP 503
Jun 05 10:39:27 rs185 rpm-ostree[3686]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez>: Server returned HTTP 503
Jun 05 10:39:27 rs185 rpm-ostree[3686]: Txn Deploy on /org/projectatomic/rpmostree1/fedora_coreos failed: While pulling 72cf2f80ba1496d478e110d03e1199d9d21382840e96ffeddf4303eb040fbb55: Server returned HTTP 503
Fetching those URLs with curl works:
curl -v https://d2uk5hbyrobdzx.cloudfront.net/objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez
* Host d2uk5hbyrobdzx.cloudfront.net:443 was resolved.
* IPv6: (none)
* IPv4: 18.244.20.208, 18.244.20.19, 18.244.20.87, 18.244.20.212
* Trying 18.244.20.208:443...
* Connected to d2uk5hbyrobdzx.cloudfront.net (18.244.20.208) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
* CApath: none
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
* subject: CN=*.cloudfront.net
* start date: Oct 10 00:00:00 2023 GMT
* expire date: Sep 19 23:59:59 2024 GMT
* subjectAltName: host "d2uk5hbyrobdzx.cloudfront.net" matched cert's "*.cloudfront.net"
* issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
* SSL certificate verify ok.
* Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* Certificate level 2: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://d2uk5hbyrobdzx.cloudfront.net/objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: d2uk5hbyrobdzx.cloudfront.net]
* [HTTP/2] [1] [:path: /objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez]
* [HTTP/2] [1] [user-agent: curl/8.6.0]
* [HTTP/2] [1] [accept: */*]
> GET /objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez HTTP/2
> Host: d2uk5hbyrobdzx.cloudfront.net
> User-Agent: curl/8.6.0
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 200
< content-length: 15160
< server: Apache
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-frame-options: SAMEORIGIN
< x-xss-protection: 1; mode=block
< x-content-type-options: nosniff
< referrer-policy: same-origin
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< last-modified: Mon, 29 Apr 2024 04:50:48 GMT
< apptime: D=3711
< x-fedora-appserver: kojipkgs02.iad2.fedoraproject.org
< x-varnish: 299826652
< via: 1.1 kojipkgs02.iad2.fedoraproject.org (Varnish/7.3), 1.1 93f1c701362eb59a676baaac7ea81bd8.cloudfront.net (CloudFront)
< accept-ranges: bytes
< x-fedora-proxyserver: proxy01.iad2.fedoraproject.org
< x-fedora-requestid: ZlqsG5WluXd4lnAadIKB5gAAA44
< date: Thu, 06 Jun 2024 05:08:17 GMT
< x-cache: Hit from cloudfront
< x-amz-cf-pop: FRA56-P11
< x-amz-cf-id: bzydx-VPNTc2ypEluova1rfZ8VJVAbXq28_nBilZFIe3haM2hmT6lA==
< age: 11394
<
Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.
* Failure writing output to destination
* Connection #0 to host d2uk5hbyrobdzx.cloudfront.net left intact
I find it interesting that rpm-ostree complains about a 503 for a different URL on each run. Is this by design? If not, this pattern might be hard on the CDN.
~ 2 months?
Workaround: restart zincati in a loop; it works every other time:
#!/bin/bash
# Workaround: restart zincati whenever its journal shows an HTTP 503.

# Restart the zincati service and report whether the restart succeeded.
restart_service() {
    echo "HTTP 503 error detected. Restarting zincati.service..."
    if systemctl restart zincati.service; then
        echo "zincati.service restarted successfully."
    else
        echo "Failed to restart zincati.service."
    fi
}

# Follow the journal for zincati.service and react to every 503 line.
journalctl -fu zincati.service | while read -r line; do
    if echo "$line" | grep -q "Server returned HTTP 503"; then
        restart_service
    fi
done
Metadata Update from @zlopez: - Issue priority set to: Waiting on Assignee (was: Needs Review) - Issue tagged with: Needs investigation
What version of rpm-ostree do you have there?
FCOS 40.20240504.3.0 and 40.20240416.3.1; this is rpm-ostree 2024.5.
OK, ostree 2024.6 has a fix to retry on 5xx errors (retrying was always intended, but it wasn't working properly). Is there any way for you to test 2024.6?
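For anyone following along, retrying on 5xx roughly means the following. This is only a shell sketch of the idea, not how libostree implements it: `http_status` and `fetch_with_retry` are made-up helper names, and the attempt count and delays are arbitrary assumptions.

```shell
#!/bin/bash
# Hypothetical helper: print only the HTTP status code for a URL.
# curl prints 000 if no HTTP response was received at all.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Retry while the server answers 5xx, doubling the delay between attempts.
# Prints the first non-5xx status and returns 0; returns 1 if every
# attempt got a 5xx. Arguments: URL, max attempts (default 5),
# initial delay in seconds (default 1).
fetch_with_retry() {
    local url="$1" attempts="${2:-5}" delay="${3:-1}" i=1 status
    while [ "$i" -le "$attempts" ]; do
        status=$(http_status "$url")
        case "$status" in
            5??)
                echo "attempt $i: HTTP $status" >&2
                # Only wait if another attempt will follow.
                if [ "$i" -lt "$attempts" ]; then
                    sleep "$delay"
                    delay=$((delay * 2))
                fi
                ;;
            *)
                echo "$status"
                return 0
                ;;
        esac
        i=$((i + 1))
    done
    return 1
}
```

Something like `fetch_with_retry "https://d2uk5hbyrobdzx.cloudfront.net/objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez"` would then ride out an occasional 503 instead of failing the whole pull.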
Also, I looked at the CloudFront stats. There's been a pretty constant 0.02% error rate. I am not sure why it's hitting you so often; it's a really small percentage and it has seemingly always been there. The change we made to haproxy a while back didn't seem to change this at all.
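A back-of-the-envelope calculation, for what it's worth: if each object request fails independently with probability p and nothing retries, a pull of n objects hits at least one error with probability 1 - (1 - p)^n. The 0.02% is the rate quoted above; the object counts are made-up examples, and independence between requests is an assumption.

```shell
#!/bin/bash
# p_fail P N: probability that at least one of N independent requests
# fails, given a per-request failure probability P.
p_fail() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.3f\n", 1 - (1 - p) ^ n }'
}

# 0.0002 = 0.02% per-request error rate; object counts are hypothetical.
for n in 100 1000 5000; do
    echo "objects=$n  P(at least one 5xx)=$(p_fail 0.0002 "$n")"
done
```

Under those assumptions this comes out at roughly 2%, 18% and 63%. So a small per-request error rate could still fail a large pull fairly often when there are no retries, and each run would fail on a different object, which would be consistent with a different URL showing up each time. It does not by itself explain a success rate as low as 10%.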
@stkoelle so let's see if the problem persists once you get onto a version of FCOS with rpm-ostree 2024.6?
I think there is another underlying problem or bug in how rpm-ostree downloads. I will keep monitoring to pinpoint it better. My success rate is around 10%.
Metadata Update from @zlopez: - Issue priority set to: Waiting on Reporter (was: Waiting on Assignee)
I am still getting many 503s right now:
Jun 20 09:21:33 rs184 zincati[6632]: [ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Jun 20 09:21:33 rs184 zincati[6632]: error: While pulling a65ed051ae3c7ae658f19bee19ff36be19723070282305382890a793904f6f5e: Server returned HTTP 503
The CloudFront stats somehow don't seem to capture these.
Yeah, it's very odd. Are you still seeing it now?
Can you do a traceroute / ping to the CF endpoint and see if it's something between you and it?
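Besides traceroute and ping, it might also help to request the same object from each CloudFront edge IP in turn, to see whether one particular edge is returning the errors. A minimal sketch, assuming `getent` and `curl` are available on the host; the function names are made up for illustration.

```shell
#!/bin/bash
# List the distinct IPv4 addresses a host currently resolves to.
resolve_ips() {
    getent ahostsv4 "$1" | awk '{ print $1 }' | sort -u
}

# Request one URL path from one specific edge IP (pinned with curl's
# --resolve) and print the HTTP status code.
probe_ip() {
    local host="$1" ip="$2" path="$3"
    curl -s -o /dev/null -w '%{http_code}' \
        --resolve "$host:443:$ip" "https://$host$path"
}

# Print "<ip> <status>" for every address the host resolves to.
probe_edges() {
    local host="$1" path="$2" ip
    for ip in $(resolve_ips "$host"); do
        echo "$ip $(probe_ip "$host" "$ip" "$path")"
    done
}

# Example (path taken from the logs earlier in this ticket):
# probe_edges d2uk5hbyrobdzx.cloudfront.net \
#     /objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez
```

Running it a few times in a row would show whether the 5xx responses follow one IP or are spread across all of them.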
We still have a lot of them; I will do the traces.
<img alt="2024-07-24_17-14.png" src="/fedora-infrastructure/issue/raw/files/6e2060f1ded99782f140cbfe99327063eac898e015b39595336280e4a6fe6d92-2024-07-24_17-14.png" />
Any news here?
For now the problem is gone.
Metadata Update from @stkoelle: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)
The problem is still here:
rned HTTP 503
Sep 12 12:11:17 bitbucketrunner1 rpm-ostree[1119979]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/12/ad2630f0905778084a499e5cc7a0d11db16484779374599cb5d9371578fe27.dirtree>: Server returned HTTP 503
Sep 12 12:11:17 bitbucketrunner1 rpm-ostree[1119979]: Txn Deploy on /org/projectatomic/rpmostree1/fedora_coreos failed: While pulling 759c112d3a3d1f762ba368106411fcc4edf8b1c39323aca99269741c88e6a597: Server returned HTTP 503
The tracepath output confuses me; ping looks OK:
tracepath d2uk5hbyrobdzx.cloudfront.net
 1?: [LOCALHOST] pmtu 1500
 1: ??? 0.389ms
 1: ??? 0.400ms
 2: core22.fsn1.hetzner.com 1.108ms
 3: hos-tr4.ex3k9.dc4.fsn1.hetzner.com 4.888ms
 4: amazon.hetzner.com 4.933ms asymm 5
 5: no reply
 6: no reply
 7: no reply
 8: no reply
 9: no reply
10: no reply
...
25: no reply
26: no reply
27: no reply
28: no reply
29: no reply
30: no reply
 Too many hops: pmtu 1500
 Resume: pmtu 1500
root@bitbucketrunner1 /var/home/koelle # ping d2uk5hbyrobdzx.cloudfront.net
PING d2uk5hbyrobdzx.cloudfront.net (18.244.20.19) 56(84) bytes of data.
64 bytes from server-18-244-20-19.fra56.r.cloudfront.net (18.244.20.19): icmp_seq=1 ttl=250 time=5.22 ms
64 bytes from server-18-244-20-19.fra56.r.cloudfront.net (18.244.20.19): icmp_seq=2 ttl=250 time=5.17 ms
--- d2uk5hbyrobdzx.cloudfront.net ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 5.173/5.197/5.222/0.024 ms
root@bitbucketrunner1 /var/home/koelle # tracepath server-18-244-20-19.fra56.r.cloudfront.net
 1?: [LOCALHOST] pmtu 1500
 1: ??? 0.485ms
 1: ??? 0.368ms
 2: core24.fsn1.hetzner.com 27.966ms
 3: core1.fra.hetzner.com 5.006ms
 4: 213-133-113-126.clients.your-server.de 5.946ms asymm 5
 5: no reply
 6: no reply
 7: no reply
 8: no reply
 9: no reply
...
15: no reply
^C
Maybe it's an issue with their peering?
Metadata Update from @stkoelle: - Issue status updated to: Open (was: Closed)
The other day this was on Hacker News: https://news.ycombinator.com/item?id=41585249 "Cloudflare misidentifies Hetzner IPs as being located in Iran". That is Cloudflare and you are using CloudFront, but maybe it's a lead?
Circling back here. Still happening?
Perhaps there's a way to tell it to pull directly from our master mirrors, so we can confirm the problem is somehow in CloudFront?
Please reopen if you still are seeing this...
Metadata Update from @kevin: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)
Just an update: I am seeing the issue again. This time curl does not get the data either:
Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO zincati::update_agent::actor] deployment 40.20241019.3.0 (6df70065620571076f242857b9080d747891e2279dff3ed1756270f6889731ce) will be excluded from being a future update target
Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO zincati::update_agent::actor] initialization complete, auto-updates logic enabled
Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO zincati::strategy] update strategy: immediate
Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO zincati::update_agent::actor] reached steady state, periodically polling for updates
Feb 27 10:32:50 cdnode4 systemd[1]: Started zincati.service - Zincati Update Agent.
Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO zincati::cincinnati] current release detected as not a dead-end
Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO zincati::update_agent::actor] target release '41.20250130.3.0' selected, proceeding to stage it
Feb 27 10:33:15 cdnode4 zincati[377245]: [ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Feb 27 10:33:15 cdnode4 zincati[377245]: error: While pulling d881f2c35bdd4e90960a026b7a161bed82a660da71f484c97f0655776ed9307a: While fetching https://d2uk5hbyrobdzx.cloudfront.net/objects/c7/76fa073f490b9b0d48dc7668289c669d9e8977f89929cc6604173e996c2379.filez: Server returned HTTP 502
... restart zincati ... => a new HTTP error code:
Feb 27 10:38:58 cdnode4 zincati[378386]: retrying in 4s
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO zincati::update_agent::actor] found 1 other finalized deployment
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO zincati::update_agent::actor] deployment 40.20241019.3.0 (6df70065620571076f242857b9080d747891e2279dff3ed1756270f6889731ce) will be excluded from being a future update target
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO zincati::update_agent::actor] initialization complete, auto-updates logic enabled
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO zincati::strategy] update strategy: immediate
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO zincati::update_agent::actor] reached steady state, periodically polling for updates
Feb 27 10:39:02 cdnode4 systemd[1]: Started zincati.service - Zincati Update Agent.
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO zincati::cincinnati] current release detected as not a dead-end
Feb 27 10:39:03 cdnode4 zincati[378386]: [INFO zincati::update_agent::actor] target release '41.20250130.3.0' selected, proceeding to stage it
Feb 27 10:39:15 cdnode4 zincati[378386]: [ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Feb 27 10:39:15 cdnode4 zincati[378386]: error: While pulling d881f2c35bdd4e90960a026b7a161bed82a660da71f484c97f0655776ed9307a: While fetching https://d2uk5hbyrobdzx.cloudfront.net/objects/a5/c6939921ab3ce95a467ec7db8aa36781748c634dda443bb5b840ad77a584f3.filez: Server returned HTTP 504
Feb 27 10:39:15 cdnode4 zincati[378386]
curl "https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index"
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL was not found on this server.</p>
</body></html>
curl "https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index"
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>ERROR: The request could not be satisfied</TITLE>
</HEAD><BODY>
<H1>502 ERROR</H1>
<H2>The request could not be satisfied.</H2>
<HR noshade size="1px">
CloudFront attempted to establish a connection with the origin, but either the attempt failed or the origin closed the connection. We can't connect to the server for this app or website at this time. There might be too much traffic or a configuration error. Try again later, or contact the app or website owner.
<BR clear="all">
If you provide content to customers through CloudFront, you can find steps to troubleshoot and help prevent this error by reviewing the CloudFront documentation.
<BR clear="all">
<HR noshade size="1px">
<PRE>
Generated by cloudfront (CloudFront)
Request ID: Z0NRLb7xFL2PnD4oeew3drp6C9d1hUbRZdn8n44sJTqjBnG5Iwr8_w==
</PRE>
<ADDRESS>
</ADDRESS>
</BODY><
Some more strange-looking logs:
[ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Feb 27 12:07:13 cdnode4 zincati[999597]: error: While pulling d881f2c35bdd4e90960a026b7a161bed82a660da71f484c97f0655776ed9307a: While fetching https://d2uk5hbyrobdzx.cloudfront.net/objects/bc/6cad566536d8e6d97adb35974415e9aba465002e6075a7a8d47a23fdf0d801.filez: Server returned HTTP 502
Feb 27 12:07:13 cdnode4 zincati[999597]:
^C
root@cdnode4 /var/home/koelle # journalctl -fu zincati.service^C
root@cdnode4 /var/home/koelle # curl -Iv "https://d2uk5hbyrobdzx.cloudfront.net/objects/df/83130da5f9c65165249b100b4e5197e0817ccca26bd4b3659a311ae0743062.filez"
* Host d2uk5hbyrobdzx.cloudfront.net:443 was resolved.
* IPv6: (none)
* IPv4: 18.244.20.87, 18.244.20.212, 18.244.20.208, 18.244.20.19
* Trying 18.244.20.87:443...
* Connected to d2uk5hbyrobdzx.cloudfront.net (18.244.20.87) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
* CApath: none
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
* subject: CN=*.cloudfront.net
* start date: Jul 30 00:00:00 2024 GMT
* expire date: Jul 3 23:59:59 2025 GMT
* subjectAltName: host "d2uk5hbyrobdzx.cloudfront.net" matched cert's "*.cloudfront.net"
* issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
* SSL certificate verify ok.
* Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* Certificate level 2: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://d2uk5hbyrobdzx.cloudfront.net/objects/df/83130da5f9c65165249b100b4e5197e0817ccca26bd4b3659a311ae0743062.filez
* [HTTP/2] [1] [:method: HEAD]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: d2uk5hbyrobdzx.cloudfront.net]
* [HTTP/2] [1] [:path: /objects/df/83130da5f9c65165249b100b4e5197e0817ccca26bd4b3659a311ae0743062.filez]
* [HTTP/2] [1] [user-agent: curl/8.9.1]
* [HTTP/2] [1] [accept: */*]
> HEAD /objects/df/83130da5f9c65165249b100b4e5197e0817ccca26bd4b3659a311ae0743062.filez HTTP/2
> Host: d2uk5hbyrobdzx.cloudfront.net
> User-Agent: curl/8.9.1
> Accept: */*
>
* Request completely sent off
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 200
HTTP/2 200
< content-length: 2948
content-length: 2948
< date: Thu, 27 Feb 2025 12:07:32 GMT
date: Thu, 27 Feb 2025 12:07:32 GMT
< server: Apache
server: Apache
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-frame-options: SAMEORIGIN
x-frame-options: SAMEORIGIN
< x-xss-protection: 1; mode=block
x-xss-protection: 1; mode=block
< x-content-type-options: nosniff
x-content-type-options: nosniff
< referrer-policy: same-origin
referrer-policy: same-origin
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< last-modified: Sun, 19 Jan 2025 02:25:07 GMT
last-modified: Sun, 19 Jan 2025 02:25:07 GMT
< apptime: D=3073
apptime: D=3073
< x-fedora-appserver: kojipkgs01.iad2.fedoraproject.org
x-fedora-appserver: kojipkgs01.iad2.fedoraproject.org
< x-varnish: 13609567
x-varnish: 13609567
< via: 1.1 kojipkgs01.iad2.fedoraproject.org (Varnish/7.5), 1.1 6571e9f709b2287f8a30275c17d07140.cloudfront.net (CloudFront)
via: 1.1 kojipkgs01.iad2.fedoraproject.org (Varnish/7.5), 1.1 6571e9f709b2287f8a30275c17d07140.cloudfront.net (CloudFront)
< accept-ranges: bytes
accept-ranges: bytes
< x-fedora-proxyserver: proxy10.iad2.fedoraproject.org
x-fedora-proxyserver: proxy10.iad2.fedoraproject.org
< x-fedora-requestid: Z8BVhJYPmPT55Ej6IRC3AAAAAkg
x-fedora-requestid: Z8BVhJYPmPT55Ej6IRC3AAAAAkg
< x-cache: Miss from cloudfront
x-cache: Miss from cloudfront
< x-amz-cf-pop: FRA56-P11
x-amz-cf-pop: FRA56-P11
< x-amz-cf-id: X9102u23mS24MoqmUqrcJknKtg3hbd-xBhgfCCSowk2pzwpd_88dsA==
x-amz-cf-id: X9102u23mS24MoqmUqrcJknKtg3hbd-xBhgfCCSowk2pzwpd_88dsA==
< age: 0
age: 0
<
After curl fetched the file (somewhat) successfully, zincati could fetch it too. What is different? The User-Agent, the HTTP version?
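One way to narrow that down might be to request the same URL while varying one thing at a time. A rough sketch: the function names are made up, and the User-Agent string is a placeholder rather than what libostree actually sends, so it would need to be replaced with the real value.

```shell
#!/bin/bash
# Print the HTTP status for a URL, passing any extra curl options through.
status_with() {
    local url="$1"
    shift
    curl -s -o /dev/null -w '%{http_code}' "$@" "$url"
}

# Request the same URL once per variation and print "<variant> <status>".
compare_variants() {
    local url="$1"
    printf '%-8s %s\n' http2 "$(status_with "$url" --http2)"
    printf '%-8s %s\n' http1.1 "$(status_with "$url" --http1.1)"
    # Placeholder User-Agent (an assumption, not the real libostree string).
    printf '%-8s %s\n' ua "$(status_with "$url" -A 'libostree-test/0.0')"
}

# Example:
# compare_variants "https://d2uk5hbyrobdzx.cloudfront.net/objects/df/83130da5f9c65165249b100b4e5197e0817ccca26bd4b3659a311ae0743062.filez"
```

Since the errors are intermittent, a single run proves little; each variant would need to be repeated many times before reading anything into a difference.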
Here is a 504 with curl:
root@cdnode4 /var/home/koelle # curl -Iv "https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index"
* Host d2uk5hbyrobdzx.cloudfront.net:443 was resolved.
* IPv6: (none)
* IPv4: 18.244.20.87, 18.244.20.208, 18.244.20.212, 18.244.20.19
* Trying 18.244.20.87:443...
* Connected to d2uk5hbyrobdzx.cloudfront.net (18.244.20.87) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
* CApath: none
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
* subject: CN=*.cloudfront.net
* start date: Jul 30 00:00:00 2024 GMT
* expire date: Jul 3 23:59:59 2025 GMT
* subjectAltName: host "d2uk5hbyrobdzx.cloudfront.net" matched cert's "*.cloudfront.net"
* issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
* SSL certificate verify ok.
* Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* Certificate level 2: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index
* [HTTP/2] [1] [:method: HEAD]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: d2uk5hbyrobdzx.cloudfront.net]
* [HTTP/2] [1] [:path: /delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index]
* [HTTP/2] [1] [user-agent: curl/8.9.1]
* [HTTP/2] [1] [accept: */*]
> HEAD /delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index HTTP/2
> Host: d2uk5hbyrobdzx.cloudfront.net
> User-Agent: curl/8.9.1
> Accept: */*
>
* Request completely sent off
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 504
HTTP/2 504
< content-type: text/html
content-type: text/html
< content-length: 1033
content-length: 1033
< server: CloudFront
server: CloudFront
< date: Thu, 27 Feb 2025 12:14:49 GMT
date: Thu, 27 Feb 2025 12:14:49 GMT
< x-cache: Error from cloudfront
x-cache: Error from cloudfront
< via: 1.1 5034084c037ff19008ba7c2c0b849a4c.cloudfront.net (CloudFront)
via: 1.1 5034084c037ff19008ba7c2c0b849a4c.cloudfront.net (CloudFront)
< x-amz-cf-pop: FRA56-P11
x-amz-cf-pop: FRA56-P11
< x-amz-cf-id: YKxC0sRdNHgNrJXY5GMwUzitgTcsFxtKvKIPVgPm-_jOVZgfycDhZQ==
x-amz-cf-id: YKxC0sRdNHgNrJXY5GMwUzitgTcsFxtKvKIPVgPm-_jOVZgfycDhZQ==
<
* Connection #0 to host d2uk5hbyrobdzx.cloudfront.net left intact
Varnish says:
< via: 1.1 kojipkgs02.iad2.fedoraproject.org (Varnish/7.5), 1.1 74ca1b9f17cb4adcfc54f8b84ccc7d82.cloudfront.net (CloudFront)
I don't see a DNS record for: kojipkgs02.iad2.fedoraproject.org
Yes, that's an internal cache server...
Some folks opened a new issue a bit ago that might be related here: https://pagure.io/fedora-infrastructure/issue/12427
But I think this is all due to https://pagure.io/fedora-infrastructure/issue/12419 which was fixed last night.
Are you still seeing any issues? The cache instability definitely would have affected things during that window.
I still see 50x errors:
Feb 28 07:40:25 clickhouse1 zincati[13411]: [ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Feb 28 07:40:25 clickhouse1 zincati[13411]: error: While pulling 6df70065620571076f242857b9080d747891e2279dff3ed1756270f6889731ce: While fetching https://d2uk5hbyrobdzx.cloudfront.net/objects/b7/1383f510fa9b0044cfea9e7c684d6f2bcb8f53f446abd0dd8f339203c711bc.filez: Server returned HTTP 502
One more server's zincati log, plus curl against the same URL:
eb 28 07:45:29 clickhouse1 zincati[14785]: [INFO zincati::update_agent::actor] target release '40.20241019.3.0' selected, proceeding to stage it
Feb 28 07:45:29 clickhouse1 zincati[14785]: [ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Feb 28 07:45:29 clickhouse1 zincati[14785]: error: While pulling 6df70065620571076f242857b9080d747891e2279dff3ed1756270f6889731ce: While fetching https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/bf/cAZWIFcQdvJChXuQgNdHiR4ied_z7RdWJw9oiXMc4.index: Server returned HTTP 502
Feb 28 07:45:29 clickhouse1 zincati[14785]:
root@clickhouse1 /home/wildfly/epoq-podman # curl -Iv "https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/bf/cAZWIFcQdvJChXuQgNdHiR4ied_z7RdWJw9oiXMc4.index"
* Host d2uk5hbyrobdzx.cloudfront.net:443 was resolved.
* IPv6: (none)
* IPv4: 18.244.20.212, 18.244.20.19, 18.244.20.87, 18.244.20.208
* Trying 18.244.20.212:443...
* Connected to d2uk5hbyrobdzx.cloudfront.net (18.244.20.212) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
* CApath: none
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
* subject: CN=*.cloudfront.net
* start date: Jul 30 00:00:00 2024 GMT
* expire date: Jul 3 23:59:59 2025 GMT
* subjectAltName: host "d2uk5hbyrobdzx.cloudfront.net" matched cert's "*.cloudfront.net"
* issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
* SSL certificate verify ok.
* Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* Certificate level 2: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/bf/cAZWIFcQdvJChXuQgNdHiR4ied_z7RdWJw9oiXMc4.index
* [HTTP/2] [1] [:method: HEAD]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: d2uk5hbyrobdzx.cloudfront.net]
* [HTTP/2] [1] [:path: /delta-indexes/bf/cAZWIFcQdvJChXuQgNdHiR4ied_z7RdWJw9oiXMc4.index]
* [HTTP/2] [1] [user-agent: curl/8.6.0]
* [HTTP/2] [1] [accept: */*]
> HEAD /delta-indexes/bf/cAZWIFcQdvJChXuQgNdHiR4ied_z7RdWJw9oiXMc4.index HTTP/2
> Host: d2uk5hbyrobdzx.cloudfront.net
> User-Agent: curl/8.6.0
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 404
HTTP/2 404
< content-type: text/html; charset=iso-8859-1
content-type: text/html; charset=iso-8859-1
< content-length: 196
content-length: 196
< date: Fri, 28 Feb 2025 07:44:28 GMT
date: Fri, 28 Feb 2025 07:44:28 GMT
< server: Apache
server: Apache
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-frame-options: SAMEORIGIN
x-frame-options: SAMEORIGIN
< x-xss-protection: 1; mode=block
x-xss-protection: 1; mode=block
< x-content-type-options: nosniff
x-content-type-options: nosniff
< referrer-policy: same-origin
referrer-policy: same-origin
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-varnish: 31167307 30239695
x-varnish: 31167307 30239695
< via: 1.1 kojipkgs01.iad2.fedoraproject.org (Varnish/7.5), 1.1 11c65b00bf7f76c861a15dcad5558b9c.cloudfront.net (CloudFront)
via: 1.1 kojipkgs01.iad2.fedoraproject.org (Varnish/7.5), 1.1 11c65b00bf7f76c861a15dcad5558b9c.cloudfront.net (CloudFront)
< apptime: D=992
apptime: D=992
< x-fedora-proxyserver: proxy01.iad2.fedoraproject.org
x-fedora-proxyserver: proxy01.iad2.fedoraproject.org
< x-fedora-requestid: Z8Fpt00f888AYGkExD76aAAAAMk
x-fedora-requestid: Z8Fpt00f888AYGkExD76aAAAAMk
< x-cache: Error from cloudfront
x-cache: Error from cloudfront
< x-amz-cf-pop: FRA56-P11
x-amz-cf-pop: FRA56-P11
< x-amz-cf-id: qH7MWPjHzRl03w-JhZIrHFAk0Ba4bBGCGcDNxaspAe0JQxgsPYYrAg==
x-amz-cf-id: qH7MWPjHzRl03w-JhZIrHFAk0Ba4bBGCGcDNxaspAe0JQxgsPYYrAg==
< age: 93
age: 93
<
* Connection #0 to host d2uk5hbyrobdzx.cloudfront.net left intact
I forced a CloudFront invalidation. Let me know whether things are better now or still the same.
I guess https://pagure.io/fedora-infrastructure/issue/12427 fixes this too?
@stkoelle Do you still see the issue?
Yes, I just checked; there are still a few. The screenshot was taken at 15:20: <img alt="shot1.png" src="/fedora-infrastructure/issue/raw/files/69bb2b8872af0e9e5890cffb9c2a4bc5cdc30c44b5bf04bf9aecec86a9fe9a86-shot1.png" />