#11973 rpm-ostree error (503) from hetzner to https://d2uk5hbyrobdzx.cloudfront.net
Closed: Fixed 5 months ago by kevin. Opened a year ago by stkoelle.

Describe what you would like us to do:

Zincati on Fedora CoreOS has become very flaky.

It reports 503 errors like:

Jun 05 10:39:27 rs185 zincati[4105]:     error: While pulling 72cf2f80ba1496d478e110d03e1199d9d21382840e96ffeddf4303eb040fbb55: Server returned HTTP 503  

Digging deeper, the error comes from rpm-ostree:

Jun 05 10:39:21 rs185 rpm-ostree[3686]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/41/5200e439bfc968bb9a26b74e29b0bc7c87c119db9f9749a25172c82afee61a.filez>: Server returned HTTP 503
Jun 05 10:39:23 rs185 rpm-ostree[3686]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/99/601d82918ce51d6267d837cdaba5df5c5448cecf08cd0de61d601a7d27120a.filez>: Server returned HTTP 503
Jun 05 10:39:25 rs185 rpm-ostree[3686]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/db/e3e23f2e97dfac8f3a518e66249d58dea1dd3e102cdfeeb4d34367ec9e1020.filez>: Server returned HTTP 503
Jun 05 10:39:25 rs185 rpm-ostree[3686]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/56/dff8ea8710640a29f53b8de39f8bc11b0ef6559685601b2f8f5679d676f8d0.filez>: Server returned HTTP 503
Jun 05 10:39:27 rs185 rpm-ostree[3686]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez>: Server returned HTTP 503
Jun 05 10:39:27 rs185 rpm-ostree[3686]: Txn Deploy on /org/projectatomic/rpmostree1/fedora_coreos failed: While pulling 72cf2f80ba1496d478e110d03e1199d9d21382840e96ffeddf4303eb040fbb55: Server returned HTTP 503

Fetching the same URLs with curl works:

curl -v https://d2uk5hbyrobdzx.cloudfront.net/objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez
* Host d2uk5hbyrobdzx.cloudfront.net:443 was resolved.
* IPv6: (none)
* IPv4: 18.244.20.208, 18.244.20.19, 18.244.20.87, 18.244.20.212
*   Trying 18.244.20.208:443...
* Connected to d2uk5hbyrobdzx.cloudfront.net (18.244.20.208) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/pki/tls/certs/ca-bundle.crt
*  CApath: none
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=*.cloudfront.net
*  start date: Oct 10 00:00:00 2023 GMT
*  expire date: Sep 19 23:59:59 2024 GMT
*  subjectAltName: host "d2uk5hbyrobdzx.cloudfront.net" matched cert's "*.cloudfront.net"
*  issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
*  SSL certificate verify ok.
*   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 2: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://d2uk5hbyrobdzx.cloudfront.net/objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez
* [HTTP/2] [1] [:method: GET]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: d2uk5hbyrobdzx.cloudfront.net]
* [HTTP/2] [1] [:path: /objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez]
* [HTTP/2] [1] [user-agent: curl/8.6.0]
* [HTTP/2] [1] [accept: */*]
> GET /objects/13/06f20ecd9aa85efc08c51ffdc7ee51fdfd2242cdc5aa24b11dd1fd171d3975.filez HTTP/2
> Host: d2uk5hbyrobdzx.cloudfront.net
> User-Agent: curl/8.6.0
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 200
< content-length: 15160
< server: Apache
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-frame-options: SAMEORIGIN
< x-xss-protection: 1; mode=block
< x-content-type-options: nosniff
< referrer-policy: same-origin
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< last-modified: Mon, 29 Apr 2024 04:50:48 GMT
< apptime: D=3711
< x-fedora-appserver: kojipkgs02.iad2.fedoraproject.org
< x-varnish: 299826652
< via: 1.1 kojipkgs02.iad2.fedoraproject.org (Varnish/7.3), 1.1 93f1c701362eb59a676baaac7ea81bd8.cloudfront.net (CloudFront)
< accept-ranges: bytes
< x-fedora-proxyserver: proxy01.iad2.fedoraproject.org
< x-fedora-requestid: ZlqsG5WluXd4lnAadIKB5gAAA44
< date: Thu, 06 Jun 2024 05:08:17 GMT
< x-cache: Hit from cloudfront
< x-amz-cf-pop: FRA56-P11
< x-amz-cf-id: bzydx-VPNTc2ypEluova1rfZ8VJVAbXq28_nBilZFIe3haM2hmT6lA==
< age: 11394
<
Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.
* Failure writing output to destination
* Connection #0 to host d2uk5hbyrobdzx.cloudfront.net left intact

I find it interesting that rpm-ostree reports the 503 for a different URL on each run.
Is this by design? Otherwise this might be difficult for the CDN.
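
To put a number on how often a given object fails, the same URL can be requested repeatedly and the status codes tallied. This is only a minimal sketch: the function names `sample_url` and `tally_codes` are made up for illustration, and it assumes curl and the usual coreutils are available on the host.

```shell
#!/bin/bash

# Tally HTTP status codes read from stdin, one code per line.
# Prints "<code> <count>" per line, sorted by code.
tally_codes() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Request a URL COUNT times (default 20) and print one status code per line.
# curl prints "000" when it got no HTTP response at all (e.g. a timeout).
sample_url() {
    local url=$1 count=${2:-20} i
    for ((i = 0; i < count; i++)); do
        curl -s -o /dev/null --max-time 30 -w '%{http_code}\n' "$url"
    done
}

# Example (performs real requests):
#   sample_url "$url" 50 | tally_codes
```

Running it against one of the failing object URLs from the journal would show whether the errors are tied to that object or spread evenly across requests.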

When do you need this to be done by? (YYYY/MM/DD)


~ 2 months?


Workaround: restart zincati in a loop; it works every other time:

#!/bin/bash

# Restart zincati.service and report whether the restart succeeded.
restart_service() {
    echo "HTTP 503 error detected. Restarting zincati.service..."
    if systemctl restart zincati.service; then
        echo "zincati.service restarted successfully."
    else
        echo "Failed to restart zincati.service." >&2
    fi
}

# Follow the zincati.service journal and restart on every new 503 line.
# -n 0 skips old entries so a stale 503 does not trigger a restart at startup.
journalctl -n 0 -fu zincati.service | while IFS= read -r line; do
    if [[ $line == *"Server returned HTTP 503"* ]]; then
        restart_service
    fi
done

Metadata Update from @zlopez:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: Needs investigation

a year ago

What version of rpm-ostree do you have there?

fcos: 40.20240504.3.0 and 40.20240416.3.1
this is rpm-ostree 2024.5

ok, in ostree 2024.6 there's a fix to retry on 5xx errors (this was intended, but it wasn't properly doing so). I don't know if there's any way for you to test 2024.6?
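
For readers unfamiliar with the idea, retrying on 5xx responses can be sketched in a few lines of bash. This is not how libostree implements it; the function names and the exponential backoff policy below are assumptions chosen for illustration.

```shell
#!/bin/bash

# Succeed (return 0) if the given HTTP status code is a 5xx server error.
is_server_error() {
    [[ $1 =~ ^5[0-9][0-9]$ ]]
}

# Run a command that prints an HTTP status code, retrying while it is 5xx.
# Usage: retry_on_5xx MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Prints the final status code; returns 1 only if every attempt gave a 5xx.
retry_on_5xx() {
    local max=$1 delay=$2 attempt code
    shift 2
    for ((attempt = 1; attempt <= max; attempt++)); do
        code=$("$@")
        if ! is_server_error "$code"; then
            echo "$code"
            return 0
        fi
        # Exponential backoff between attempts: delay, 2*delay, 4*delay, ...
        if ((attempt < max)); then
            sleep $((delay * (1 << (attempt - 1))))
        fi
    done
    echo "$code"
    return 1
}

# Hypothetical fetcher: print only the HTTP status code for a URL.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Example (performs real requests):
#   retry_on_5xx 5 1 http_status "$url"
```

With a low per-request error rate, a handful of retries per object should make a whole pull succeed far more often than a client that aborts on the first 5xx.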

Also, I looked at cloudfront stats. There's been a pretty constant 0.02% error rate. I am not sure why it's hitting you so often, but it's a really small percentage and it's seemingly always been there. The change we made to haproxy a while back didn't seem to change this any.

@stkoelle so let's see if the problem persists once you get onto a version of FCOS with rpm-ostree 2024.6?

I think there is another underlying problem/bug in how rpm-ostree downloads. I will keep monitoring to pinpoint it better. I have a success rate of around 10%.

Metadata Update from @zlopez:
- Issue priority set to: Waiting on Reporter (was: Waiting on Assignee)

a year ago

I am still getting many 503s right now:
Jun 20 09:21:33 rs184 zincati[6632]: [ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Jun 20 09:21:33 rs184 zincati[6632]: error: While pulling a65ed051ae3c7ae658f19bee19ff36be19723070282305382890a793904f6f5e: Server returned HTTP 503

The CloudFront stats somehow don't seem to cover that.

Yeah, it's very odd. Are you still seeing it now?

Can you do a traceroute / ping to the CF endpoint and see if it's something between you and it?

For now the problem is gone.

Metadata Update from @stkoelle:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

10 months ago

Problem is still here:

rned HTTP 503
Sep 12 12:11:17 bitbucketrunner1 rpm-ostree[1119979]: libostree HTTP error from remote fedora for <https://d2uk5hbyrobdzx.cloudfront.net/objects/12/ad2630f0905778084a499e5cc7a0d11db16484779374599cb5d9371578fe27.dirtree>: Server returned HTTP 503
Sep 12 12:11:17 bitbucketrunner1 rpm-ostree[1119979]: Txn Deploy on /org/projectatomic/rpmostree1/fedora_coreos failed: While pulling 759c112d3a3d1f762ba368106411fcc4edf8b1c39323aca99269741c88e6a597: Server returned HTTP 503

I am confused by the tracepath, ping looks ok:

tracepath d2uk5hbyrobdzx.cloudfront.net
1?: [LOCALHOST]                      pmtu 1500
1:  ???                                                   0.389ms
1:  ???                                                   0.400ms
2:  core22.fsn1.hetzner.com                               1.108ms
3:  hos-tr4.ex3k9.dc4.fsn1.hetzner.com                    4.888ms
4:  amazon.hetzner.com                                    4.933ms asymm  5
5:  no reply
6:  no reply
7:  no reply
8:  no reply
9:  no reply
10:  no reply
...
25:  no reply
26:  no reply
27:  no reply
28:  no reply
29:  no reply
30:  no reply
Too many hops: pmtu 1500
Resume: pmtu 1500


root@bitbucketrunner1 /var/home/koelle # ping d2uk5hbyrobdzx.cloudfront.net
PING d2uk5hbyrobdzx.cloudfront.net (18.244.20.19) 56(84) bytes of data.
64 bytes from server-18-244-20-19.fra56.r.cloudfront.net (18.244.20.19): icmp_seq=1 ttl=250 time=5.22 ms
64 bytes from server-18-244-20-19.fra56.r.cloudfront.net (18.244.20.19): icmp_seq=2 ttl=250 time=5.17 ms





--- d2uk5hbyrobdzx.cloudfront.net ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 5.173/5.197/5.222/0.024 ms
root@bitbucketrunner1 /var/home/koelle # tracepath server-18-244-20-19.fra56.r.cloudfront.net
1?: [LOCALHOST]                      pmtu 1500
1:  ???                                                   0.485ms
1:  ???                                                   0.368ms
2:  core24.fsn1.hetzner.com                              27.966ms
3:  core1.fra.hetzner.com                                 5.006ms
4:  213-133-113-126.clients.your-server.de                5.946ms asymm  5
5:  no reply
6:  no reply
7:  no reply
8:  no reply
9:  no reply
...
15:  no reply
^C 

Maybe it's an issue with their peering?

Metadata Update from @stkoelle:
- Issue status updated to: Open (was: Closed)

9 months ago

The other day this was on Hacker News:
https://news.ycombinator.com/item?id=41585249
"Cloudflare misidentifies Hetzner IPs as being located in Iran". That is Cloudflare and you are using CloudFront, but maybe it's a lead?

Circling back here. Still happening?

Perhaps there's a way to tell it to pull directly from our master mirrors so we can confirm the problem is somehow in CloudFront?

Please reopen if you still are seeing this...

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

5 months ago

I just want to update that I still see the issue.
This time, curl does not get the data either:

Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO  zincati::update_agent::actor] deployment 40.20241019.3.0 (6df70065620571076f242857b9080d747891e2279dff3ed1756270f6889731ce) will be excluded from being a future update target
Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO  zincati::update_agent::actor] initialization complete, auto-updates logic enabled
Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO  zincati::strategy] update strategy: immediate
Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO  zincati::update_agent::actor] reached steady state, periodically polling for updates
Feb 27 10:32:50 cdnode4 systemd[1]: Started zincati.service - Zincati Update Agent.
Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO  zincati::cincinnati] current release detected as not a dead-end
Feb 27 10:32:50 cdnode4 zincati[377245]: [INFO  zincati::update_agent::actor] target release '41.20250130.3.0' selected, proceeding to stage it
Feb 27 10:33:15 cdnode4 zincati[377245]: [ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Feb 27 10:33:15 cdnode4 zincati[377245]:     error: While pulling d881f2c35bdd4e90960a026b7a161bed82a660da71f484c97f0655776ed9307a: While fetching https://d2uk5hbyrobdzx.cloudfront.net/objects/c7/76fa073f490b9b0d48dc7668289c669d9e8977f89929cc6604173e996c2379.filez: Server returned HTTP 502

...restart zincati... => new HTTP error code

Feb 27 10:38:58 cdnode4 zincati[378386]:     retrying in 4s
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO  zincati::update_agent::actor] found 1 other finalized deployment
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO  zincati::update_agent::actor] deployment 40.20241019.3.0 (6df70065620571076f242857b9080d747891e2279dff3ed1756270f6889731ce) will be excluded from being a future update target
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO  zincati::update_agent::actor] initialization complete, auto-updates logic enabled
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO  zincati::strategy] update strategy: immediate
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO  zincati::update_agent::actor] reached steady state, periodically polling for updates
Feb 27 10:39:02 cdnode4 systemd[1]: Started zincati.service - Zincati Update Agent.
Feb 27 10:39:02 cdnode4 zincati[378386]: [INFO  zincati::cincinnati] current release detected as not a dead-end
Feb 27 10:39:03 cdnode4 zincati[378386]: [INFO  zincati::update_agent::actor] target release '41.20250130.3.0' selected, proceeding to stage it
Feb 27 10:39:15 cdnode4 zincati[378386]: [ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Feb 27 10:39:15 cdnode4 zincati[378386]:     error: While pulling d881f2c35bdd4e90960a026b7a161bed82a660da71f484c97f0655776ed9307a: While fetching https://d2uk5hbyrobdzx.cloudfront.net/objects/a5/c6939921ab3ce95a467ec7db8aa36781748c634dda443bb5b840ad77a584f3.filez: Server returned HTTP 504
Feb 27 10:39:15 cdnode4 zincati[378386]
curl "https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index"
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL was not found on this server.</p>
</body></html>
curl "https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index"
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>ERROR: The request could not be satisfied</TITLE>
</HEAD><BODY>
<H1>502 ERROR</H1>
<H2>The request could not be satisfied.</H2>
<HR noshade size="1px">
CloudFront attempted to establish a connection with the origin, but either the attempt failed or the origin closed the connection.
We can't connect to the server for this app or website at this time. There might be too much traffic or a configuration error. Try again later, or contact the app or website owner.
<BR clear="all">
If you provide content to customers through CloudFront, you can find steps to troubleshoot and help prevent this error by reviewing the CloudFront documentation.
<BR clear="all">
<HR noshade size="1px">
<PRE>
Generated by cloudfront (CloudFront)
Request ID: Z0NRLb7xFL2PnD4oeew3drp6C9d1hUbRZdn8n44sJTqjBnG5Iwr8_w==
</PRE>
<ADDRESS>
</ADDRESS>
</BODY><

some more strange looking logs:

[ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Feb 27 12:07:13 cdnode4 zincati[999597]:     error: While pulling d881f2c35bdd4e90960a026b7a161bed82a660da71f484c97f0655776ed9307a: While fetching https://d2uk5hbyrobdzx.cloudfront.net/objects/bc/6cad566536d8e6d97adb35974415e9aba465002e6075a7a8d47a23fdf0d801.filez: Server returned HTTP 502
Feb 27 12:07:13 cdnode4 zincati[999597]:
^C
root@cdnode4 /var/home/koelle # journalctl -fu zincati.service^C
root@cdnode4 /var/home/koelle # curl -Iv "https://d2uk5hbyrobdzx.cloudfront.net/objects/df/83130da5f9c65165249b100b4e5197e0817ccca26bd4b3659a311ae0743062.filez"
* Host d2uk5hbyrobdzx.cloudfront.net:443 was resolved.
* IPv6: (none)
* IPv4: 18.244.20.87, 18.244.20.212, 18.244.20.208, 18.244.20.19
*   Trying 18.244.20.87:443...
* Connected to d2uk5hbyrobdzx.cloudfront.net (18.244.20.87) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/pki/tls/certs/ca-bundle.crt
*  CApath: none
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=*.cloudfront.net
*  start date: Jul 30 00:00:00 2024 GMT
*  expire date: Jul  3 23:59:59 2025 GMT
*  subjectAltName: host "d2uk5hbyrobdzx.cloudfront.net" matched cert's "*.cloudfront.net"
*  issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
*  SSL certificate verify ok.
*   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 2: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://d2uk5hbyrobdzx.cloudfront.net/objects/df/83130da5f9c65165249b100b4e5197e0817ccca26bd4b3659a311ae0743062.filez
* [HTTP/2] [1] [:method: HEAD]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: d2uk5hbyrobdzx.cloudfront.net]
* [HTTP/2] [1] [:path: /objects/df/83130da5f9c65165249b100b4e5197e0817ccca26bd4b3659a311ae0743062.filez]
* [HTTP/2] [1] [user-agent: curl/8.9.1]
* [HTTP/2] [1] [accept: */*]
> HEAD /objects/df/83130da5f9c65165249b100b4e5197e0817ccca26bd4b3659a311ae0743062.filez HTTP/2
> Host: d2uk5hbyrobdzx.cloudfront.net
> User-Agent: curl/8.9.1
> Accept: */*
>
* Request completely sent off
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 200
HTTP/2 200
< content-length: 2948
content-length: 2948
< date: Thu, 27 Feb 2025 12:07:32 GMT
date: Thu, 27 Feb 2025 12:07:32 GMT
< server: Apache
server: Apache
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-frame-options: SAMEORIGIN
x-frame-options: SAMEORIGIN
< x-xss-protection: 1; mode=block
x-xss-protection: 1; mode=block
< x-content-type-options: nosniff
x-content-type-options: nosniff
< referrer-policy: same-origin
referrer-policy: same-origin
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< last-modified: Sun, 19 Jan 2025 02:25:07 GMT
last-modified: Sun, 19 Jan 2025 02:25:07 GMT
< apptime: D=3073
apptime: D=3073
< x-fedora-appserver: kojipkgs01.iad2.fedoraproject.org
x-fedora-appserver: kojipkgs01.iad2.fedoraproject.org
< x-varnish: 13609567
x-varnish: 13609567
< via: 1.1 kojipkgs01.iad2.fedoraproject.org (Varnish/7.5), 1.1 6571e9f709b2287f8a30275c17d07140.cloudfront.net (CloudFront)
via: 1.1 kojipkgs01.iad2.fedoraproject.org (Varnish/7.5), 1.1 6571e9f709b2287f8a30275c17d07140.cloudfront.net (CloudFront)
< accept-ranges: bytes
accept-ranges: bytes
< x-fedora-proxyserver: proxy10.iad2.fedoraproject.org
x-fedora-proxyserver: proxy10.iad2.fedoraproject.org
< x-fedora-requestid: Z8BVhJYPmPT55Ej6IRC3AAAAAkg
x-fedora-requestid: Z8BVhJYPmPT55Ej6IRC3AAAAAkg
< x-cache: Miss from cloudfront
x-cache: Miss from cloudfront
< x-amz-cf-pop: FRA56-P11
x-amz-cf-pop: FRA56-P11
< x-amz-cf-id: X9102u23mS24MoqmUqrcJknKtg3hbd-xBhgfCCSowk2pzwpd_88dsA==
x-amz-cf-id: X9102u23mS24MoqmUqrcJknKtg3hbd-xBhgfCCSowk2pzwpd_88dsA==
< age: 0
age: 0
<

After curl fetched the file more or less successfully, zincati could fetch it too.
What is different? User agent, HTTP version?

Here is a 504 with curl:

root@cdnode4 /var/home/koelle # curl -Iv "https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index"
* Host d2uk5hbyrobdzx.cloudfront.net:443 was resolved.
* IPv6: (none)
* IPv4: 18.244.20.87, 18.244.20.208, 18.244.20.212, 18.244.20.19
*   Trying 18.244.20.87:443...
* Connected to d2uk5hbyrobdzx.cloudfront.net (18.244.20.87) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/pki/tls/certs/ca-bundle.crt
*  CApath: none
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=*.cloudfront.net
*  start date: Jul 30 00:00:00 2024 GMT
*  expire date: Jul  3 23:59:59 2025 GMT
*  subjectAltName: host "d2uk5hbyrobdzx.cloudfront.net" matched cert's "*.cloudfront.net"
*  issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
*  SSL certificate verify ok.
*   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 2: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index
* [HTTP/2] [1] [:method: HEAD]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: d2uk5hbyrobdzx.cloudfront.net]
* [HTTP/2] [1] [:path: /delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index]
* [HTTP/2] [1] [user-agent: curl/8.9.1]
* [HTTP/2] [1] [accept: */*]
> HEAD /delta-indexes/2I/Hyw1vdTpCWCgJrehYb7YKmYNpx9ITJfwZVd27ZMHo.index HTTP/2
> Host: d2uk5hbyrobdzx.cloudfront.net
> User-Agent: curl/8.9.1
> Accept: */*
>
* Request completely sent off
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 504
HTTP/2 504
< content-type: text/html
content-type: text/html
< content-length: 1033
content-length: 1033
< server: CloudFront
server: CloudFront
< date: Thu, 27 Feb 2025 12:14:49 GMT
date: Thu, 27 Feb 2025 12:14:49 GMT
< x-cache: Error from cloudfront
x-cache: Error from cloudfront
< via: 1.1 5034084c037ff19008ba7c2c0b849a4c.cloudfront.net (CloudFront)
via: 1.1 5034084c037ff19008ba7c2c0b849a4c.cloudfront.net (CloudFront)
< x-amz-cf-pop: FRA56-P11
x-amz-cf-pop: FRA56-P11
< x-amz-cf-id: YKxC0sRdNHgNrJXY5GMwUzitgTcsFxtKvKIPVgPm-_jOVZgfycDhZQ==
x-amz-cf-id: YKxC0sRdNHgNrJXY5GMwUzitgTcsFxtKvKIPVgPm-_jOVZgfycDhZQ==
<

* Connection #0 to host d2uk5hbyrobdzx.cloudfront.net left intact
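
When reporting failures like the one above, the status line plus the `x-cache`, `x-amz-cf-pop` and `x-amz-cf-id` headers are the fields that identify the request on the CloudFront side. A small filter can pull them out of a header dump; `cf_summary` is a hypothetical helper name, and the sketch assumes headers arrive on stdin as produced by `curl -sI`.

```shell
#!/bin/bash

# Read HTTP response headers on stdin and print the fields that help
# correlate a failure: status code, x-cache, x-amz-cf-pop, x-amz-cf-id.
cf_summary() {
    tr -d '\r' | awk '
        /^HTTP\// { print "status=" $2 }
        {
            key = tolower($1)
            if (key == "x-cache:" || key == "x-amz-cf-pop:" || key == "x-amz-cf-id:") {
                sub(/:$/, "", key)   # drop the trailing colon from the name
                $1 = ""              # remove the header name from the line
                sub(/^ /, "")        # trim the leading space left behind
                print key "=" $0
            }
        }'
}

# Example (performs a real request):
#   curl -sI "$url" | cf_summary
```

Logging this summary next to each failed fetch gives the infrastructure team request IDs they can look up.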

varnish says:

< via: 1.1 kojipkgs02.iad2.fedoraproject.org (Varnish/7.5), 1.1 74ca1b9f17cb4adcfc54f8b84ccc7d82.cloudfront.net (CloudFront)

I don't see a DNS record for: kojipkgs02.iad2.fedoraproject.org

Yes, that's an internal cache server...

Some folks opened a new issue a bit ago that might be related here:
https://pagure.io/fedora-infrastructure/issue/12427

But I think this is all due to
https://pagure.io/fedora-infrastructure/issue/12419
which was fixed last night.

Are you still seeing any issues? The cache instability definitely would have affected things during that window.

I still see 5xx errors:

Feb 28 07:40:25 clickhouse1 zincati[13411]: [ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Feb 28 07:40:25 clickhouse1 zincati[13411]:     error: While pulling 6df70065620571076f242857b9080d747891e2279dff3ed1756270f6889731ce: While fetching https://d2uk5hbyrobdzx.cloudfront.net/objects/b7/1383f510fa9b0044cfea9e7c684d6f2bcb8f53f446abd0dd8f339203c711bc.filez: Server returned HTTP 502

One more zincati log from another server, plus a curl of the URL:

Feb 28 07:45:29 clickhouse1 zincati[14785]: [INFO  zincati::update_agent::actor] target release '40.20241019.3.0' selected, proceeding to stage it
Feb 28 07:45:29 clickhouse1 zincati[14785]: [ERROR zincati::update_agent::actor] failed to stage deployment: rpm-ostree deploy failed:
Feb 28 07:45:29 clickhouse1 zincati[14785]:     error: While pulling 6df70065620571076f242857b9080d747891e2279dff3ed1756270f6889731ce: While fetching https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/bf/cAZWIFcQdvJChXuQgNdHiR4ied_z7RdWJw9oiXMc4.index: Server returned HTTP 502
Feb 28 07:45:29 clickhouse1 zincati[14785]:
root@clickhouse1 /home/wildfly/epoq-podman # curl -Iv "https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/bf/cAZWIFcQdvJChXuQgNdHiR4ied_z7RdWJw9oiXMc4.index"
* Host d2uk5hbyrobdzx.cloudfront.net:443 was resolved.
* IPv6: (none)
* IPv4: 18.244.20.212, 18.244.20.19, 18.244.20.87, 18.244.20.208
*   Trying 18.244.20.212:443...
* Connected to d2uk5hbyrobdzx.cloudfront.net (18.244.20.212) port 443
* ALPN: curl offers h2,http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/pki/tls/certs/ca-bundle.crt
*  CApath: none
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256 / x25519 / RSASSA-PSS
* ALPN: server accepted h2
* Server certificate:
*  subject: CN=*.cloudfront.net
*  start date: Jul 30 00:00:00 2024 GMT
*  expire date: Jul  3 23:59:59 2025 GMT
*  subjectAltName: host "d2uk5hbyrobdzx.cloudfront.net" matched cert's "*.cloudfront.net"
*  issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
*  SSL certificate verify ok.
*   Certificate level 0: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 1: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
*   Certificate level 2: Public key type RSA (2048/112 Bits/secBits), signed using sha256WithRSAEncryption
* using HTTP/2
* [HTTP/2] [1] OPENED stream for https://d2uk5hbyrobdzx.cloudfront.net/delta-indexes/bf/cAZWIFcQdvJChXuQgNdHiR4ied_z7RdWJw9oiXMc4.index
* [HTTP/2] [1] [:method: HEAD]
* [HTTP/2] [1] [:scheme: https]
* [HTTP/2] [1] [:authority: d2uk5hbyrobdzx.cloudfront.net]
* [HTTP/2] [1] [:path: /delta-indexes/bf/cAZWIFcQdvJChXuQgNdHiR4ied_z7RdWJw9oiXMc4.index]
* [HTTP/2] [1] [user-agent: curl/8.6.0]
* [HTTP/2] [1] [accept: */*]
> HEAD /delta-indexes/bf/cAZWIFcQdvJChXuQgNdHiR4ied_z7RdWJw9oiXMc4.index HTTP/2
> Host: d2uk5hbyrobdzx.cloudfront.net
> User-Agent: curl/8.6.0
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
< HTTP/2 404
HTTP/2 404
< content-type: text/html; charset=iso-8859-1
content-type: text/html; charset=iso-8859-1
< content-length: 196
content-length: 196
< date: Fri, 28 Feb 2025 07:44:28 GMT
date: Fri, 28 Feb 2025 07:44:28 GMT
< server: Apache
server: Apache
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-frame-options: SAMEORIGIN
x-frame-options: SAMEORIGIN
< x-xss-protection: 1; mode=block
x-xss-protection: 1; mode=block
< x-content-type-options: nosniff
x-content-type-options: nosniff
< referrer-policy: same-origin
referrer-policy: same-origin
< strict-transport-security: max-age=31536000; includeSubDomains; preload
strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-varnish: 31167307 30239695
x-varnish: 31167307 30239695
< via: 1.1 kojipkgs01.iad2.fedoraproject.org (Varnish/7.5), 1.1 11c65b00bf7f76c861a15dcad5558b9c.cloudfront.net (CloudFront)
via: 1.1 kojipkgs01.iad2.fedoraproject.org (Varnish/7.5), 1.1 11c65b00bf7f76c861a15dcad5558b9c.cloudfront.net (CloudFront)
< apptime: D=992
apptime: D=992
< x-fedora-proxyserver: proxy01.iad2.fedoraproject.org
x-fedora-proxyserver: proxy01.iad2.fedoraproject.org
< x-fedora-requestid: Z8Fpt00f888AYGkExD76aAAAAMk
x-fedora-requestid: Z8Fpt00f888AYGkExD76aAAAAMk
< x-cache: Error from cloudfront
x-cache: Error from cloudfront
< x-amz-cf-pop: FRA56-P11
x-amz-cf-pop: FRA56-P11
< x-amz-cf-id: qH7MWPjHzRl03w-JhZIrHFAk0Ba4bBGCGcDNxaspAe0JQxgsPYYrAg==
x-amz-cf-id: qH7MWPjHzRl03w-JhZIrHFAk0Ba4bBGCGcDNxaspAe0JQxgsPYYrAg==
< age: 93
age: 93

<
* Connection #0 to host d2uk5hbyrobdzx.cloudfront.net left intact

I forced a CloudFront invalidation. Let me know whether things are better now or still the same.

@stkoelle Do you still see the issue?
