MBS does not accept new builds:
$ time fedpkg module-build Submitting the module build... Could not execute module_build: HTTPSConnectionPool(host='mbs.fedoraproject.org', port=443): Read timed out. (read timeout=120) real 2m1,900s user 0m0,503s sys 0m0,114s
Though the end-point can be read:
$ HEAD https://mbs.fedoraproject.org:443/ 200 OK Connection: close Date: Thu, 27 May 2021 12:27:06 GMT Server: Apache Content-Length: 38688 Content-Type: application/json Access-Control-Allow-Origin: * Apptime: D=1542490 Client-Date: Thu, 27 May 2021 12:27:07 GMT
@breilly Could you please take a look at this? Thanks.
Metadata Update from @mohanboddu: - Issue priority set to: Waiting on Assignee (was: Needs Review) - Issue tagged with: medium-gain, medium-trouble, ops
Metadata Update from @mizdebsk: - Issue tagged with: mbs
I see a number of builds now... is this problem still happening?
What module were you trying to build?
Metadata Update from @kevin: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)
That peculiar. I still experience the problem. I'm building perl-CGI:4.51:
$ fedpkg --verbose --debug module-build Creating repo object from /home/petr/fedora-modules/perl-CGI Submitting the module build... Could not execute module_build: HTTPSConnectionPool(host='mbs.fedoraproject.org', port=443): Read timed out. (read timeout=120) Traceback (most recent call last): File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 421, in _make_request six.raise_from(e, None) File "<string>", line 3, in raise_from File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 416, in _make_request httplib_response = conn.getresponse() File "/usr/lib64/python3.9/http/client.py", line 1345, in getresponse response.begin() File "/usr/lib64/python3.9/http/client.py", line 307, in begin version, status, reason = self._read_status() File "/usr/lib64/python3.9/http/client.py", line 268, in _read_status line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1") File "/usr/lib64/python3.9/socket.py", line 704, in readinto return self._sock.recv_into(b) File "/usr/lib64/python3.9/ssl.py", line 1241, in recv_into return self.read(nbytes, buffer) File "/usr/lib64/python3.9/ssl.py", line 1099, in read return self._sslobj.read(len, buffer) socket.timeout: The read operation timed out During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 439, in send resp = conn.urlopen( File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 719, in urlopen retries = retries.increment( File "/usr/lib/python3.9/site-packages/urllib3/util/retry.py", line 400, in increment raise six.reraise(type(error), error, _stacktrace) File "/usr/lib/python3.9/site-packages/urllib3/packages/six.py", line 703, in reraise raise value File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 665, in urlopen httplib_response = self._make_request( File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 423, in _make_request self._raise_timeout(err=e, url=url, timeout_value=read_timeout) File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 330, in _raise_timeout raise ReadTimeoutError( urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='mbs.fedoraproject.org', port=443): Read timed out. (read timeout=120) During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/bin/fedpkg", line 33, in <module> sys.exit(load_entry_point('fedpkg==1.40', 'console_scripts', 'fedpkg')()) File "/usr/lib/python3.9/site-packages/fedpkg/__main__.py", line 87, in main sys.exit(client.args.command()) File "/usr/lib/python3.9/site-packages/pyrpkg/cli.py", line 2285, in module_build build_ids = self._cmd.module_submit_build( File "/usr/lib/python3.9/site-packages/pyrpkg/__init__.py", line 3782, in module_submit_build resp = self.module_send_authorized_request( File "/usr/lib/python3.9/site-packages/pyrpkg/__init__.py", line 3693, in module_send_authorized_request resp = oidc.send_request( File "/usr/lib/python3.9/site-packages/openidc_client/__init__.py", line 232, in send_request resp = requests.request(method, *args, **ckwargs) File "/usr/lib/python3.9/site-packages/requests/api.py", line 61, in request return session.request(method=method, url=url, **kwargs) File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 530, in request resp = self.send(prep, **send_kwargs) File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 643, in send r = adapter.send(request, **kwargs) File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 529, in send raise ReadTimeout(e, request=request) requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='mbs.fedoraproject.org', port=443): Read timed out. (read timeout=120)
While debugging it I started to get 503 HTTP failure from [2605:bc80:3010:600:dead:beef:cafe:fed9]:https:
$ fedpkg module-build Could not execute module_build: The following error occurred while trying to determine the API versions supported by MBS: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html> [...] <body class="503"> [...]
And then it again timed out and now it again returns 503. What does happen? Did I kill the backend by repeating the timing out requests?
I suspected an IPv6 Patch MTU discovery issues, so I enforced IPv4 connection, but I still experience the time out. Unfortunatelly this is HTTPS, so I have no idea what happens inside:
16:50:53.025630 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [S], seq 2013105343, win 64240, options [mss 1460,sackOK,TS val 3344247745 ecr 0,nop,wscale 7], length 0 16:50:53.137014 IP 67.219.144.68.https > 10.0.0.226.55084: Flags [S.], seq 2767071507, ack 2013105344, win 65160, options [mss 1460,sackOK,TS val 708511604 ecr 3344247745,nop,wscale 7], length 0 16:50:53.137110 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [.], ack 1, win 502, options [nop,nop,TS val 3344247856 ecr 708511604], length 0 16:50:53.162548 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [P.], seq 1:518, ack 1, win 502, options [nop,nop,TS val 3344247881 ecr 708511604], length 517 16:50:53.282437 IP 67.219.144.68.https > 10.0.0.226.55084: Flags [.], ack 518, win 506, options [nop,nop,TS val 708511750 ecr 3344247881], length 0 16:50:53.309915 IP 67.219.144.68.https > 10.0.0.226.55084: Flags [P.], seq 1:2897, ack 518, win 506, options [nop,nop,TS val 708511773 ecr 3344247881], length 2896 16:50:53.309972 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [.], ack 2897, win 496, options [nop,nop,TS val 3344248029 ecr 708511773], length 0 16:50:53.309916 IP 67.219.144.68.https > 10.0.0.226.55084: Flags [P.], seq 2897:4057, ack 518, win 506, options [nop,nop,TS val 708511773 ecr 3344247881], length 1160 16:50:53.310013 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [.], ack 4057, win 489, options [nop,nop,TS val 3344248029 ecr 708511773], length 0 16:50:53.312122 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [P.], seq 518:598, ack 4057, win 501, options [nop,nop,TS val 3344248031 ecr 708511773], length 80 16:50:53.312834 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [P.], seq 598:801, ack 4057, win 501, options [nop,nop,TS val 3344248032 ecr 708511773], length 203 16:50:53.424472 IP 67.219.144.68.https > 10.0.0.226.55084: Flags [.], ack 598, win 506, options [nop,nop,TS val 708511892 ecr 3344248031], length 0 16:50:53.424473 IP 67.219.144.68.https > 10.0.0.226.55084: Flags [.], ack 801, win 505, options [nop,nop,TS val 708511892 ecr 3344248032], length 0 16:50:53.424473 IP 67.219.144.68.https > 10.0.0.226.55084: Flags [P.], seq 4057:4360, ack 801, win 505, options [nop,nop,TS val 708511893 ecr 3344248032], length 303 16:50:53.424553 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [.], ack 4360, win 501, options [nop,nop,TS val 3344248143 ecr 708511893], length 0 16:50:53.424775 IP 67.219.144.68.https > 10.0.0.226.55084: Flags [P.], seq 4360:4663, ack 801, win 505, options [nop,nop,TS val 708511893 ecr 3344248032], length 303 16:50:53.424810 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [.], ack 4663, win 501, options [nop,nop,TS val 3344248144 ecr 708511893], length 0 16:50:53.481515 IP 67.219.144.68.https > 10.0.0.226.55084: Flags [P.], seq 4663:5343, ack 801, win 505, options [nop,nop,TS val 708511949 ecr 3344248032], length 680 16:50:53.481583 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [.], ack 5343, win 501, options [nop,nop,TS val 3344248200 ecr 708511949], length 0 16:50:53.483708 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [F.], seq 801, ack 5343, win 501, options [nop,nop,TS val 3344248203 ecr 708511949], length 0 16:50:53.535871 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [S], seq 1195844511, win 64240, options [mss 1460,sackOK,TS val 3344248255 ecr 0,nop,wscale 7], length 0 16:50:53.595535 IP 67.219.144.68.https > 10.0.0.226.55084: Flags [P.], seq 5343:5367, ack 802, win 505, options [nop,nop,TS val 708512064 ecr 3344248203], length 24 16:50:53.595595 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [R], seq 2013106145, win 0, length 0 16:50:53.596109 IP 67.219.144.68.https > 10.0.0.226.55084: Flags [F.], seq 5367, ack 802, win 505, options [nop,nop,TS val 708512064 ecr 3344248203], length 0 16:50:53.596124 IP 10.0.0.226.55084 > 67.219.144.68.https: Flags [R], seq 2013106145, win 0, length 0 16:50:53.646953 IP 67.219.144.68.https > 10.0.0.226.55088: Flags [S.], seq 977630917, ack 1195844512, win 65160, options [mss 1460,sackOK,TS val 708512115 ecr 3344248255,nop,wscale 7], length 0 16:50:53.647038 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [.], ack 1, win 502, options [nop,nop,TS val 3344248366 ecr 708512115], length 0 16:50:53.670813 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [P.], seq 1:518, ack 1, win 502, options [nop,nop,TS val 3344248390 ecr 708512115], length 517 16:50:53.784041 IP 67.219.144.68.https > 10.0.0.226.55088: Flags [.], ack 518, win 506, options [nop,nop,TS val 708512251 ecr 3344248390], length 0 16:50:53.805542 IP 67.219.144.68.https > 10.0.0.226.55088: Flags [.], seq 1:1449, ack 518, win 506, options [nop,nop,TS val 708512273 ecr 3344248390], length 1448 16:50:53.805600 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [.], ack 1449, win 501, options [nop,nop,TS val 3344248524 ecr 708512273], length 0 16:50:53.805800 IP 67.219.144.68.https > 10.0.0.226.55088: Flags [P.], seq 1449:2897, ack 518, win 506, options [nop,nop,TS val 708512273 ecr 3344248390], length 1448 16:50:53.805825 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [.], ack 2897, win 496, options [nop,nop,TS val 3344248525 ecr 708512273], length 0 16:50:53.805800 IP 67.219.144.68.https > 10.0.0.226.55088: Flags [P.], seq 2897:4057, ack 518, win 506, options [nop,nop,TS val 708512273 ecr 3344248390], length 1160 16:50:53.805840 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [.], ack 4057, win 489, options [nop,nop,TS val 3344248525 ecr 708512273], length 0 16:50:53.807660 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [P.], seq 518:598, ack 4057, win 501, options [nop,nop,TS val 3344248527 ecr 708512273], length 80 16:50:53.808481 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [P.], seq 598:956, ack 4057, win 501, options [nop,nop,TS val 3344248527 ecr 708512273], length 358 16:50:53.808725 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [P.], seq 956:1120, ack 4057, win 501, options [nop,nop,TS val 3344248528 ecr 708512273], length 164 16:50:53.916962 IP 67.219.144.68.https > 10.0.0.226.55088: Flags [.], ack 598, win 506, options [nop,nop,TS val 708512386 ecr 3344248527], length 0 16:50:53.918100 IP 67.219.144.68.https > 10.0.0.226.55088: Flags [P.], seq 4057:4360, ack 598, win 506, options [nop,nop,TS val 708512386 ecr 3344248527], length 303 16:50:53.918149 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [.], ack 4360, win 501, options [nop,nop,TS val 3344248637 ecr 708512386], length 0 16:50:53.918392 IP 67.219.144.68.https > 10.0.0.226.55088: Flags [P.], seq 4360:4663, ack 598, win 506, options [nop,nop,TS val 708512386 ecr 3344248527], length 303 16:50:53.918418 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [.], ack 4663, win 501, options [nop,nop,TS val 3344248637 ecr 708512386], length 0 16:50:53.929275 IP 67.219.144.68.https > 10.0.0.226.55088: Flags [.], ack 956, win 504, options [nop,nop,TS val 708512398 ecr 3344248527], length 0 16:50:53.929275 IP 67.219.144.68.https > 10.0.0.226.55088: Flags [.], ack 1120, win 503, options [nop,nop,TS val 708512398 ecr 3344248528], length 0 [Here it times out] 16:52:53.910519 IP 10.0.0.226.55088 > 67.219.144.68.https: Flags [F.], seq 1120, ack 4663, win 501, options [nop,nop,TS val 3344368629 ecr 708512398], length 0 16:52:54.065205 IP 67.219.144.68.https > 10.0.0.226.55088: Flags [.], ack 1121, win 503, options [nop,nop,TS val 708632532 ecr 3344368629], length 0
Is there a way od debugging the fedpkg communication?
Actually just a few minutes ago it stopped responding at all... and I had to reboot the vm. ;(
Does it seem any better now?
Still 503.
It completely crashed again. ;(
I doubled it's memory and cpus.
See now?
Now it's alive. I located the only place in /usr/lib/python3.9/site-packages/pyrpkg/init.py which sets timeout to 120 second. It's in module_send_authorized_request() method. I added a debugging like this:
url = self.module_get_url(build_id=None, action='POST') → print("HIT module_send_authorized_request()", 'POST', url, body, auth_method, oidc_id_provider, oidc_client_id, oidc_client_secret, oidc_scopes, 120) resp = self.module_send_authorized_request( 'POST', url, body, auth_method, oidc_id_provider, oidc_client_id, oidc_client_secret, oidc_scopes, timeout=120) print("FINISHED")
and executed it:
$ fedpkg module-build Submitting the module build... HIT module_send_authorized_request() POST https://mbs.fedoraproject.org/module-build-service/2/module-builds/ {'scmurl': 'https://src.fedoraproject.org/modules/perl-CGI.git?#eb6b8198854b1e57a7cac9ae191b23e76a9961f5', 'branch': '4.51', 'scratch': False} oidc https://id.fedoraproject.org/openidc/ mbs-authorizer notsecret ['openid', 'https://id.fedoraproject.org/scope/groups', 'https://mbs.fedoraproject.org/oidc/submit-build'] 120
And it halts there and after a timeout fedpkg prints:
Could not execute module_build: HTTPSConnectionPool(host='mbs.fedoraproject.org', port=443): Read timed out. (read timeout=120)
So I identified the problematic request. I won't retry it again to keep it working for others. But please forward this issue to MBS maintainers. The https://src.fedoraproject.org/modules/perl-CGI.git?#eb6b8198854b1e57a7cac9ae191b23e76a9961f5 exhibits a new module format and MBS might experience a leak or a loop on it.
I see what the problem is - this is a build using the newest packagerv3 format. MBS Fedora deployments are a bit behind, so the changes required are yet to be deployed. I'll see about getting a newer MBS deployed ASAP.
Metadata Update from @ppisar: - Issue status updated to: Open (was: Closed)
@ppisar the latest MBS has been deployed to prod, it should be able to handle PackagerV3 module builds now.
Thanks. I confirm that now it accepts the build. Though it's sluggish – it took 74 seconds to accept it. Maybe there is still something suboptimal in MBS.
MBS tasks 12151 and 12152 got stuck after building all components. When cancelling 12151, I get a timeout:
$ fedpkg module-build-cancel 12151 Cancelling module build #12151... Could not execute module_build_cancel: HTTPSConnectionPool(host='mbs.fedoraproject.org', port=443): Read timed out. (read timeout=60)
Cancelling 12152 succeeded.
Can you resolve it?
I tried canceling 12151 again and it reported that's already cancelled. Then I tried resuming the builds again to finish them and the submission failed with "500 Internal Server Error". Submitted again and it accepted. MBS is faulty.
And again. Cancelling 12151 passed. Cancelling 12152 timed out 4 times. And now 500 for any request.
@ppisar I'm seeing a repeated error which is likely causing the task to hang: TagError: Package perl-CGI not in list for eln-modular-updates-candidate
TagError: Package perl-CGI not in list for eln-modular-updates-candidate
Can you confirm that package is in the list?
Not sure about the 500s yet, investigating that.
perl-CGI component is not on the list:
$ koji list-pkgs --tag eln-modular-updates-candidate --show-blocked |grep perl perl-Module-Install eln-modular releng perl-Module-Install-bootstrap eln-modular releng perl-Module-Package eln-modular releng perl eln-modular releng perl-IO-Socket-SSL eln-modular releng perl-Moo eln-modular releng perl-Unicode-EastAsianWidth eln-modular releng perl-libintl-perl eln-modular releng perl-bootstrap eln-modular releng perl-DBD-MySQL eln-modular releng perl-FCGI eln-modular releng perl-DBI eln-modular releng perl-DBD-Pg eln-modular releng perl-DBD-SQLite eln-modular releng perl-YAML-bootstrap eln-modular releng perl-App-cpanminus eln-modular releng perl-YAML eln-modular releng perl-libwww-perl eln-modular releng
I did not know that. I will stop building modules for components not listed there. It would be great if MBS forwarded the tagging error instead of hanging on it.
Agreed, I filed an issue to change the behavior there. @ppisar are you still experiencing issues with the build?
After excluding platform:eln context, I do not experience issues with the build.
Shall we keep this open to track things? Or close it in favor of the above issue?
@breilly any preference?
Lets just close this and track upstream...
Metadata Update from @kevin: - Issue close_status updated to: Upstream - Issue status updated to: Closed (was: Open)
Log in to comment on this ticket.