#11873 discourse2fedmsg.fedoraproject.org webhook failing again
Closed: Fixed 11 months ago by abompard. Opened 11 months ago by mattdm.

https://discourse2fedmsg.fedoraproject.org/webhook is failing with

error: Net::ReadTimeout with #<TCPSocket:(closed)>

after what seems to be a 20-second timeout every time. Last time this happened, I think the problem was (wait for it!) ... DNS.

https://discussion.fedoraproject.org/admin/api/web_hooks/2


This has been failing for at least a month -- unfortunately it doesn't alert on failure, and it's easy not to notice something not happening...

Metadata Update from @kevin:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: medium-gain, medium-trouble, ops

11 months ago

So, it's not DNS. ;)

(famous last words).

I re-ran the playbook and got it to roll out a new build and now it's returning 200! However, I am not sure it's actually working.

It has:

Unhandled error in Deferred:

fedora_messaging.exceptions.ConnectionException

and

fedora_messaging.exceptions.ConnectionException
Traceback (most recent call last):
--- <exception caught here> ---
  File "/opt/app-root/lib64/python3.9/site-packages/fedora_messaging/api.py", line 262, in _twisted_publish
    yield _twisted_service._service.factory.publish(message, exchange=exchange)
  File "/opt/app-root/lib64/python3.9/site-packages/fedora_messaging/twisted/factory.py", line 238, in publish
    protocol = yield self.when_connected()
  File "/opt/app-root/lib64/python3.9/site-packages/fedora_messaging/twisted/factory.py", line 203, in when_connected
    yield self._client_deferred
fedora_messaging.exceptions.ConnectionException: 

[2024-04-09 20:56:45,921] WARNING in webhook: Error sending message 7b5b566e-4b26-46ef-b182-82eff9879e44: 

and

[2024-04-09 21:02:05,565] ERROR in app: Exception on /webhook [POST]
Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.9/site-packages/fedora_messaging/api.py", line 316, in publish
    eventual_result.wait(timeout=timeout)
  File "/opt/app-root/lib64/python3.9/site-packages/crochet/_eventloop.py", line 196, in wait
    result = self._result(timeout)
  File "/opt/app-root/lib64/python3.9/site-packages/crochet/_eventloop.py", line 175, in _result
    raise TimeoutError()
crochet._eventloop.TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.9/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/app-root/lib64/python3.9/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/app-root/lib64/python3.9/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/app-root/lib64/python3.9/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/opt/app-root/lib64/python3.9/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/app-root/lib64/python3.9/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/app-root/src/discourse2fedmsg/views/webhook.py", line 67, in webhook
    publish(msg)
  File "/opt/app-root/lib64/python3.9/site-packages/fedora_messaging/api.py", line 324, in publish
    raise wrapper
fedora_messaging.exceptions.PublishTimeout: Publishing timed out after waiting 30 seconds.

Perhaps @abompard or @zlopez could take a closer look?
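
For reference, the logs above show two outcomes for a failed publish: a logged warning on `ConnectionException`, and an unhandled `PublishTimeout` on `/webhook`. A minimal sketch of one way to make both visible to the sender is to map publish failures to a non-2xx response. This is not the discourse2fedmsg code; `publish_or_status` and its parameters are hypothetical names used only for illustration.

```python
import logging
from typing import Callable, Tuple, Type

log = logging.getLogger("webhook")


def publish_or_status(
    publish: Callable[[object], None],
    message: object,
    failure_types: Tuple[Type[Exception], ...],
) -> Tuple[str, int]:
    """Call publish(message) and return a (body, HTTP status) pair.

    Any exception listed in failure_types is logged and reported as 503,
    so the sender's webhook log records a failed delivery instead of a 200.
    Exceptions not listed in failure_types propagate unchanged.
    """
    try:
        publish(message)
    except failure_types as exc:
        log.warning("Error sending message: %r", exc)
        return "publish failed", 503
    return "ok", 200
```

In a Flask view the returned pair can be used directly as the response. Assuming fedora-messaging is installed, the arguments would presumably be `fedora_messaging.api.publish` and a tuple such as `(PublishException, ConnectionException)` from `fedora_messaging.exceptions`.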

The RabbitMQ cert for this use expired on Feb 13 2024... :-/
I'll renew it.
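
Since nothing alerted on this, a small expiry check could catch it earlier next time. The sketch below is only an illustration, not something deployed here: it parses the line printed by `openssl x509 -enddate -noout -in <cert>` and returns how many days remain. The function name and the idea of running it periodically are assumptions.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(enddate_line: str, now: datetime) -> float:
    """Return the number of days before a certificate's notAfter date.

    enddate_line is the output of `openssl x509 -enddate -noout`,
    in the form "notAfter=<Mon> <day> <HH:MM:SS> <year> GMT".
    now must be a timezone-aware datetime.
    A negative result means the certificate has already expired.
    """
    not_after = enddate_line.strip().split("=", 1)[1]
    # ssl.cert_time_to_seconds parses the OpenSSL date format into
    # seconds since the epoch (UTC).
    expires = ssl.cert_time_to_seconds(not_after)
    return (expires - now.timestamp()) / 86400


if __name__ == "__main__":
    # The time of day here is made up; only the date comes from this ticket.
    sample = "notAfter=Feb 13 00:00:00 2024 GMT"
    remaining = days_until_expiry(sample, datetime.now(timezone.utc))
    if remaining < 30:
        print(f"certificate expires in {remaining:.0f} days")
```

The 30-day threshold is arbitrary; whatever runs this would need to turn the result into an actual alert.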

Renewed and redeployed. Looking at the logs, there are some calls to the webhook from Discourse without errors, and we can see the messages in datanommer.

I think the issue is fixed, please reopen if it's not.

Metadata Update from @abompard:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

11 months ago

