This error started appearing once the fedmsg plugin was finally started. It seems that the internal message schema used by mailman may have changed and that the plugin needs to be updated accordingly.
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]: Jul 02 14:23:22 2024 (1199283) Traceback (most recent call last):
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:   File "/usr/lib/python3.9/site-packages/twisted/internet/defer.py", line 1697, in _inlineCallbacks
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     result = context.run(gen.send, result)
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:   File "/usr/lib/python3.9/site-packages/fedora_messaging/twisted/factory.py", line 240, in publish
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     yield protocol.publish(message, exchange)
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:   File "/usr/lib/python3.9/site-packages/twisted/internet/defer.py", line 1947, in unwindGenerator
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     return _cancellableInlineCallbacks(gen)
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:   File "/usr/lib/python3.9/site-packages/twisted/internet/defer.py", line 1857, in _cancellableInlineCal>
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     _inlineCallbacks(None, gen, status, _copy_context())
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]: --- <exception caught here> ---
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:   File "/usr/lib/python3.9/site-packages/fedora_messaging/api.py", line 259, in _twisted_publish
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     yield _twisted_service._service.factory.publish(message, exchange=exchange)
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:   File "/usr/lib/python3.9/site-packages/fedora_messaging/twisted/factory.py", line 240, in publish
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     yield protocol.publish(message, exchange)
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:   File "/usr/lib/python3.9/site-packages/twisted/internet/defer.py", line 1697, in _inlineCallbacks
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     result = context.run(gen.send, result)
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:   File "/usr/lib/python3.9/site-packages/fedora_messaging/twisted/protocol.py", line 137, in publish
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     message.validate()
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:   File "/usr/lib/python3.9/site-packages/fedora_messaging/message.py", line 508, in validate
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     jsonschema.validate(self.body, schema)
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:   File "/usr/lib/python3.9/site-packages/jsonschema/validators.py", line 934, in validate
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     raise error
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]: jsonschema.exceptions.ValidationError: None is not of type 'string'
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]: Failed validating 'type' in schema['properties']['url']:
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     {'description': 'Where the message is archived', 'type': 'string'}
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]: On instance['url']:
Jul 02 14:23:22 mailman01.iad2.fedoraproject.org mailman3[1199283]:     None
As soon as possible, as it's blocking the fedmsg archiver.
After watching mailman for a while today, I found that this happens when the archiver gets a 502 error from http://localhost/archives/api/mailman/urls, which leaves the message without an archive URL.
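To illustrate the failure mode: if the archive URL lookup fails, `url` ends up as `None`, and the schema check (`'type': 'string'`) rejects the message at publish time. A rough sketch of one possible guard follows. It is not the plugin's actual code; `fetch_url_with_retry`, `build_body`, `ArchiveLookupError`, and the body layout are hypothetical names made up for this example, and the retry count and delay are arbitrary.

```python
import time
from typing import Callable, Optional


class ArchiveLookupError(Exception):
    """Hypothetical error: the archive URL could not be obtained."""


def fetch_url_with_retry(
    fetch: Callable[[], Optional[str]],
    attempts: int = 3,
    delay: float = 1.0,
) -> str:
    """Call fetch() until it returns a string, at most `attempts` times.

    `fetch` stands in for whatever asks the archiver for the message URL;
    it is assumed to return None when the lookup fails (e.g. on a 502).
    """
    for attempt in range(1, attempts + 1):
        url = fetch()
        if isinstance(url, str):
            return url
        if attempt < attempts:
            time.sleep(delay)  # give a loaded server a moment before retrying
    raise ArchiveLookupError(f"no archive URL after {attempts} attempts")


def build_body(subject: str, fetch: Callable[[], Optional[str]], **retry) -> dict:
    """Build a message body whose 'url' is guaranteed to be a string.

    The body layout here is an assumption for illustration only.
    """
    return {"subject": subject, "url": fetch_url_with_retry(fetch, **retry)}
```

With a guard like this, a failed lookup surfaces as an explicit archive lookup error rather than as a schema validation traceback deep inside the publish call. It would not fix the underlying 502 responses.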
I think this is related to high load, which is probably caused by the mailman cache being rebuilt from scratch.
I will try disabling the timer that triggers the rebuild for now and see whether the issue persists. If it does, I will write my own cache rebuild script, which will hopefully be more efficient and less resource-hungry.
Even with the cache rebuild job disabled, the errors are still happening, so this needs more investigation.
After some of the changes @kevin made to the machine yesterday, I no longer see any errors. Everything seems to be running as it should.
I have one remaining PR to merge, and I will monitor mailman for a while afterwards, but it seems that all the remaining issues are resolved. Now we just need to wait for the index to finish rebuilding.
Metadata Update from @zlopez: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)
Metadata Update from @zlopez: - Issue untagged with: Needs investigation - Issue tagged with: medium-trouble