I am getting repeated fedora-notif messages about the same email I sent to a thread on fedora-devel 2 days ago. At this point I have received 206 such notifications regarding 2 posts to the same thread.
On the devel list, jforbes replied to 'Re: F37 kernel 6.0.16/6.0.18 breaking Python tests: Allows to bind a socket twice' a day ago https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/X7YLNMINUTAJNC6NQSS5VNN6MBLPCRDV/
On the devel list, jforbes replied to 'Re: F37 kernel 6.0.16/6.0.18 breaking Python tests: Allows to bind a socket twice' a day ago https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/RYLUA3JFKLCCBUPLNWJSFMHPOHW3XTN4/
@zlopez could you take a look? I don't understand why it's not unqueuing those after it sends... or where it keeps getting them. ;(
I restarted the workers, just on the off chance it was a sync issue between one of them and the delivery backend.
Can you let us know in a few hours if you have gotten more of these?
I just got 2 more copies of each message (4 total) at 17:18 CST, so after the restart.
Still getting them this morning, up to 274 notifications now.
Metadata Update from @zlopez: - Issue priority set to: Waiting on Assignee (was: Needs Review) - Issue tagged with: Needs investigation
After some digging, I found the reason it's being resent by FMN:
Jan 16 11:36:02 notifs-backend01.iad2.fedoraproject.org celery[495842]: [2023-01-16 11:36:02][ fmn.tasks INFO] Found 2 recipients for message 2023-77eacdd6-8d8c-4d7b-8f73-b2a9f9a10f9b
Jan 16 11:36:02 notifs-backend01.iad2.fedoraproject.org celery[495842]: [2023-01-16 11:36:02][ fmn.tasks INFO] Dispatching messages for 1 recipients for the irc backend
Jan 16 11:36:02 notifs-backend01.iad2.fedoraproject.org celery[495842]: [2023-01-16 11:36:02][ fmn.tasks INFO] Queuing message for delivery to jforbes.id.fedoraproject.org on the irc backend
Jan 16 11:36:02 notifs-backend01.iad2.fedoraproject.org celery[495842]: [2023-01-16 11:36:02][ fmn.tasks INFO] Dispatching messages for 1 recipients for the email backend
Jan 16 11:36:02 notifs-backend01.iad2.fedoraproject.org celery[495842]: [2023-01-16 11:36:02][celery.app.trace INFO] Task fmn.tasks.find_recipients[3522b079-14c4-4d9c-aaf0-b573fd6966c1] retry: Retry in 3600s: ValueError('Header value may not contain linefeed or carriage return characters')
It seems there is some issue with the format of the e-mail header.
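For illustration, Python's standard email package refuses header values with embedded line breaks in a similar way. The sketch below is not FMN's code: it assumes the offending header is a subject carrying a folded line break (the subject text is taken from this ticket, the embedded newline is hypothetical), and `set_header_safely` is a made-up helper showing one possible way to sanitize such a value.

```python
from email.message import EmailMessage


def set_header_safely(msg, name, value):
    """Hypothetical helper: collapse all whitespace runs (including CR/LF)
    into single spaces before storing the header."""
    msg[name] = " ".join(value.split())


# Assumption: the subject reached the email backend with a folded line break.
raw_subject = (
    "Re: F37 kernel 6.0.16/6.0.18 breaking Python tests:\n"
    " Allows to bind a socket twice"
)

msg = EmailMessage()
try:
    msg["Subject"] = raw_subject  # rejected because of the embedded newline
    rejected = False
except ValueError:
    rejected = True

# The sanitized value is accepted.
set_header_safely(msg, "Subject", raw_subject)
```

Whether FMN should sanitize the value or drop the message instead is a design decision for the FMN maintainers; this only shows where a ValueError of this kind can come from.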
Metadata Update from @zlopez: - Issue untagged with: Needs investigation - Issue priority set to: Needs Review (was: Waiting on Assignee)
Metadata Update from @zlopez: - Issue priority set to: Waiting on Assignee (was: Needs Review) - Issue tagged with: dev, low-gain, medium-trouble
Looking into the FMN code, there is a hardcoded retry for 60 days. I will try to remove the message from the retry queue, since it has obviously been delivered already. Hopefully I will find a way to remove it without clearing the whole queue.
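A rough sketch of why this produces so many duplicates, based on one reading of the log above: the irc delivery is queued first, the email step then raises, and the whole task is retried an hour later. The simulation assumes the retry interval stays at the 3600s seen in the log and that each retry repeats the irc dispatch; the function names are hypothetical and this is not FMN's code.

```python
RETRY_INTERVAL_S = 3600   # "Retry in 3600s" from the log
RETRY_WINDOW_DAYS = 60    # hardcoded retry window mentioned above


def max_attempts(window_days, interval_s):
    """Upper bound on retries if the interval stays constant (assumption)."""
    return window_days * 24 * 3600 // interval_s


def simulate(attempts):
    """Count irc notifications sent when every attempt fails at the email step."""
    irc_sent = 0
    for _ in range(attempts):
        irc_sent += 1  # irc dispatch succeeds on every attempt
        try:
            # email dispatch fails on every attempt
            raise ValueError("Header value may not contain linefeed")
        except ValueError:
            continue  # the whole task is scheduled again
    return irc_sent


print(max_attempts(RETRY_WINDOW_DAYS, RETRY_INTERVAL_S))  # 1440
print(simulate(24))  # 24 duplicates per message per day
```

Under these assumptions a single stuck message could be re-sent up to 1440 times before the retry window expires, which is why removing it from the queue matters.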
Metadata Update from @zlopez: - Issue untagged with: dev - Issue tagged with: ops
I found out that you can delete a message from the rabbitmq queue by using the rabbitmq API and curl:
curl -i -u user:password -X POST http://localhost:15672/api/queues/vhost/fmn.tasks.unprocessed_messages/get -d '{"count":1,"ackmode":"ack_requeue_false","encoding":"auto","truncate":50000}'
This needs to be run on notifs-backend01.iad2.fedoraproject.org and it consumes one message from the queue without putting it back.
The issue is that I can't find the user:password for rabbitmq on notifs-backend01.iad2.fedoraproject.org.
I'm also not sure whether this queue is the right one. Without a user I can't check, but I know how to do it:
curl -i -u user:password -X POST http://localhost:15672/api/queues/vhost/fmn.tasks.unprocessed_messages/get -d '{"count":5,"ackmode":"ack_requeue_true","encoding":"auto","truncate":50000}'
This will just show you the messages and put them back in the queue. The important argument here is ackmode, which controls that.
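The two curl calls differ only in count and ackmode, so the same request can be sketched in Python with the standard library. This is a minimal sketch, not a tested procedure for the production host: `build_get_request` is a hypothetical helper, the credentials are placeholders, and the queue URL is the one that appears later in this ticket. Nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import base64
import json
import urllib.request

# Queue URL as given later in this ticket ("%2F" is the default "/" vhost).
QUEUE_URL = (
    "http://localhost:15672/api/queues/%2F/"
    "fmn.tasks.unprocessed_messages/get"
)


def build_get_request(queue_url, user, password, count, requeue):
    """Build (but do not send) a management API 'get messages' request.

    requeue=True  -> ack_requeue_true: messages are shown and put back.
    requeue=False -> ack_requeue_false: messages are consumed and dropped.
    """
    payload = {
        "count": count,
        "ackmode": "ack_requeue_true" if requeue else "ack_requeue_false",
        "encoding": "auto",
        "truncate": 50000,
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        queue_url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Peek at up to 5 messages without removing them (placeholder credentials).
peek = build_get_request(QUEUE_URL, "user", "password", count=5, requeue=True)
# To actually run it on the backend host:
# with urllib.request.urlopen(peek) as resp:
#     print(json.loads(resp.read()))
```

Peeking with requeue=True first and only then issuing the destructive variant keeps the risk of dropping the wrong message low.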
@kevin Could you point me to the user and password for rabbitmq on the FMN machine? I tried the default one for rabbitmq, but got Access refused.
Listing users ...
user	tags
guest	[administrator]
i.e., we have set up no users... there's only 'guest'. It's only listening on localhost, I think...
Not sure if anything new has been done, but definitely still getting the messages, up to 526 now.
@kevin I already tried the default guest login and got 401.
You need to pass "http://localhost:15672/api/queues/%2F/fmn.tasks.unprocessed_messages/get" for the / vhost (default).
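The %2F in that URL is just the percent-encoded form of the vhost name "/". A small sketch of building the URL so the encoding is not forgotten (`queue_get_url` is a hypothetical helper, not part of any RabbitMQ tooling):

```python
from urllib.parse import quote


def queue_get_url(base, vhost, queue):
    """Build the management API 'get messages' URL for a queue.

    safe="" forces "/" to be encoded, so the default vhost becomes "%2F".
    """
    return f"{base}/api/queues/{quote(vhost, safe='')}/{quote(queue, safe='')}/get"


url = queue_get_url("http://localhost:15672", "/", "fmn.tasks.unprocessed_messages")
print(url)
```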
If I do that I get a 200...but it returns [] ;(
I tried a purge_queue on it, but it didn't work...
Purging queue 'fmn.tasks.unprocessed_messages' in vhost '/' ...
fmn.tasks.unprocessed_messages 4
Do we need to stop the backend to purge those?
I stopped the backend and workers, purged the queue, and restarted.
@jforbes can you let us know if it stops now?
This does seem to be resolved. Thank you!
Hurray. Sorry for all the trouble here.
Metadata Update from @kevin: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)