Hi,

we just set up 2 new Ceph clusters (using Rook). To process user activity, we configured a topic that sends events to Kafka.

After 5-12 hours this stops working with a 503 SlowDown response:

debug 2024-08-02T09:17:58.205+0000 7ff4359ad700 1 req 13681579273117692719 0.005000019s ERROR: failed to reserve notification on queue: private.rgw. error: -28

Our first thought was that the queue is full, but up to this point we see messages coming into Kafka, and there is not much activity on the RGW itself (only a few requests against the S3 API), so it can't be a load issue.

What helps is to remove the notification configuration on the buckets (put-bucket-notification-configuration). If we directly re-add the previous notification configuration, it also continues working for a few hours before failing again with the same error/behaviour.

We haven't been able to reproduce this with persistence disabled for the topic, so it looks like it is related to the persistence option; without persistence there would also be no queuing of events for sending to Kafka. This also suggests that the issue is not with Kafka, which is what we suspected first (e.g. that it can't handle the amount of messages).

Does anyone else have or had this issue and found the cause, or have a suggestion on how best to continue debugging? Are there detailed metrics etc. on the size and usage of the event queue?
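For reference, the remove/re-add workaround described above can be sketched roughly as follows. This is only an illustration, not the exact commands we ran: BUCKET and ENDPOINT are placeholders, the `run` helper and the DRY_RUN switch are hypothetical additions so the script only prints the commands by default, and it assumes that putting an empty configuration is how the notifications get removed. CONFIG repeats the bucket configuration listed further down in this message.

```shell
#!/bin/sh
# Sketch of the workaround: clear a bucket's notification configuration,
# then re-apply the previous one. All names below are placeholders.
BUCKET="${BUCKET:-XXX}"
ENDPOINT="${ENDPOINT:-http://rgw.example.invalid}"  # hypothetical RGW S3 endpoint
DRY_RUN="${DRY_RUN:-1}"                             # 1 = only print the commands

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Previous notification configuration of the bucket (same content as the
# get-bucket-notification-configuration output in this message).
CONFIG='{"TopicConfigurations":[{"Id":"my-id","TopicArn":"arn:aws:sns:ceph-objectstore::private.rgw","Events":["s3:ObjectCreated:*","s3:ObjectRemoved:*"]}]}'

# 1. Remove the notifications (assumption: an empty configuration clears them).
run aws --endpoint-url "$ENDPOINT" s3api put-bucket-notification-configuration \
    --bucket "$BUCKET" --notification-configuration '{}'

# 2. Re-add the previous configuration.
run aws --endpoint-url "$ENDPOINT" s3api put-bucket-notification-configuration \
    --bucket "$BUCKET" --notification-configuration "$CONFIG"
```

Running it with DRY_RUN=0 and a real ENDPOINT and BUCKET would execute both calls against the RGW.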
Here is the configuration for the topic and for a bucket:

$ radosgw-admin topic list
{
    "topics": [
        {
            "user": "",
            "name": "private.rgw",
            "dest": {
                "push_endpoint": "kafka://rgw-sasl-kafka-user:XXX@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:9094/private.rgw?sasl.mechanism=SCRAM-SHA-512&mechanism=SCRAM-SHA-512",
                "push_endpoint_args": "OpaqueData=&Version=2010-03-31&kafka-ack-level=broker&persistent=false&push-endpoint=kafka://rgw-sasl-kafka-user:XXX@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:9094/private.rgw?sasl.mechanism=SCRAM-SHA-512&mechanism=SCRAM-SHA-512&use-ssl=true&verify-ssl=true",
                "push_endpoint_topic": "private.rgw",
                "stored_secret": true,
                "persistent": true
            },
            "arn": "arn:aws:sns:ceph-objectstore::private.rgw",
            "opaqueData": ""
        }
    ]
}

$ aws s3api get-bucket-notification-configuration --bucket=XXX
{
    "TopicConfigurations": [
        {
            "Id": "my-id",
            "TopicArn": "arn:aws:sns:ceph-objectstore::private.rgw",
            "Events": [
                "s3:ObjectCreated:*",
                "s3:ObjectRemoved:*"
            ]
        }
    ]
}

Thank you for any input to solve this!

Cheers,
Florian