Hi,

for the past couple of days we have been seeing strange slowness on some radosgw-admin operations.
What is the best way to debug this?

For example, creating a user takes over 20s:

[root@s3db1 ~]# time radosgw-admin user create --uid test-bb-user --display-name=test-bb-user
2021-05-05 14:08:14.297 7f6942286840 1 robust_notify: If at first you don't succeed: (110) Connection timed out
2021-05-05 14:08:14.297 7f6942286840 0 ERROR: failed to distribute cache for eu-central-1.rgw.users.uid:test-bb-user
2021-05-05 14:08:24.335 7f6942286840 1 robust_notify: If at first you don't succeed: (110) Connection timed out
2021-05-05 14:08:24.335 7f6942286840 0 ERROR: failed to distribute cache for eu-central-1.rgw.users.keys:****
{
    "user_id": "test-bb-user",
    "display_name": "test-bb-user",
    ....
}

real 0m20.557s
user 0m0.087s
sys 0m0.030s

At first I thought that rados operations might be slow, but adding and deleting objects in rados is as fast as usual (at least from my perspective). Uploading to buckets is also fine.

We changed some things, and I think it might be related to this:

* We have an HAProxy that distributes via leastconn between the 3 radosgws (this did not change)
* We previously had three daemons running under the same name "eu-central-1" (one on each of the 3 radosgws)
* Because this might have led to our data duplication problem, we split that up, so the daemons are now named per host (eu-central-1-s3db1, eu-central-1-s3db2, eu-central-1-s3db3)
* We also added dedicated rgw daemons for garbage collection, because the existing ones were not able to keep up
* So basically ceph status went from "rgw: 1 daemon active (eu-central-1)" to "rgw: 14 daemons active (eu-central-1-s3db1, eu-central-1-s3db2, eu-central-1-s3db3, gc-s3db12, gc-s3db13...)"

Cheers
Boris