Good call. I just restarted the whole cluster, but the problem persists. I
don't think it is a problem with RADOS itself, but with radosgw. I am still
struggling to pin down the issue.

On Tue, 11 May 2021 at 10:45, Thomas Schneider <
Thomas.Schneider-q2p@xxxxxxxxxxxxxxxxxx> wrote:

> Hey all,
>
> we had slow RGW access when some OSDs were slow due to an OSD bug unknown
> to us that made PG access either slow or impossible. (It also showed
> itself through slowness of the mgr, but nothing beyond that.)
> We restarted all OSDs that held RGW data and the problem was gone.
> I have no good way to debug the problem since it never occurred again
> after we restarted the OSDs.
>
> Kind regards,
> Thomas
>
>
> On 11 May 2021 08:47:06 CEST, Boris Behrens <bb@xxxxxxxxx> wrote:
> >Hi Amit,
> >
> >I just pinged the mons from every system and they are all reachable.
> >
> >On Mon, 10 May 2021 at 21:18, Amit Ghadge <amitg.b14@xxxxxxxxx> wrote:
> >
> >> We have seen slowness when one of the mgr services was unreachable;
> >> your case may be different. You can check the mon entries in the
> >> monmap / ceph.conf and then verify that all nodes ping successfully.
> >>
> >>
> >> -AmitG
> >>
> >>
> >> On Tue, 11 May 2021 at 12:12 AM, Boris Behrens <bb@xxxxxxxxx> wrote:
> >>
> >>> Hi guys,
> >>>
> >>> does anyone have an idea?
> >>>
> >>> On Wed, 5 May 2021 at 16:16, Boris Behrens <bb@xxxxxxxxx> wrote:
> >>>
> >>> > Hi,
> >>> > for a couple of days we have been experiencing strange slowness on
> >>> > some radosgw-admin operations.
> >>> > What is the best way to debug this?
> >>> >
> >>> > For example, creating a user takes over 20s.
> >>> > [root@s3db1 ~]# time radosgw-admin user create --uid test-bb-user \
> >>> >     --display-name=test-bb-user
> >>> > 2021-05-05 14:08:14.297 7f6942286840  1 robust_notify: If at first
> >>> > you don't succeed: (110) Connection timed out
> >>> > 2021-05-05 14:08:14.297 7f6942286840  0 ERROR: failed to distribute
> >>> > cache for eu-central-1.rgw.users.uid:test-bb-user
> >>> > 2021-05-05 14:08:24.335 7f6942286840  1 robust_notify: If at first
> >>> > you don't succeed: (110) Connection timed out
> >>> > 2021-05-05 14:08:24.335 7f6942286840  0 ERROR: failed to distribute
> >>> > cache for eu-central-1.rgw.users.keys:****
> >>> > {
> >>> >     "user_id": "test-bb-user",
> >>> >     "display_name": "test-bb-user",
> >>> >     ....
> >>> > }
> >>> > real    0m20.557s
> >>> > user    0m0.087s
> >>> > sys     0m0.030s
> >>> >
> >>> > First I thought that rados operations might be slow, but adding and
> >>> > deleting objects in rados is as fast as usual (at least from my
> >>> > perspective). Uploading to buckets is also fine.
> >>> >
> >>> > We changed some things, and I think it might have to do with this:
> >>> > * We have an HAProxy that distributes via leastconn between the 3
> >>> >   radosgw's (this did not change)
> >>> > * We previously ran three daemons that all had the name
> >>> >   "eu-central-1" (on the 3 radosgw's)
> >>> > * Because this might have led to our data duplication problem, we
> >>> >   have split that up, so the daemons are now named per host
> >>> >   (eu-central-1-s3db1, eu-central-1-s3db2, eu-central-1-s3db3)
> >>> > * We also added dedicated rgw daemons for garbage collection,
> >>> >   because the existing ones were not able to keep up.
> >>> > * So basically ceph status went from "rgw: 1 daemon active
> >>> >   (eu-central-1)" to "rgw: 14 daemons active (eu-central-1-s3db1,
> >>> >   eu-central-1-s3db2, eu-central-1-s3db3, gc-s3db12, gc-s3db13...)"
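Since the slowness always surfaces together with "failed to distribute
cache" lines, a first step is to collect which cache objects keep failing
across the daemon logs. A minimal sketch (the regex and helper name are my
own, matching only the log format shown in the output above):

```python
import re

# Assumed log format, taken from the radosgw-admin output quoted above:
# "... 0 ERROR: failed to distribute cache for <object>"
CACHE_ERR = re.compile(r"ERROR: failed to distribute cache for (?P<obj>\S+)")

def failed_cache_objects(log_text):
    """Return the cache objects for which distribution failed, in order."""
    return [m.group("obj") for m in CACHE_ERR.finditer(log_text)]

# Sample taken verbatim from the output above.
sample = """\
2021-05-05 14:08:14.297 7f6942286840  0 ERROR: failed to distribute cache for eu-central-1.rgw.users.uid:test-bb-user
2021-05-05 14:08:24.335 7f6942286840  0 ERROR: failed to distribute cache for eu-central-1.rgw.users.keys:****
"""

print(failed_cache_objects(sample))
```

If the same objects (or the same peer daemons, once you correlate with
debug logs) show up repeatedly, that points at which gateway is not
acknowledging the cache notifications.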
> >>> >
> >>> >
> >>> > Cheers
> >>> > Boris
> >>> >
> >>>
> >>>
> >>> --
> >>> The self-help group "UTF-8 problems" will, as an exception, meet in
> >>> the large hall this time.
> >>> _______________________________________________
> >>> ceph-users mailing list -- ceph-users@xxxxxxx
> >>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >>>
> >>
> >
>
> --
> Thomas Schneider
> IT.SERVICES
> Wissenschaftliche Informationsversorgung Ruhr-Universität Bochum | 44780
> Bochum
> Phone: +49 234 32 23939
> http://www.it-services.rub.de/
>

--
The self-help group "UTF-8 problems" will, as an exception, meet in the
large hall this time.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx