Hello!
In our cluster we had a nasty problem recently due to a very large
number of buckets for a single RadosGW user.
The bucket limit had been disabled earlier, and the number of buckets
grew to the point where OSDs started to go down due to excessive access
times, missed heartbeats, etc.
We have since rectified that problem by first raising the relevant
timeouts to near ridiculous levels so we could get the system to
respond again and by copying all data from that single user to a few
hundred new users. Of course, the old "gigantic" user is still around.
Not sure if this is relevant, but we also have quite a few snapshots on
the rgw pools.
We are now hesitant to delete the problematic user, because we're not
sure how deletion is implemented. Will deleting the user iterate over its
buckets and delete them one by one? If so, we would be in trouble,
because anything but reading from that user's buckets is a good way to
get processes to crash or time out again. If it works at a lower level,
do we need to expect the snapshots to cause trouble, either now or
when we finally get around to throwing out the old ones?
So until we know more about what the implementation does (we're
currently on Hammer 0.94.1) we won't touch that user, but we would like
to get rid of it and the space it is wasting.
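Independent of how user deletion is implemented internally, the fallback
we have been sketching is to drain the user ourselves, one bucket at a
time with a pause in between, rather than letting a single "user rm
--purge-data" iterate over everything at once. A rough sketch (the uid,
sleep interval, and the crude parsing of the bucket list are placeholders
/ assumptions on our side, not a tested procedure):

```shell
# Sketch only -- uid and throttle interval are placeholders.
RGW_UID=gigantic-user

# Listing the user's buckets is read-only and should be safe.
# "bucket list" prints a JSON array of bucket names; the tr call
# below is a rough-and-ready way to strip the JSON punctuation.
for bucket in $(radosgw-admin bucket list --uid="$RGW_UID" | tr -d '[]", '); do
    # Delete one bucket and its objects, then back off so the
    # OSDs can keep up with heartbeats.
    radosgw-admin bucket rm --bucket="$bucket" --purge-objects
    sleep 60   # throttle; tune to whatever the cluster tolerates
done

# Once the user owns no buckets, removing the user record itself
# should be a cheap metadata operation.
radosgw-admin user rm --uid="$RGW_UID"
```

But whether that is actually gentler than "user rm --purge-data" on
Hammer, and how the snapshots interact with either path, is exactly
what we'd like to know before touching anything.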
Thanks a lot in advance!
Daniel
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com