There was a patch at some point to pre-split on pg creation (merged in
ad6a2be402665215a19708f55b719112096da3f4). More generally, bluestore is
the answer to this.
-Sam

On Tue, Feb 9, 2016 at 11:34 AM, Lionel Bouton
<lionel-subscription@xxxxxxxxxxx> wrote:
> On 09/02/2016 20:18, Lionel Bouton wrote:
>> On 09/02/2016 20:07, Kris Jurka wrote:
>>>
>>> On 2/9/2016 10:11 AM, Lionel Bouton wrote:
>>>
>>>> Actually, if I understand correctly how PG splitting works, the next
>>>> spike should be <n> times smaller and spread over <n> times the
>>>> period (where <n> is the number of subdirectories created during
>>>> each split, which seems to be 15 according to the OSDs' directory
>>>> layout).
>>>>
>>> I would expect that splitting one directory would take the same
>>> amount of time as it did this time; it's just that now there will be
>>> N times as many directories to split because of the previous splits.
>>> So the duration of the spike would be quite a bit longer.
>>
>> Oops, I missed this bit. I believe you are right: the spike duration
>> should be ~16x longer, but the slowdown should be roughly the same
>> over this new period :-(
>
> As I don't see any way around this, I'm thinking outside the box.
>
> As splitting is costly for you, you might want to try to avoid it (or
> at least limit it to the first occurrence, if your use case can handle
> such a slowdown).
> You can try increasing the PG count of your pool before reaching the
> point where the split starts.
> This would generate data movement, but it might (or might not) slow
> down your access less than what you see when splitting occurs (I'm not
> sure about the exact constraints, but basically Ceph forces you to
> increase the number of PGs in small increments, which should limit the
> performance impact).
>
> Another way to do this with no movement or slowdown is to add pools
> (which basically creates new placement groups without rebalancing
> data), but this means modifying your application so that new objects
> are stored in the new pool (which may or may not be possible depending
> on your actual access patterns).
>
> There are limits to these two suggestions: increasing the number of
> placement groups has costs, so you might want to check with the devs
> how high you can go and whether it fits your constraints.
>
> Lionel.
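
To make Sam's pointer concrete: the pre-split behaviour from that patch
is exposed (on hammer and later, if I remember correctly) as an
expected-num-objects argument to pool creation, and it only takes
effect when the filestore merge threshold is negative, so merges are
disabled and the pre-created directories stay in place. A rough sketch;
the pool name, object count and PG counts below are made-up examples,
and the exact positional arguments vary between releases, so check
`ceph osd pool create --help` on your version:

    # ceph.conf, [osd] section: a negative merge threshold disables
    # subdirectory merging, which the pre-split logic requires.
    filestore merge threshold = -10

    # Create the pool with a hint of how many objects it will
    # eventually hold; filestore then pre-creates the directory
    # hierarchy up front instead of splitting it under load later.
    ceph osd pool create mypool 4096 4096 replicated replicated_ruleset 1000000000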
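
As for predicting the point where the split starts: if I recall the
filestore code correctly, a PG subdirectory splits once it holds more
than filestore_split_multiple * abs(filestore_merge_threshold) * 16
files, so the first wave of splits can be estimated per pool. Worked
through with the (from-memory) defaults and an example PG count:

    # Defaults (verify on your build):
    #   filestore merge threshold = 10
    #   filestore split multiple  = 2
    # Split point per PG subdirectory:
    #   2 * abs(10) * 16 = 320 objects
    # So a pool with 4096 PGs (example figure) starts splitting at
    # roughly 320 * 4096 ~= 1.3 million objects in the pool.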
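
On Lionel's first suggestion: the "small increments" constraint is
real; the monitor rejects a pg_num jump that would create too many new
PGs per OSD at once (governed by mon_osd_max_split_count, if memory
serves), and pgp_num has to be raised separately before data actually
starts moving. A sketch, with a hypothetical pool name:

    # Raise the PG count in steps; the monitor refuses too-large jumps.
    ceph osd pool set mypool pg_num 2048
    # Once the new PGs are created, raise pgp_num to let data
    # rebalance onto them; until then placement is unchanged.
    ceph osd pool set mypool pgp_num 2048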
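
And the add-a-pool variant needs nothing on the cluster side beyond a
second pool; the work is in the application, which has to route new
objects to it (a date-based naming scheme, for instance). Names here
are hypothetical:

    # New objects land in the new pool; existing data stays put, so no
    # rebalancing or splitting is triggered on the old pool.
    ceph osd pool create mypool-2016 1024 1024
    # Application side: write new objects to mypool-2016, keep reading
    # old objects from mypool.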