On Sun, Oct 26, 2014 at 7:40 AM, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:
> On Sun, Oct 26, 2014 at 3:12 AM, Andrey Korolyov <andrey@xxxxxxx> wrote:
>> Thanks Haomai. Turns out that master's recovery is too buggy right
>> now (recovery speed degrades over time, OSDs (non-kv) drop out of
>> the cluster for no reason, the misplaced-object calculation is wrong,
>> and so on), so I am sticking to Giant with RocksDB for now. So far no
>> major problems have been revealed.
>
> Hmm, do you mean kvstore has a problem with OSD recovery? I'm eager to
> know the operations needed to reproduce this situation. Could you
> give more detail?
>
>
>
> --
> Best Regards,
>
> Wheat

I'm not sure whether kv triggered any of those; it's just a side effect
of deploying the master branch (and the OSDs that showed problems were
not only in the kv subset). It looks like both Giant and master expose a
problem with PG recalculation under tight-IO conditions for the MONs
(each MON shares a disk with one of the OSDs, and post-peering
recalculation can take some minutes when kv-based OSDs are involved;
likewise, recalculation from active+remapped to active+degraded(+...)
takes tens of minutes). The same 'non-optimal' setup worked well before,
with all recalculations completing within tens of seconds, so I will
investigate this a bit later. Giant crashed on non-kv daemons during
nightly recovery, so there is more critical stuff to fix right now,
since kv by itself has not exposed any crashes so far.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com