On Wed, 18 Jul 2018, Linh Vu wrote:
> Thanks for all your hard work in putting out the fixes so quickly! :)
>
> We have a cluster on 12.2.5 with Bluestore and EC pool but for CephFS,
> not RGW. In the release notes, it says RGW is a risk especially the
> garbage collection, and the recommendation is to either pause IO or
> disable RGW garbage collection.
>
> In our case with CephFS, not RGW, is it a lot less risky to perform the
> upgrade to 12.2.7 without the need to pause IO?

It is hard to quantify. I think we only saw the problem with RGW, but
CephFS also sends deletes to non-existent objects when deleting or
truncating sparse files. Those are probably not too common in most
environments...

> What does pause IO do? Do current sessions just get queued up and IO
> resume normally with no problem after unpausing?

Exactly. As long as the application doesn't have a timeout coded in
where it gives up when a read or write is taking too long, everything
will just pause.

> If we have to pause IO, is it better to do something like: pause IO,
> restart OSDs on one node, unpause IO - repeated for all the nodes
> involved in the EC pool?

Yes, that sounds like a great way to proceed!

sage
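
For illustration only, the pause / restart / unpause loop discussed above could be sketched as a small plan generator. This is not an official procedure. The host names, the use of ssh, and the helper name rolling_restart_plan are assumptions made for the example. It also assumes the OSDs are systemd-managed via ceph-osd.target and that IO is paused with "ceph osd pause". The script only prints the commands; it does not run them, and in practice you would probably want to confirm the restarted OSDs are back up before unpausing.

```python
from typing import List


def rolling_restart_plan(hosts: List[str]) -> List[str]:
    """Return the ordered commands for a pause/restart/unpause cycle per host.

    Hypothetical helper: it only builds a list of command strings so the
    sequence can be reviewed before anything is executed.
    """
    plan: List[str] = []
    for host in hosts:
        # Pause client IO cluster-wide (sets the pauserd/pausewr flags).
        plan.append("ceph osd pause")
        # Restart every OSD daemon on this one node (assumes systemd units
        # and ssh access; both are assumptions, not from the thread).
        plan.append(f"ssh {host} systemctl restart ceph-osd.target")
        # Resume client IO before moving on to the next node.
        plan.append("ceph osd unpause")
    return plan


if __name__ == "__main__":
    # Placeholder host names; substitute the nodes backing the EC pool.
    for command in rolling_restart_plan(["osd-node-1", "osd-node-2"]):
        print(command)
```

Building the list first rather than executing directly keeps the sketch side-effect free, so the order of steps can be checked against the cluster layout before any command is run by hand.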