Hi Sage,

Great, we will clean up the code with several of our bug fixes and get back to you for your help soon. So far it runs well in our testing environment. We caught several bugs related to xattr/omap/data mismatches. We have fixed pretty much all of them and are planning to add several new test cases. We are still running tests aggressively on a bigger cluster to see whether any issues remain. If the code passes all of our aggressive test benches, we will move it to our production cluster for further observation.

By the way, we have built an in-house Teuthology at Alibaba with more test cases, including hardware injection, networking failure injection, network switch error injection, server failure injection, etc. Hopefully it can pass all of the tests in a week and cover all of the corner cases.

Regards,
James

On 5/11/17, 10:44 AM, "Sage Weil" <sage@xxxxxxxxxxxx> wrote:

    On Fri, 12 May 2017, LIU, Fei wrote:
    > Hi Piotr,
    > Here you go. We just uploaded the slides to SlideShare for your reference. Please feel free to let us know if you have any comments.
    >
    > https://www.slideshare.net/jupiturliu/ceph-recovery-improvement-v02
    >
    > Regards,
    > James

    Hi James-

    This work is very promising! Putting the write info in the log is probably one of the easier pieces to tackle (and Josh is already looking at a variation of the async recovery). If you have the time I'd love to resurrect that PR and get it into a mergeable state. Is there an open PR with the current code?

    sage