On Friday 02 December 2011 Sage Weil wrote:
> On Fri, 2 Dec 2011, Amon Ott wrote:
> > On Thursday 01 December 2011 you wrote:
> > > On all four nodes of my test cluster, MDS crashes with a trace like
> > > that in bug #1047. Example and ceph.conf attached. Ceph server side is
> > > from git master, last commit ce6572273943ffdca4b7dc5344152d6c35106a2d.
> > >
> > > MDS does not start on any node here, it reliably crashes with that
> > > assert.
> >
> > Does it make sense for you to keep the cluster in that broken state, so
> > that we can reproduce that bug or test a potential fix? Otherwise, I
> > would recreate the Ceph filesystem to make more tests. I also have a full
> > log of one mds from start to crash here.
>
> Can you attach the log to #1047 for posterity? I'll take a quick look and
> see if there is any further info to gain from the log. I'm guessing the
> actual bug occurred before the crash, when the anchor table wasn't updated
> properly, but there may be clues here.

Did you find some time to look into this? The bug makes Ceph unusable for us
even with moderate load. All mds instances die with the same assert; the only
way to recover in that state is to recreate the complete ceph fs and restore
backups.

Amon Ott
-- 
Dr. Amon Ott
m-privacy GmbH           Tel: +49 30 24342334
Am Köllnischen Park 1    Fax: +49 30 24342336
10179 Berlin             http://www.m-privacy.de

Amtsgericht Charlottenburg, HRB 84946

Geschäftsführer:
 Dipl.-Kfm. Holger Maczkowsky,
 Roman Maczkowsky

GnuPG-Key-ID: 0x2DD3A649