On Fri, 9 Mar 2012, Alexandre Oliva wrote:
> On Mar  3, 2012, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>
> > It looks like the problem is that CInode::first isn't being journaled.
> > Normally, that's fine because it matches the referring dentry.. but for
> > multiversion inodes (like snapped directories), it won't match.  On replay
> > we end up with bad value of 2, and it re-cows and clobbers the original
> > old value.
>
> > I pushed wip-1946 with a fix.  Want to give it a go?
>
> Sorry about the delay, I spent the week facing disk full problems that
> followed a major crushmap rearrangement and cluster_snaps that I'd
> rather not remove before the rearrangement was complete.  Fun! :-)
>
> I gave it a go, and I'm afraid it doesn't look like it fixed the
> problem.  bb85a7270 (wip-1946^) is the one patch I tested with, because
> the subsequent patch in wip-1946 failed the assertion during recovery,
> while replaying AFAICT a directory move.  in->first was 2 (the bad value
> you mention above?).

You didn't by chance keep the log?

Anyway, a defensive fix would be to replace that patch's

-      in->first = p->dnfirst;
+      assert(in->first == p->dnfirst ||
+             (in->is_multiversion() && in->first > p->dnfirst));

with

        if (!in->is_multiversion())
          in->first = p->dnfirst;

> Observed behavior was still the same: remove old snapshot, touch dir
> with old timestamp, remount, re-create snapshot, remount, check
> timestamps in snapshot dir, all fine, restart mds, all fine, touch dir,
> remount, all fine, restart mds again, snapshot timestamp changes to
> match.

Sorry I don't have more time to mess with this.  FWIW I'd want to test any
fix by running it through the snap workunits (particularly
snaptest-2.sh).  Those are probably a good smoke test for testing any
changes in this area if it's tedious to reproduce your bug.
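
To make the difference between the asserting patch and the defensive
variant concrete, here is a rough standalone sketch.  The Inode and
JournaledDentry structs are hypothetical stand-ins invented for
illustration (they are not the real CInode or journal types), and the
helper names and snapid values are made up; only the two conditions are
taken from the snippets above.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; NOT the real Ceph types.
using snapid_t = std::uint64_t;

struct Inode {
  snapid_t first;       // plays the role of in->first
  bool multiversion;    // what in->is_multiversion() would report
  bool is_multiversion() const { return multiversion; }
};

struct JournaledDentry {
  snapid_t dnfirst;     // plays the role of p->dnfirst
};

// The condition asserted by the second patch in wip-1946: either the
// inode's first matches the dentry's, or the inode is multiversion and
// its first is newer than the dentry's.
bool replay_invariant_holds(const Inode& in, const JournaledDentry& p) {
  return in.first == p.dnfirst ||
         (in.is_multiversion() && in.first > p.dnfirst);
}

// The defensive variant: never assert; only overwrite first from the
// dentry when the inode is not multiversion.
void replay_first_defensive(Inode& in, const JournaledDentry& p) {
  if (!in.is_multiversion())
    in.first = p.dnfirst;
}
```

In this sketch a multiversion inode whose first is older than dnfirst
fails the asserted condition, while the defensive variant simply leaves
that value alone.  So the defensive variant avoids the crash during
replay but does not by itself correct a bad first on a multiversion
inode.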
sage