On Wed, Nov 07, 2012 at 08:04:44AM +1100, Dave Chinner wrote:
> On Tue, Nov 06, 2012 at 04:13:11PM +1100, Dave Chinner wrote:
> > Hi folks,
> >
> > Fourth version of the buffer verifier series. The read verifier
> > infrastructure is described here:
> >
> > http://oss.sgi.com/archives/xfs/2012-10/msg00146.html
> >
> > The second version with write verifiers is described here:
> >
> > http://oss.sgi.com/archives/xfs/2012-10/msg00280.html
> >
> > This version adds write verifiers to all buffers that aren't directly
> > read (i.e. via xfs_buf_get*() interfaces), and drops the log
> > recovery verifiers from the series as it really needs more buffer
> > item format flags to do reliably.
> >
> > The series is just about ready to go - it passes all of xfstests here
> > except for 070. With the addition of the getbuf write verifiers,
> > this series is now detecting a corrupt xfs_da_node buffer being
> > written to disk. It appears to be a new symptom of a known problem,
> > as tracing indicates that the test is triggering the same double
> > split/join pattern as described here:
> >
> > http://oss.sgi.com/archives/xfs/2012-03/msg00347.html
>
> So, 070 isn't hitting this exact problem - I think I have a handle
> on the cause of the problem in the link now (i.e. I have a fix that
> passes all of xfstests without any other problems arising), but the
> reproducer is also causing the same write verifier failures as 070
> and 117. However, all three do a double leaf split operation, so
> that's going to be the underlying cause of the verifier failure.

The underlying cause is the fact that detection of leaf format
attribute trees is unreliable when there are remote attributes. The
detection is based on there being precisely one block at offset 0 in
the attribute fork bmap btree, and when you add remote attributes
that is no longer true, even though the root block of the attribute
tree is still a leaf.
Hence there is code in the node format detection that specifically
handles leaf format trees when doing node format operations. This is
how the xfs_da_node_buf_ops get attached to attribute leaf format
buffers being read from disk - they pass verification because the da
node format verifier sees the leaf magic number and calls the
appropriate verifier instead.

The issue was that the original code I wrote had the read verifier
set the write verifier, so this act of calling the correct read
verifier also set the write verifier correctly. Converting to an ops
structure meant that this implicit rewrite of the write verifier no
longer occurred, and boomy-boom-boom goes the write verifier when the
above situation occurs.

I just posted a V2 patch for 22/22 that fixes this. Now all xfstests
pass with the patch set.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx