On Tue, Jul 12, 2011 at 07:14:19PM -0500, Eric Sandeen wrote: > On 7/12/11 7:12 PM, Dave Chinner wrote: > > On Tue, Jul 12, 2011 at 05:03:38PM -0500, Eric Sandeen wrote: > >> Sending this for review prior to stable submission... > >> > >> A user on #xfs reported that a log replay was oopsing in > >> __rb_rotate_left() with a null pointer deref. > >> > >> I traced this down to the fact that in xfs_alloc_busy_insert(), > >> we erased a node with rb_erase() when the new node overlapped, > >> but left it specified as the parent node for the new insertion. > >> > >> So when we try to insert a new node with an erased node as > >> its parent, obviously things go very wrong. > >> > >> Upstream, > >> 97d3ac75e5e0ebf7ca38ae74cebd201c09b97ab2 xfs: exact busy extent tracking > >> actually fixed this, but as part of a much larger change. Here's > >> the relevant bit: > >> > >> * We also need to restart the busy extent search from the > >> * tree root, because erasing the node can rearrange the > >> * tree topology. > >> */ > >> rb_erase(&busyp->rb_node, &pag->pagb_tree); > >> busyp->length = 0; > >> return false; > >> > >> We can do essentially the same thing to older codebases by restarting > >> the search after the erase. > >> > >> This should apply to .35 through .39, and was tested on .39 > >> with the oopsing replay reproducer. > >> > >> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx> > >> --- > >> > >> Index: linux-2.6/fs/xfs/xfs_alloc.c > >> =================================================================== > >> --- linux-2.6.orig/fs/xfs/xfs_alloc.c > >> +++ linux-2.6/fs/xfs/xfs_alloc.c > >> @@ -2664,6 +2664,12 @@ restart: > >> new->bno + new->length) - > >> min(busyp->bno, new->bno); > >> new->bno = min(busyp->bno, new->bno); > >> + /* > >> + * Start the search over from the tree root, because > >> + * erasing the node can rearrange the tree topology. > >> + */ > >> + spin_unlock(&pag->pagb_lock); > >> + goto restart; > >> } else > >> busyp = NULL; > > > > Looks good. > > > > I'm guessing that the only case I was able to hit during testing of > > this code originally was the "overlap with exact start block match", > > otherwise I would have seen this. I'm not sure that there really is > > much we can do to improve the test coverage of this code, though. > > Hell, just measuring our test coverage so we know what we aren't > > testing would probably be a good start. :/ > > Apparently the original oops, and the subsequent replay oopses, > were on a filesystem VERY busy with torrents. > > Might be a testcase ;) That just means large files. And fragmentation levels are effectively dependent on whether the torrent client uses preallocation or not. Just creating a set of large fragmented file using preallocation, shutting the filesystem down in the middle of it and then doing log replay might do the trick... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs