On Sat, Nov 22, 2008 at 09:49:11PM -0500, Theodore Tso wrote: > On Sat, Nov 22, 2008 at 03:46:25PM -0500, Theodore Tso wrote: > > On Fri, Nov 21, 2008 at 10:10:46PM +0530, Aneesh Kumar K.V wrote: > > > Indicate that the group locks can be taken in loop. > > > > I've been looking at this patch more closely, and I think there's a > > major problem here. > > OK, after looking at this in yet more detail (and having changed > planes in Dallas :-), I am more than ever convinced this patch is not > rightq. We have an rw_sem for each block group, grp->alloc_sem, which > is allocated in groups of meta blockgroups. The whole reason why we > should worry about keeping them in the same class is we should worry > about is if for some reason, the multiblock allocator happens to > allocate two block group's alloc_sem, but one does them out of order > (say, bg 4, then bg 2, while another does bg 2, then 4), we would get > a dead lock. > > I'm guessing that what caused the problem for you was > ext4_mb_init_group(), which if you are using 1k filesystems, tries to > grab multiple grp->alloc_sem's. In each place where we find those, we > need to use down_write_nested --- see Documentation/lockdep-design.txt. Correct > > If there are any other places in mballoc.c which grabs multiple > alloc_sem's at the same time, we'll have to use define new subclasses. No. That is the only call site. How about the below patch. We can have more than 2 groups in a page depending on the page size and blocksize. So instead of using single_depth I guess we should use the relative group number ?. diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 1fa311c..891ce41 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -1783,7 +1783,7 @@ static int ext4_mb_init_group(struct super_block *sb, ext4_group_t group) * no block allocation going on in any * of that groups */ - down_write(&grp->alloc_sem); + down_write_nested(&grp->alloc_sem, i); } /* * make sure we look at only those groups -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html