Eric Sandeen wrote:
> This is for Red Hat bug 490026,
> EXT4 panic, list corruption in ext4_mb_new_inode_pa
>
> (this was on backported ext4 from 2.6.29)
>
> We hit a BUG() in __list_add from ext4_mb_new_inode_pa()
> because the list head pointed to a removed item:
>
>   list_add corruption. next->prev should be ffff81042f2fe158,
>   but was 0000000000200200
>
> (0000000000200200 is LIST_POISON2, set when the item is deleted)
>
> ext4_lock_group(sb, group) is supposed to protect this list for
> each group, and a common code flow is this:
>
>   ext4_get_group_no_and_offset(sb, pa->pa_pstart, &grp, NULL);
>   ext4_lock_group(sb, grp);
>   list_del(&pa->pa_group_list);
>   ext4_unlock_group(sb, grp);
>
> so it's critical that we get the right group number back for
> this pa->pa_pstart block.
>
> However, ext4_mb_put_pa passes in (pa->pa_pstart - 1) with a
> comment, "-1 is to protect from crossing allocation group"
>
> Other list-manipulators do not use the "-1" so we have the
> potential to lock the wrong group and race. Given how the
> ext4_get_group_no_and_offset() function works, it doesn't seem
> to me that the subtraction is correct.

Hm, unless pa_pstart gets advanced to the point where it's in the
next group when it's used up... might be more reading to do here.

-Eric
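To make the group-number concern concrete, here is a minimal sketch of the arithmetic, not the kernel code itself. It assumes the group is simply (block - first data block) / blocks per group, and it uses made-up geometry (first data block 0, 32768 blocks per group). The names group_of() and minus_one_changes_group() are hypothetical helpers for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry, for illustration only. Real values come from the
 * superblock; 32768 corresponds to 4 KiB blocks. */
#define FIRST_DATA_BLOCK 0ULL
#define BLOCKS_PER_GROUP 32768ULL

/* Hypothetical stand-in for the group half of
 * ext4_get_group_no_and_offset(): integer division by group size. */
static uint64_t group_of(uint64_t block)
{
    assert(block >= FIRST_DATA_BLOCK);
    return (block - FIRST_DATA_BLOCK) / BLOCKS_PER_GROUP;
}

/* Returns 1 when looking up (pstart - 1) yields a different group than
 * looking up pstart, i.e. when the two callers would take different locks. */
static int minus_one_changes_group(uint64_t pstart)
{
    assert(pstart > FIRST_DATA_BLOCK);
    return group_of(pstart - 1) != group_of(pstart);
}
```

Under these assumptions the two lookups disagree only when pstart is the first block of a group: block 32768 maps to group 1 while 32767 maps to group 0. So the "-1" picks the wrong lock for a pa that starts exactly on a group boundary, but it would pick the right one if pa_pstart has been advanced past the end of a pa that finishes at the last block of its group, which is the case the follow-up above wonders about.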