On Thu, 16 Nov 2006 00:49:20 -0800 Mingming Cao <cmm@xxxxxxxxxx> wrote: > On Wed, 2006-11-15 at 23:22 -0800, Andrew Morton wrote: > > On Wed, 15 Nov 2006 22:55:43 -0800 > > Mingming Cao <cmm@xxxxxxxxxx> wrote: > > > > > Hmm, maxblocks, in bitmap_search_next_usable_block(), is the end block > > > number of the range to search, not the lengh of the range. maxblocks > > > get passed to ext2_find_next_zero_bit(), where it expecting to take the > > > _size_ of the range to search instead... > > > > > > Something like this: (this is not a patch) > > > @@ -524,7 +524,7 @@ bitmap_search_next_usable_block(ext2_grp > > > ext2_grpblk_t next; > > > > > > - next = ext2_find_next_zero_bit(bh->b_data, maxblocks, start); > > > + next = ext2_find_next_zero_bit(bh->b_data, maxblocks-start + 1, start); > > > if (next >= maxblocks) > > > return -1; > > > return next; > > > } > > > > yes, the `size' arg to find_next_zero_bit() represents the number of bits > > to scan at `offset'. > > > > So I think your change is correctish. But we don't want the "+ 1", do we? > > > I think we still need the "+1", maxblocks here is the ending block of > the reservation window, so the number of bits to scan =end-start+1. > > > If we're right then this bug could cause the code to scan off the end of the > > bitmap. But it won't explain Hugh's bug, because of the if (next >= maxblocks). > > > > Yeah.. at first I thought it might be related, then, thinked it over, > the bug only makes the bits to scan larger, so if find_next_zero_bit() > returns something off the end of bitmap, that is fine, it just > indicating that there is no free bit left in the rest of bitmap, which > is expected behavior. So bitmap_search_next_usable_block() fail is the > expected. It will move on to next block group and try to create a new > reservation window there. I wonder why it's never oopsed. Perhaps there's always a zero in there for some reason. > That does not explain the repeated reservation window add and remove > behavior Huge has reported. I spent quite some time comparing with ext3. I'm a bit stumped and I'm suspecting that the simplistic porting the code is now OK, but something's just wrong. I assume that the while (1) loop in ext3_try_to_allocate_with_rsv() has gone infinite. I don't see why, but more staring is needed. What lock protects the fields in struct ext[234]_reserve_window from being concurrently modified by two CPUs? None, it seems. Ditto ext[234]_reserve_window_node. i_mutex will cover it for write(), but not for pageout over a file hole. If we end up with a zero- or negative-sized window then odd things might happen. > > btw, how come try_to_extend_reservation() uses spin_trylock? > > Since locks are all allocated from reservation window, when ext3 > multiple blocks allocation was added, we added try_to_extend_reservation > () to ext3, which trying to extend the reservation window size to at > least match the number of blocks to allocate. So we have better chance > to allocating multiple blocks from the window at a time. > > Since all the multiple block allocation is based on best effort basis, > the same applied to try_to_extend_reservation(). It seems no need to > wait for the reservation tree lock if it's not avaible at that moment. > I suspect that was not a good idea - it has added rather different behaviour in the once-in-a-million case. - To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html