On Mon 12-07-10 17:58:46, Lukas Czerner wrote:
> On Mon, 12 Jul 2010, Jan Kara wrote:
>
> > > Walk through each allocation group and trim all free extents. It can be
> > > invoked through TRIM ioctl on the file system. The main idea is to
> > > provide a way to trim the whole file system if needed, since some SSDs
> > > may suffer from performance loss after the whole device was filled (it
> > > does not mean that fs is full!).
> > >
> > > It searches for free extents in each allocation group. When a free
> > > extent is found, blocks are marked as used and then trimmed. Afterwards
> > > these blocks are marked as free in per-group bitmap.
> > >
> > > Signed-off-by: Lukas Czerner <lczerner@xxxxxxxxxx>
> > > ---
> > >  fs/ext3/balloc.c        |  145 +++++++++++++++++++++++++++++++++++++++++++++++
> > >  fs/ext3/super.c         |    1 +
> > >  include/linux/ext3_fs.h |    1 +
> > >  3 files changed, 147 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/fs/ext3/balloc.c b/fs/ext3/balloc.c
> > > index a177122..bcee525 100644
> > > --- a/fs/ext3/balloc.c
> > > +++ b/fs/ext3/balloc.c
> > ...
> > > +		/**
> > > +		 * Allocate contiguous free extents by setting bits in the
> > > +		 * block bitmap
> > > +		 */
> > > +		while (next < max
> > > +			&& !ext3_set_bit_atomic(sb_bgl_lock(sbi, group),
> > > +					next, bh->b_data)) {
> > > +			next++;
> > > +		}
> >   This is actually wrong. You completely ignore journalling here. You can't
> > just go and modify a metadata buffer - another process can be modifying it as
> > well and writing it to disk and thus your changes will also get written. And
> > if a crash happens afterwards before the bitmap is written again, you'll get
> > an inconsistent filesystem.
> >   Also you have to check whether the block isn't actually still used by a
> > running/committing transaction - look at fs/ext3/balloc.c:claim_block() to see
> > how you have to allocate free blocks.
>
> I may be wrong, but I thought that since the trim command ensures that
> every operation in the queue completes before the trim proceeds, I do not
> need to care much about the journaling and running transaction. But I
> will look at it once more.
  Consider just a simple race:

  thread A:                             thread B:
  allocate blocks in group G
                                        set bits for free blocks in group G
  transaction with allocation commits
    - bitmap has bits from thread B set
  -----------------------------------------------
  crash

  After a journal replay we have just leaked blocks set in the bitmap by
thread B... And there are probably races with worse consequences. This is
just the simplest one.

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
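
  To make the race above concrete, here is a toy user-space model of it. This
is not ext3 code: every name in it (toy_group, toy_alloc_journalled,
toy_trim_mark_used, toy_commit, toy_leaked_after_crash) is made up for
illustration, and the journal is reduced to a plain copy of the bitmap buffer
taken at commit time. It only shows why bits set outside a transaction can end
up on disk with nobody left to clear them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCKS_PER_GROUP 8

/* Toy model of one block group (hypothetical, not an ext3 structure). */
struct toy_group {
	uint8_t bitmap[BLOCKS_PER_GROUP]; /* shared in-memory bitmap buffer, 1 = in use */
	uint8_t owned[BLOCKS_PER_GROUP];  /* 1 = a file really references the block */
};

/* Thread A: a journalled allocation; the block gets a real owner. */
static void toy_alloc_journalled(struct toy_group *g, int blk)
{
	assert(blk >= 0 && blk < BLOCKS_PER_GROUP);
	g->bitmap[blk] = 1;
	g->owned[blk] = 1;
}

/* Thread B: the trim path marks every free block as used directly in the
 * shared buffer, outside any transaction. */
static void toy_trim_mark_used(struct toy_group *g)
{
	for (int i = 0; i < BLOCKS_PER_GROUP; i++)
		if (!g->bitmap[i])
			g->bitmap[i] = 1;
}

/* Commit: the journal copies the whole buffer, whatever it contains. */
static void toy_commit(const struct toy_group *g,
		       uint8_t journal[BLOCKS_PER_GROUP])
{
	memcpy(journal, g->bitmap, BLOCKS_PER_GROUP);
}

/*
 * Run the scenario and crash right after it. After replay the on-disk
 * bitmap equals the journal copy, so count blocks that are marked used
 * there but have no owner.
 */
int toy_leaked_after_crash(int trim_before_commit)
{
	struct toy_group g;
	uint8_t journal[BLOCKS_PER_GROUP];
	int leaked = 0;

	memset(&g, 0, sizeof(g));
	toy_alloc_journalled(&g, 0);        /* thread A allocates block 0 */
	if (trim_before_commit)
		toy_trim_mark_used(&g);     /* thread B sneaks in before commit */
	toy_commit(&g, journal);            /* thread A's transaction commits */
	if (!trim_before_commit)
		toy_trim_mark_used(&g);     /* thread B runs after commit instead */
	/* crash: thread B never gets to clear its bits again */

	for (int i = 0; i < BLOCKS_PER_GROUP; i++)
		if (journal[i] && !g.owned[i])
			leaked++;
	return leaked;
}
```

  In this model toy_leaked_after_crash(1) counts the seven blocks thread B had
marked, while toy_leaked_after_crash(0) counts none, because the committed copy
of the bitmap never saw thread B's bits. The real fix is of course not ordering
but doing the bitmap update under the journal like claim_block() does.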