Re: [PATCH] fix bmap-vs-truncate race

On Wed, 1 Apr 2009, Al Viro wrote:

> On Tue, Mar 31, 2009 at 06:42:34PM -0400, Mikulas Patocka wrote:
> > 
> > There is a lot of text about directories, but nothing about locking of 
> > block mappings.
> > 
> > I was living under an impression that get_block() cannot be called on a 
> > block that is being truncated. That's what read/write/direct-io vs 
> > truncate seems to guarantee --- truncate will first lower i_size 
> > (preventing any new pages past i_size from being created), then destroy 
> > any existing pages past i_size (that includes waiting for pagelock until 
> > all get_blocks on that page end) and finally truncate the metadata on the 
> > filesystem.
> > 
> > So there should be no situation where you truncate a block and call 
> > get_block on it simultaneously. If get_block can race with truncate, 
> > document it.
> > 
> > There are filesystems that don't do any locking on get_block() (for 
> > example UFS, HPFS; FAT does it only for bmap and doesn't do it for general 
> > accesses) and other filesystems verify indirect block chains obsessively 
> > if they were truncated under get_block (why? because of bmap? or some 
> > other possibility?) --- so the rules should really be documented.
> 
> Indirect chain stuff used to be [1] about truncate that *wouldn't* affect page
> in question.  Look: we have e.g. 4Kb blocks and data at offset 80Kb.  We do
> allocation at offset 40Kb *and* truncate to 60Kb at the same time.
> 
> Both 40Kb (block 10) and 80Kb (block 20) are covered by the first indirect
> block.  It's there, so get_block() reads it and gets ready to allocate
> a block and put its number in the very beginning of indirect block.  In
> the meanwhile, truncate() sees that the boundary falls within the first
> indirect block (at entry 15).  It sees that we have no blocks prior to
> that, so the indirect block ought to be freed.
> 
> Now ext2_get_block() comes back with allocated data block and has nowhere
> to stick it anymore - indirect one just got freed.

I see. So if we change ext2_truncate to not delete indirect blocks that 
map only partially truncated space, we could drop that verify_chain().

Upside: get rid of up to 3 spinlocks & associated cache bounce from every 
get_block call.

Downside: truncate with sparse files would occasionally produce an empty 
indirect block. Is it legal to have an indirect block full of zero 
pointers on ext2? Or would fsck complain about it?

> _That_ is where verify_chain() came from.  As far as anything outside of
> ext2 can know, this truncate() won't come anywhere near the page we are
> working with.  And it won't - for data, that is.

True. Except for the bmap case. bmap should be either documented or 
fixed with my proposed patch.

> Disclaimer: this code has been changed several times since the last time
> I worked with it, so this might not match the current situation anymore.
> 
> [1] see disclaimer above.

Mikulas