Re: [RFC 2/3] ext2: Convert ext2 regular file buffered I/O to use iomap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2023/11/30 12:30, Christoph Hellwig wrote:
> On Thu, Nov 30, 2023 at 08:54:31AM +0530, Ritesh Harjani wrote:
>> So I took a look at this. Here is what I think -
>> So this is useful of-course when we have a large folio. Because
>> otherwise it's just one block at a time for each folio. This is not a
>> problem once FS buffered-io handling code moves to iomap (because we
>> can then enable large folio support to it).
> 
> Yes.
> 
>> However, this would still require us to pass a folio to ->map_blocks
>> call to determine the size of the folio (which I am not saying can't be
>> done but just stating my observations here).
> 
> XFS currently maps based on the underlyig reservation (delalloc extent)
> and not the actual map size.   This works because on-disk extents are
> allocated as unwritten extents, and only the actual written part is
> the converted.  But if you only want to allocate blocks for the part

IIUC, I noticed a side effect of this map method, Let's think about a
special case, if we sync a partial range of a delalloc extent, and then
the system crashes before the remaining data write back. We could get a
file which has allocated blocks(unwritten) beyond EOF. I can reproduce
it on xfs.

 # write 10 blocks, but only sync 3 blocks, i_size becomes 12K after IO complete
 xfs_io -f -c "pwrite 0 40k" -c "sync_range 0 12k" /mnt/foo
 # postpone the remaining data writeback
 echo 20000 > /proc/sys/vm/dirty_writeback_centisecs
 echo 20000 > /proc/sys/vm/dirty_expire_centisecs
 # wait 35s to make sure xfs's cil log writeback
 sleep 35
 # triger system crash
 echo c > /proc/sysrq-trigger

After system reboot, the 'foo' file's size is 12K but have 10
allocated blocks.

 xfs_db> inode 131
 core.size = 12288
 core.nblocks = 10
 ...
 0:[0,24,3,0]
 1:[3,27,7,1]

I'm not expert on xfs, but I don't think this is the right result,
is it?

Thanks,
Yi.

> actually written you actually need to pass in the dirty range and not
> just use the whole folio.  This would be the incremental patch to do
> that:
> 
> http://git.infradead.org/users/hch/xfs.git/commitdiff/0007893015796ef2ba16bb8b98c4c9fb6e9e6752
> 
> But unless your block allocator is very cheap doing what XFS does is
> probably going to work much better.
> 
>> ...ok so here is the PoC for seq counter check for ext2. Next let me
>> try to see if we can lift this up from the FS side to iomap - 
> 
> This looks good to me from a very superficial view.  Dave is the expert
> on this, though.
> 
> 
> .
> 




[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux