On 2021/7/27 10:00, Jaegeuk Kim wrote:
On 07/25, Eric Biggers wrote:
On Sun, Jul 25, 2021 at 06:50:51PM +0800, Chao Yu wrote:
On 2021/7/16 22:39, Eric Biggers wrote:
From: Eric Biggers <ebiggers@xxxxxxxxxx>
f2fs_write_begin() assumes that all blocks were preallocated by
default unless FI_NO_PREALLOC is explicitly set. This invites data
corruption, as there are cases in which not all blocks are preallocated.
Commit 47501f87c61a ("f2fs: preallocate DIO blocks when forcing
buffered_io") fixed one case, but there are others remaining.
Could you please explain which cases we previously failed to handle?
Then I can check the related logic before and after the rework.
Any case where a buffered write happens while not all blocks were preallocated
but FI_NO_PREALLOC wasn't set. For example when ENOSPC was hit in the middle of
the preallocations for a direct write that will fall back to a buffered write,
e.g. due to f2fs_force_buffered_io() or page cache invalidation failure.
Indeed. IIUC, the buggy code is below: if any preallocation fails, we need to
set the FI_NO_PREALLOC flag.
map_blocks:
	err = f2fs_map_blocks(inode, &map, 1, flag);
	if (map.m_len > 0 && err == -ENOSPC) {
		if (!direct_io)		<----
			set_inode_flag(inode, FI_NO_PREALLOC);
		err = 0;
	}
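To make the failure mode concrete, here is a minimal toy model of the logic quoted above. It is not kernel code: `toy_inode`, `toy_prealloc()`, and the `fixed` switch are hypothetical names introduced only for illustration, and a partial preallocation (`mapped < wanted`) stands in for `map.m_len > 0 && err == -ENOSPC`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the inode state discussed above; not kernel code. */
struct toy_inode {
	bool no_prealloc;	/* models FI_NO_PREALLOC */
};

/*
 * Models the quoted error handling: `mapped` of `wanted` blocks were
 * preallocated before ENOSPC was hit. With fixed == false the flag is
 * skipped for direct I/O, as in the quoted code; with fixed == true it
 * is set whenever the preallocation was only partial.
 */
void toy_prealloc(struct toy_inode *inode, int wanted, int mapped,
		  bool direct_io, bool fixed)
{
	assert(wanted > 0 && mapped >= 0 && mapped <= wanted);
	if (mapped > 0 && mapped < wanted) {	/* partial success, then ENOSPC */
		if (!direct_io || fixed)
			inode->no_prealloc = true;
	}
}

/* Models the f2fs_write_begin() assumption: no flag means "all preallocated". */
bool toy_write_begin_assumes_preallocated(const struct toy_inode *inode)
{
	return !inode->no_prealloc;
}
```

In this model, a direct write that preallocates 3 of 8 blocks and then falls back to a buffered write leaves the flag clear when `fixed` is false, so the buffered path wrongly assumes that all 8 blocks exist.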
BTW, would it be better to include the issue details you explained above in the
commit message?
-		/*
-		 * If force_buffere_io() is true, we have to allocate
-		 * blocks all the time, since f2fs_direct_IO will fall
-		 * back to buffered IO.
-		 */
-		if (!f2fs_force_buffered_io(inode, iocb, from) &&
-					f2fs_lfs_mode(F2FS_I_SB(inode)))
-			goto write;
We should keep this OPU DIO logic; otherwise, in LFS mode, a DIO write
will always allocate two block addresses for each 4k append IO.
I just tested this based on the code of the latest f2fs dev-test branch.
Yes, I had misread that due to the weird goto and misleading comment and
translated it into:
	/* If it will be an in-place direct write, don't bother. */
	if (dio && !f2fs_lfs_mode(sbi))
		return 0;
It should be:
	if (dio && f2fs_lfs_mode(sbi))
		return 0;
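The difference between the removed check and the two translations can be written out as plain predicates. This is only a sketch: the `toy_*` names are hypothetical, and it assumes that `dio` in the translation means "a direct write that is not forced to buffered I/O", i.e. `dio == !force_buffered_io`.

```c
#include <assert.h>
#include <stdbool.h>

/* Removed check: skip preallocation ("goto write") for a real DIO in LFS mode. */
bool toy_skip_original(bool force_buffered_io, bool lfs_mode)
{
	return !force_buffered_io && lfs_mode;
}

/* The misread translation: skips preallocation when NOT in LFS mode. */
bool toy_skip_misread(bool dio, bool lfs_mode)
{
	return dio && !lfs_mode;
}

/* The corrected translation: skips preallocation in LFS mode. */
bool toy_skip_corrected(bool dio, bool lfs_mode)
{
	return dio && lfs_mode;
}

/* Under the stated assumption, the corrected form matches the removed check. */
void toy_check_translation(void)
{
	for (int force = 0; force <= 1; force++) {
		for (int lfs = 0; lfs <= 1; lfs++) {
			bool dio = !force;

			assert(toy_skip_corrected(dio, lfs) ==
			       toy_skip_original(force, lfs));
		}
	}
}
```

For a real direct write, the misread form gives the opposite answer from the removed check in both LFS and non-LFS mode, which is consistent with the extra allocations seen in LFS mode.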
Hmm, this addresses my 250 failure. And, I think the below commit can explain
the case.
commit 47501f87c61ad2aa234add63e1ae231521dbc3f5
Author: Jaegeuk Kim <jaegeuk@xxxxxxxxxx>
Date: Tue Nov 26 15:01:42 2019 -0800
f2fs: preallocate DIO blocks when forcing buffered_io
The previous preallocation and DIO decision like below.
                         allow_outplace_dio              !allow_outplace_dio
f2fs_force_buffered_io   (*) No_Prealloc / Buffered_IO   Prealloc / Buffered_IO
!f2fs_force_buffered_io  No_Prealloc / DIO               Prealloc / DIO
But, Javier reported Case (*) where zoned device bypassed preallocation but
fell back to buffered writes in f2fs_direct_IO(), resulting in stale data
being read.
In order to fix the issue, actually we need to preallocate blocks whenever
we fall back to buffered IO like this. No change is made in the other cases.
                         allow_outplace_dio              !allow_outplace_dio
f2fs_force_buffered_io   (*) Prealloc / Buffered_IO      Prealloc / Buffered_IO
!f2fs_force_buffered_io  No_Prealloc / DIO               Prealloc / DIO
Reported-and-tested-by: Javier Gonzalez <javier@xxxxxxxxxxx>
Signed-off-by: Damien Le Moal <damien.lemoal@xxxxxxx>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
Reviewed-by: Chao Yu <yuchao0@xxxxxxxxxx>
Reviewed-by: Javier González <javier@xxxxxxxxxxx>
Signed-off-by: Jaegeuk Kim <jaegeuk@xxxxxxxxxx>
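The two tables in that commit message can be restated as small decision functions. This is a sketch of the tables only, not of the kernel code paths; `toy_decision` and the `toy_decide_*` helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_decision {
	bool prealloc;		/* preallocate blocks before the write? */
	bool buffered_io;	/* does the write end up as buffered I/O? */
};

/* First table: preallocation depended only on !allow_outplace_dio. */
struct toy_decision toy_decide_before(bool force_buffered_io,
				      bool allow_outplace_dio)
{
	struct toy_decision d = { !allow_outplace_dio, force_buffered_io };

	return d;
}

/* Second table: also preallocate whenever buffered I/O is forced. */
struct toy_decision toy_decide_after(bool force_buffered_io,
				     bool allow_outplace_dio)
{
	struct toy_decision d = {
		force_buffered_io || !allow_outplace_dio,
		force_buffered_io
	};

	return d;
}

/* Checks that the two tables differ only in the (*) cell. */
void toy_check_tables(void)
{
	for (int f = 0; f <= 1; f++) {
		for (int a = 0; a <= 1; a++) {
			struct toy_decision before = toy_decide_before(f, a);
			struct toy_decision after = toy_decide_after(f, a);
			bool star = f && a;

			assert(before.buffered_io == after.buffered_io);
			assert((before.prealloc != after.prealloc) == star);
		}
	}
}
```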
Thanks for the explanation.
Do you have a proper explanation for why preallocations shouldn't be done in
this case? Note that preallocations are still done for buffered writes, which
may be out-of-place as well; how are those different?
See commit f847c699cff3 ("f2fs: allow out-place-update for direct IO in LFS mode"):
the f2fs_map_blocks() logic was changed to force allocation of a new block address,
whether or not a previous block address existed, when it is called from the DIO
write path. So, in that case, if we preallocate a new block address in
f2fs_file_write_iter(), we hit the problem that my trace indicates.
Got your concern.
For buffered IO, we use F2FS_GET_BLOCK_PRE_AIO. In this mode, we just reserve
the filesystem block count and tag NEW_ADDR in the dnode block, so it's fine:
double new block address allocation won't happen during data page writeback.
For direct IO, we use F2FS_GET_BLOCK_PRE_DIO. In this mode, we allocate a
physical block address; after preallocation, if we fall back to buffered IO, we
may suffer from the double new block address allocation issue... IIUC.
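The difference between the two preallocation modes can be illustrated with a toy model that counts physical address allocations per block. Everything here is hypothetical (`toy_block`, the `toy_*` helpers, and the simplification that every later out-of-place write takes one new physical address); it only mirrors the behaviour described above and is not the f2fs implementation.

```c
#include <assert.h>

enum toy_addr { TOY_NULL_ADDR, TOY_NEW_ADDR, TOY_PHYS_ADDR };

struct toy_block {
	enum toy_addr addr;
	int phys_allocs;	/* physical addresses handed out for this block */
};

/* PRE_AIO-style: reserve the block count and tag NEW_ADDR only. */
void toy_prealloc_aio(struct toy_block *b)
{
	if (b->addr == TOY_NULL_ADDR)
		b->addr = TOY_NEW_ADDR;
}

/* PRE_DIO-style: allocate a physical block address right away. */
void toy_prealloc_dio(struct toy_block *b)
{
	b->addr = TOY_PHYS_ADDR;
	b->phys_allocs++;
}

/* Assumed out-of-place write: always takes a new physical address. */
void toy_write_out_of_place(struct toy_block *b)
{
	assert(b->addr != TOY_NULL_ADDR);	/* block must be preallocated */
	b->addr = TOY_PHYS_ADDR;
	b->phys_allocs++;
}
```

In this model, `toy_prealloc_aio()` followed by one out-of-place write ends with a single physical allocation, while `toy_prealloc_dio()` followed by the same write ends with two, which matches the "two block addresses for each 4k append IO" symptom reported earlier in the thread.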
Well, can we move the preallocation into f2fs_direct_IO(), after all the cases
that may cause DIO to fall back to buffered IO?
Thanks,
- Eric