On Fri, Jul 30, 2021 at 06:05:59PM -0700, Jaegeuk Kim wrote:
> On 07/30, Eric Biggers wrote:
> > On Fri, Jul 30, 2021 at 03:12:15PM -0700, Jaegeuk Kim wrote:
> > > On 07/30, Eric Biggers wrote:
> > > > On Tue, Jul 27, 2021 at 06:51:54PM -0700, Eric Biggers wrote:
> > > > > From: Eric Biggers <ebiggers@xxxxxxxxxx>
> > > > >
> > > > > Currently, non-overwrite DIO writes are fundamentally unsafe on f2fs as
> > > > > they require preallocating blocks, but f2fs doesn't support unwritten
> > > > > blocks and therefore has to preallocate the blocks as regular blocks.
> > > > > f2fs has no way to reliably roll back such preallocations, so as a
> > >
> > > Hmm, I'm still wondering why this becomes a problem. And, do we really need
> > > to roll back the preallocated blocks?
> > >
> > > > > result, f2fs will leak uninitialized blocks to users if a DIO write
> > > > > doesn't fully complete. This can be easily reproduced by issuing a DIO
> > > > > write that will fail due to misalignment, e.g.:
> > >
> > > If there's any error, truncating blocks having NEW_ADDR could address this?
> > >
> >
> > My understanding is that the "NEW_ADDR" block address in f2fs means that space
> > was reserved for the block, but not allocated in any particular place yet.
> > Buffered writes reserve blocks in this way, but DIO writes cannot because DIO by
> > definition has to directly write to a specific on-disk location. Therefore DIO
> > writes require that the blocks be preallocated for real.
>
> Sorry, checking back the DIO flow, we do allocate real block addresses if DIO
> has holes.
>
> f2fs_preallocate_blocks
>  -> f2fs_map_blocks(F2FS_GET_BLOCK_PRE_DIO)
>    -> __allocate_data_block()
>      -> f2fs_allocate_data_block() gets a free LBA
>
> Then, back to your concern, do we need to truncate blocks beyond i_size, if we
> meet any failure?

That isn't enough because an allocating write is not necessarily an extending
write; it may be filling holes.

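To make the hole-filling case concrete, here is a minimal userspace sketch. It
is not f2fs code, and everything in it is an assumption for illustration: the
helper name fill_hole, the 4096-byte block size, and the use of ordinary
buffered I/O instead of O_DIRECT (which would add alignment requirements but
would not change what happens to i_size). It writes one block into the middle
of a sparse file and reports the file size before and after.

```python
import os
import tempfile

BLOCK = 4096  # assumed block size, for illustration only


def fill_hole(path, file_size=16 * BLOCK, offset=4 * BLOCK):
    """Write one block into the middle of a sparse file.

    Returns (size_before, size_after) as reported by fstat().
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # Extend i_size without writing any data. On filesystems that
        # support sparse files, the whole range is left as a hole.
        os.ftruncate(fd, file_size)
        size_before = os.fstat(fd).st_size

        # This write lies entirely below i_size. Any blocks it needs are
        # newly allocated, yet the file does not grow.
        os.pwrite(fd, b"\xaa" * BLOCK, offset)
        size_after = os.fstat(fd).st_size
        return size_before, size_after
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        before, after = fill_hole(os.path.join(d, "sparse"))
        print(before, after)
```

Since the size is the same before and after, a cleanup that only truncates
blocks beyond i_size would have nothing to remove if such a write failed
partway through.
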
Also to be power-fail safe, the preallocations must not be committed to disk at
all until the write has completed (maybe that's already the case in f2fs, but
it's not clear to me).

- Eric