On Sun 20-10-19 21:38:42, Theodore Y. Ts'o wrote:
> On Fri, Oct 04, 2019 at 12:05:51AM +0200, Jan Kara wrote:
> > Similarly to directories, EA inodes do only journalled modifications to
> > their data. Change ext4_should_journal_data() to return true for them so
> > that we don't have to special-case them during truncate.
>
> We are already special-casing EA inodes in ext4_clear_blocks() in
> fs/ext4/indirect.c, and get_default_free_blocks_flags() in
> fs/ext4/extents.c, and like S_ISDIR, we want to treat EA inode blocks
> as metadata. So I'm not sure I see the value of this change?

Firstly, ext4_should_journal_data() should tell whether inode's data blocks
are modified through journalling. So as a principle of least surprise it
should return true for EA inodes because that's how data blocks of those
inodes are modified. Secondly, once ext4_should_journal_data() is fixed by
this patch, I think that we can just drop that special-casing from
ext4_clear_blocks() and get_default_free_blocks_flags() and just have
there:

	if (ext4_should_journal_data(inode))
		flags |= EXT4_FREE_BLOCKS_FORGET;

> As an aside, I was looking at fs/ext4/mballoc.c to see what the
> difference is for treating a block as a metadata block versus a
> journaled data block, and what I found made my hair rise on end:
>
> 	/*
> 	 * We need to make sure we don't reuse the freed block until after the
> 	 * transaction is committed. We make an exception if the inode is to be
> 	 * written in writeback mode since writeback mode has weak data
> 	 * consistency guarantees.
> 	 */
>
> So in data=writeback, if a file is deleted, its blocks are available
> for immediate reallocation, and if we are under heavy memory pressure,
> the deleted file's blocks could get overwritten --- even in the case
> where we crash and the transaction never committed.
>
> While it's true that data=writeback mode has weaker guarantees, my
> understanding is that it only applied to the exposure of stale data, and
> not to a long-standing file's blocks getting corrupted if it is almost
> deleted, but not quite before a crash.
>
> Granted, the situation where this would happen is quite rare, but it
> seems quite wrong....

I've always considered data=writeback as: You don't know what the data is
going to be if the file was touched shortly before crashing (i.e., similar to
old ext2 non-guarantees).

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR