On Fri 22-09-23 17:33:59, Ritesh Harjani wrote: > Jan Kara <jack@xxxxxxx> writes: > > > On Wed 20-09-23 17:08:19, Ritesh Harjani wrote: > >> Jan Kara <jack@xxxxxxx> writes: > >> > >> > Hello! > >> > > >> > On Tue 19-09-23 14:00:04, Gao Xiang wrote: > >> >> Our consumer reports a behavior change between pre-iomap and iomap > >> >> direct io conversion: > >> >> > >> >> If the system crashes after an appending write to a file open with > >> >> O_DIRECT | O_SYNC flag set, file i_size won't be updated even if > >> >> O_SYNC was marked before. > >> >> > >> >> It can be reproduced by a test program in the attachment with > >> >> gcc -o repro repro.c && ./repro testfile && echo c > /proc/sysrq-trigger > >> >> > >> >> After some analysis, we found that before iomap direct I/O conversion, > >> >> the timing was roughly (taking Linux 3.10 codebase as an example): > >> >> > >> >> .. > >> >> - ext4_file_dio_write > >> >> - __generic_file_aio_write > >> >> .. > >> >> - ext4_direct_IO # generic_file_direct_write > >> >> - ext4_ext_direct_IO > >> >> - ext4_ind_direct_IO # final_size > inode->i_size > >> >> - .. > >> >> - ret = blockdev_direct_IO() > >> >> - i_size_write(inode, end) # orphan && ret > 0 && > >> >> # end > inode->i_size > >> >> - ext4_mark_inode_dirty() > >> >> - ... > >> >> - generic_write_sync # handling O_SYNC > >> >> > >> >> So the dirty inode meta will be committed into journal immediately > >> >> if O_SYNC is set. However, After commit 569342dc2485 ("ext4: move > >> >> inode extension/truncate code out from ->iomap_end() callback"), > >> >> the new behavior seems as below: > >> >> > >> >> .. > >> >> - ext4_dio_write_iter > >> >> - ext4_dio_write_checks # extend = 1 > >> >> - iomap_dio_rw > >> >> - __iomap_dio_rw > >> >> - iomap_dio_complete > >> >> - generic_write_sync > >> >> - ext4_handle_inode_extension # extend = 1 > >> > >> Yes, since ext4_handle_inode_extension() will handle inode i_disksize > >> update and mark the inode dirty, generic_write_sync() call should happen > >> after that. > >> > >> That also means then we don't have any generic FS testcase which can > >> validate this behaviour. > > > > Yeah. > > > > Ok. Let me then first send a fstest in response to this integrity > problem with directio and o_sync. Thanks for working on the testcase! I have written a fix in the meantime but so far it causes weird inconsistencies in fsstress ENOSPC hitters tests so I'm still debugging that. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR