On Mon 27-03-23 18:28:55, Chung-Chiang Cheng wrote:
> On Mon, Mar 27, 2023 at 5:29 PM Jan Kara <jack@xxxxxxx> wrote:
> >
> > As Zhang Yi already noted in his review, this is expected at least with
> > data=writeback mount option. With data=ordered this should not happen
> > though as the commit of the transaction with i_disksize update will wait
> > for page writeback to complete (this is exactly the reason why data=ordered
> > exists after all). Are you able to observe this problem with data=ordered
> > mount option?
> >
> > Honza
>
> It's a pity that this issue also occurs with data=ordered due to delayed
> allocation being enabled by default. If delayed allocation were disabled,
> it would not be as easy to reproduce.

Ah, ok. With data=ordered and expanding within the last block, you are
right you can see zeros at the end of the file after a crash. We were
discussing this in the past already but decided not to improve this because
the fix would have performance cost we didn't want to impose on users.

> This is because if data is written to the end of a file and the block is
> allocated, the new i_disksize will be immediately committed to the journal
> at ext4_da_write_end(), but the writeback procedure is not yet triggered.
> By default, ext4 commits the journal every 5 seconds, but a dirty page may
> not be written back until 30 seconds later. This is not a short time window,
> and any improper shutdown during this time may lead to the issue :(

Yeah, I agree. The time window is not small. What we could do and what
could even bring some performance benefit is if we moved the i_disksize
update from ext4_da_write_end() to ext4_do_writepages(). Currently we do the
i_disksize update only in mpage_map_and_submit_extent() but we could add a
similar logic when exiting from ext4_do_writepages() to update i_disksize
for written back pages beyond i_disksize which didn't need block allocation.

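To make the ordering concrete, here is a toy userspace model of the two
schemes. It is only a sketch and not kernel code: the struct and every toy_*
name are made up for illustration, and it assumes an append into an already
allocated last block, so no block allocation is involved.

```c
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All names here are hypothetical. */
struct toy_inode {
	size_t i_size;         /* in-memory file size */
	size_t i_disksize;     /* size attached to the running transaction */
	size_t committed_size; /* size made durable by the last journal commit */
	size_t data_on_disk;   /* bytes of file data actually written back */
};

/* Behaviour discussed above: the size is updated already at write_end time. */
static void toy_write_end_updates_size(struct toy_inode *in, size_t new_end)
{
	in->i_size = new_end;
	in->i_disksize = new_end;
}

/* Proposed variant: only the in-memory size moves at write time. */
static void toy_write_end_defers_size(struct toy_inode *in, size_t new_end)
{
	in->i_size = new_end;
}

/* Periodic journal commit: whatever i_disksize says becomes durable. */
static void toy_journal_commit(struct toy_inode *in)
{
	in->committed_size = in->i_disksize;
}

/* Page writeback: data reaches disk, and i_disksize catches up if behind. */
static void toy_writeback(struct toy_inode *in)
{
	in->data_on_disk = in->i_size;
	if (in->i_disksize < in->i_size)
		in->i_disksize = in->i_size;
}

/* Bytes that would read back as zeros if we crashed right now. */
static size_t toy_zero_tail_after_crash(const struct toy_inode *in)
{
	if (in->committed_size > in->data_on_disk)
		return in->committed_size - in->data_on_disk;
	return 0;
}
```

With the first write_end variant, a commit that lands before writeback
leaves committed_size ahead of data_on_disk, which is the zero tail seen
after a crash. With the deferred variant the commit cannot get ahead of the
data, at the price that the new size only becomes durable with the first
commit after writeback.
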
*Except* there is a problem that we couldn't do this i_disksize update when
the pages are written from jbd2 during ordered data writeback (we cannot
start a transaction in that context). And this is nasty because we will
completely lose the i_disksize update. We could handle it by redirtying the
tail page in this case but it gets a bit ugly...

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR