On 2022/04/05 15:44, Christoph Hellwig wrote:
> On Mon, Apr 04, 2022 at 02:12:14PM +0900, Tetsuo Handa wrote:
>> On 2022/04/04 13:58, Christoph Hellwig wrote:
>> My patch proposes filemap_invalidate_lock_killable() and converts only
>> blkdev_fallocate() case as a starting point. Nothing prevents us from
>> converting e.g. blk_ioctl_zeroout() case as well. The "not come through
>> blkdev_fallocate" is bogus.
>
> Sure, we could try to convert most of the > 50 instances of
> filemap_invalidate_lock to be killable. But that:
>
> a) isn't what your patch actuall did

We can convert all possible locations step by step.

> b) doesn't solve the underlying issue that is wasn't designed to to be
>    held over very extremely long running operations

Do you want to introduce state variables like Lo_rundown in order to avoid
holding this invalidate lock? That would add a lot of complication.

>
> Or to get back to what I said before - I don't think we can just hold
> the lock over manually zeroing potentially gigabytes of blocks.
>

Periodically releasing this invalidate lock may result in inconsistent
output. One can issue conflicting requests (e.g. enlarging a file and
truncating that file). If a truncate happens while the file is partially
enlarged (or non-zero values are written while it is partially zeroed),
resuming will need to undo what was done while this invalidate lock was
released. (Or do you want to make conflicting requests fail with e.g.
-EBUSY, possibly breaking userspace programs?) Serialization by holding
this lock throughout the zeroing/fallocate operation is needed for
returning consistent output.

> In other words: we'll need to chunk the zeroing up if we want
> to hold the invalidate lock, I see no ther way to properly fix this.

I don't think that we can chunk the zeroing/fallocate up, for I don't
think that we can determine an appropriate chunk size. It might be 4KB, it
might be 1MB, it might be 1GB, but who knows which size is safe for
avoiding a hung task warning on this invalidate lock.
The block device might be super slow (e.g. as slow as a floppy disk, or a
network block device over a very slow network). Also, the hung task
watchdog timeout and concurrent requests colliding on this invalidate lock
are not visible to the thread doing the actual zeroing/fallocate. All we
can afford is to make this invalidate lock killable and to check
fatal_signal_pending() periodically inside the zeroing/fallocate loop.
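
To illustrate the last point, here is a minimal userspace sketch, not
kernel code. It assumes a flag set by a signal handler as a stand-in for
fatal_signal_pending(current), and a plain memory buffer as a stand-in for
the block device. The names fatal_pending, on_fatal() and
zero_range_killable() are hypothetical and exist only for this example.
The step size only bounds how often the flag is checked. It is not meant
to bound how long a lock would be held.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for fatal_signal_pending(current). */
static volatile sig_atomic_t fatal_pending;

/* Signal handler: only records that a fatal signal arrived. */
static void on_fatal(int sig)
{
	(void)sig;
	fatal_pending = 1;
}

/*
 * Zero len bytes of buf in steps of at most step bytes, checking the
 * flag before each step. Returns 0 on completion, -EINTR if interrupted,
 * -EINVAL if step is 0. *done reports how many bytes were zeroed.
 */
static int zero_range_killable(unsigned char *buf, size_t len,
			       size_t step, size_t *done)
{
	size_t off = 0;

	*done = 0;
	if (step == 0)
		return -EINVAL;

	while (off < len) {
		size_t n = (len - off < step) ? len - off : step;

		/* Give up early instead of finishing the whole range. */
		if (fatal_pending) {
			*done = off;
			return -EINTR;
		}
		memset(buf + off, 0, n);
		off += n;
	}
	*done = off;
	return 0;
}
```

A caller would install on_fatal() with signal() or sigaction() before
starting, and treat -EINTR as "the range is only partially zeroed".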