On Tue, May 18, 2021 at 09:19:13AM +0800, Wang Jianchao wrote:
> > That way we don't need to move all of this to a kworker context.
>
> The submit_bio also needs to be out of jbd2 commit kthread as it may be
> blocked due to blk-wbt or no enough request tag. ;)

Actually, there's a bigger deal that I hadn't realized, about why we
are currently using submit_bio_wait().  We *must* wait until the
discard has completed before we call ext4_free_data_in_buddy(), which
is what allows those blocks to be reused by the block allocator.  If
the discard happens after we reallocate the block, there is a good
chance that we will end up corrupting a data or metadata block,
leading to user data loss.

There's another corollary to this; if you use blk-wbt, and you are
doing lots of deletes, and we move this all to a writeback thread,
this *significantly* increases the chance that the user will see
ENOSPC errors in the case where they are running with a very full
(close to 100% used) file system, because the freed blocks stay
unavailable to the allocator until the delayed discards complete.

I'd argue that this is a *really* good reason why using mount -o
discard is Just A Bad Idea if you are running with blk-wbt.  If
discards are slow, using fstrim is a much better choice.  It's also
the case that for most SSDs and workloads, doing frequent discards
doesn't actually help that much.  The write endurance of the device is
not compromised that much if you only run fstrim and discard unused
blocks once a day, or even once a week.  I only recommend use of
mount -o discard in cases where the discard operation is effectively
free.  (e.g., in cases where the FTL is implemented on the Host OS, or
you are running with super-fast flash which is PCIe or NVMe attached.)

Cheers,

					- Ted
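
For illustration only, here is a toy model of the ordering constraint described above. It is not ext4 code: ToyFS and all of its methods are hypothetical names. release() merely stands in for the role ext4_free_data_in_buddy() plays (making freed blocks reusable), and discard() stands in for the device dropping a block's contents. Under those assumptions, the sketch shows why the discard has to finish before the block is released, and why a full file system can see ENOSPC while discards are still pending.

```python
import errno


class ToyFS:
    """Toy model only; every name here is hypothetical, not a kernel API."""

    def __init__(self, nblocks):
        self.free = set(range(nblocks))  # blocks the allocator may hand out
        self.pending = set()             # freed by a commit, not yet reusable
        self.data = {}                   # block number -> contents on the "device"

    def alloc(self, payload):
        if not self.free:
            raise OSError(errno.ENOSPC, "no free blocks")
        blk = min(self.free)
        self.free.remove(blk)
        self.data[blk] = payload
        return blk

    def delete(self, blk):
        # The block was freed by a committed transaction, but the
        # allocator must not reuse it yet.
        self.pending.add(blk)

    def discard(self, blk):
        # The device drops whatever is stored in the block right now.
        self.data.pop(blk, None)

    def release(self, blk):
        # Stand-in for the step that makes the block allocatable again.
        self.pending.remove(blk)
        self.free.add(blk)


def safe_order():
    """Discard completes before the block is released: new data survives."""
    fs = ToyFS(1)
    old = fs.alloc("old")
    fs.delete(old)
    fs.discard(old)
    fs.release(old)
    new = fs.alloc("new")
    return fs.data.get(new)


def unsafe_order():
    """Block is released first; the late discard wipes the new contents."""
    fs = ToyFS(1)
    old = fs.alloc("old")
    fs.delete(old)
    fs.release(old)
    new = fs.alloc("new")   # reuses the same block number
    fs.discard(old)         # the discard arrives after the reallocation
    return fs.data.get(new)


def enospc_while_discard_pending():
    """On a full toy file system, allocation fails until the release."""
    fs = ToyFS(1)
    old = fs.alloc("old")
    fs.delete(old)          # discard not finished, block still pending
    try:
        fs.alloc("new")
    except OSError as e:
        return e.errno == errno.ENOSPC
    return False
```

In this model safe_order() keeps the newly written contents, unsafe_order() loses them to the late discard, and enospc_while_discard_pending() shows the allocation failing for as long as the freed block is waiting on its discard.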