On Wed, May 26, 2021 at 04:44:06PM +0800, Wang Jianchao wrote: > Right now, discard is issued and waited to be completed in jbd2 > commit kthread context after the logs are committed. When large > amount of files are deleted and discard is flooding, jbd2 commit > kthread can be blocked for long time. Then all of the metadata > operations can be blocked to wait the log space. > > One case is the page fault path with read mm->mmap_sem held, which > wants to update the file time but has to wait for the log space. > When other threads in the task wants to do mmap, then write mmap_sem > is blocked. Finally all of the following read mmap_sem requirements > are blocked, even the ps command which need to read the /proc/pid/ > -cmdline. Our monitor service which needs to read /proc/pid/cmdline > used to be blocked for 5 mins. > > This patch frees the blocks back to buddy after commit and then do > discard in a async kworker context in fstrim fashion, namely, > - mark blocks to be discarded as used if they have not been allocated > - do discard > - mark them free > After this, jbd2 commit kthread won't be blocked any more by discard > and we won't get NOSPC even if the discard is slow or throttled. > > Link: https://marc.info/?l=linux-kernel&m=162143690731901&w=2 > Suggested-by: Theodore Ts'o <tytso@xxxxxxx> > Signed-off-by: Wang Jianchao <wangjianchao@xxxxxxxxxxxx> I tested this patch series atop 5.13.0-rc7 (which took a bit of rebasing), and it works quite well. I tested ext4 on dm-crypt on a laptop NVME drive, using this test case: $ mkdir testdir $ time python3 -c 'for i in range(1000000): open(f"testdir/{i}", "wb").write(b"test data")' $ time rm -r testdir I tried this with and without discard enabled. Without this patch, if discard is enabled, the rm takes many minutes, and stalls out the rest of the system. With this patch, the rm takes the same amount of time (~17s) whether discard is enabled or disabled, and the rest of the system continues to function. Tested-by: Josh Triplett <josh@xxxxxxxxxxxxxxxx>