Re: [PATCH] mm/page-writeback: Raise wb_thresh to prevent write blocking with strictlimit

> On Wed, 23 Oct 2024 18:00:32 +0800 Jim Zhao <jimzhao.ai@xxxxxxxxx> wrote:

> > With the strictlimit flag, wb_thresh acts as a hard limit in
> > balance_dirty_pages() and wb_position_ratio(). When device write
> > operations are inactive, wb_thresh can drop to 0, causing writes to
> > be blocked. The issue occasionally occurs in fuse fs, particularly
> > with network backends, the write thread is blocked frequently during
> > a period. To address it, this patch raises the minimum wb_thresh to a
> > controllable level, similar to the non-strictlimit case.

> Please tell us more about the userspace-visible effects of this.  It
> *sounds* like a serious (but occasional) problem, but that is unclear.

> And, very much relatedly, do you feel this fix is needed in earlier
> (-stable) kernels?

The problem shows up in two scenarios:

1. FUSE writes transitioning from inactive to active

Sometimes active writes need several pauses before wb_thresh ramps up to an
appropriate level. In the trace below, both bdi_setpoint and task_ratelimit
are 0, which means wb_thresh is 0. The dd process pauses multiple times
before reaching a normal state.

dd-1206590 [003] .... 62988.324049: balance_dirty_pages: bdi 0:51: limit=295073 setpoint=259360 dirty=454 bdi_setpoint=0 bdi_dirty=32 dirty_ratelimit=18716 task_ratelimit=0 dirtied=32 dirtied_pause=32 paused=0 pause=4 period=4 think=0 cgroup_ino=1
dd-1206590 [003] .... 62988.332063: balance_dirty_pages: bdi 0:51: limit=295073 setpoint=259453 dirty=454 bdi_setpoint=0 bdi_dirty=33 dirty_ratelimit=18716 task_ratelimit=0 dirtied=1 dirtied_pause=0 paused=0 pause=4 period=4 think=4 cgroup_ino=1
dd-1206590 [003] .... 62988.340064: balance_dirty_pages: bdi 0:51: limit=295073 setpoint=259526 dirty=454 bdi_setpoint=0 bdi_dirty=34 dirty_ratelimit=18716 task_ratelimit=0 dirtied=1 dirtied_pause=0 paused=0 pause=4 period=4 think=4 cgroup_ino=1
dd-1206590 [003] .... 62988.348061: balance_dirty_pages: bdi 0:51: limit=295073 setpoint=259531 dirty=489 bdi_setpoint=0 bdi_dirty=35 dirty_ratelimit=18716 task_ratelimit=0 dirtied=1 dirtied_pause=0 paused=0 pause=4 period=4 think=4 cgroup_ino=1
dd-1206590 [003] .... 62988.356063: balance_dirty_pages: bdi 0:51: limit=295073 setpoint=259531 dirty=490 bdi_setpoint=0 bdi_dirty=36 dirty_ratelimit=18716 task_ratelimit=0 dirtied=1 dirtied_pause=0 paused=0 pause=4 period=4 think=4 cgroup_ino=1
...
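
To make the task_ratelimit=0 readings concrete, here is a minimal user-space
sketch of how the per-task rate is derived from the base rate and the
position ratio. It assumes the fixed-point scaling used in
mm/page-writeback.c (RATELIMIT_CALC_SHIFT of 10, so 1 << 10 means 1.0). The
helper name task_ratelimit() is made up for illustration and this is not the
kernel code itself.

```c
#include <stdint.h>

/* Assumed fixed-point shift: a pos_ratio of (1 << 10) represents 1.0. */
#define RATELIMIT_CALC_SHIFT 10

/*
 * Hypothetical helper: scale the base dirty rate (pages/s) by the
 * fixed-point position ratio. A pos_ratio of 0 yields a rate of 0,
 * which is what the trace reports as task_ratelimit=0.
 */
static uint64_t task_ratelimit(uint64_t dirty_ratelimit, uint64_t pos_ratio)
{
	return (dirty_ratelimit * pos_ratio) >> RATELIMIT_CALC_SHIFT;
}
```

With dirty_ratelimit=18716 from the trace, a pos_ratio of 0 gives a task rate
of 0 regardless of how healthy the base rate is, so the writer can only wait.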

2. FUSE with unstable network backends and occasional writes

This is not easy to reproduce, but when it occurs the write thread pauses
more often and for longer.


Some code is already in place to improve this situation, but it seems insufficient:
if (dtc->wb_dirty < 8)
{
	// ...
}
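
A rough model of why that check does not help in the trace above. It assumes,
based on my reading of the strictlimit branch in wb_position_ratio(), that a
writer is let through when wb_dirty < 8 and otherwise gets a zero position
ratio once wb_dirty >= wb_thresh. The function name strictlimit_blocks() is
made up for illustration, and the real code does more than this.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the strictlimit decision:
 * returns true when the writer would be left with pos_ratio == 0.
 */
static bool strictlimit_blocks(unsigned long wb_dirty, unsigned long wb_thresh)
{
	/* Small-dirty escape: fewer than 8 dirty pages is always let through. */
	if (wb_dirty < 8)
		return false;

	/* At or above the per-wb threshold: position ratio stays 0. */
	return wb_dirty >= wb_thresh;
}
```

In the trace, bdi_dirty is 32 to 36, already past the 8-page escape, so with
wb_thresh at 0 every call lands on the blocking path.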

So the patch raises the minimum wb_thresh so that occasional writes are not
blocked and active writes can ramp up the threshold quickly.

--

Thanks,
Jim Zhao




