On Apr 23, 2024 / 06:31, Tejun Heo wrote:
> On Mon, Apr 22, 2024 at 02:43:48PM -1000, Tejun Heo wrote:
> > workqueue: The default node_nr_active should have its max set to max_active
> >
> > The default nna (node_nr_active) is used when the pool isn't tied to a
> > specific NUMA node. This can happen in the following cases:
> >
> > 1. On NUMA, if per-node pwq init fails and the fallback pwq is used.
> > 2. On NUMA, if a pool is configured to span multiple nodes.
> > 3. On single node setups.
> >
> > 5797b1c18919 ("workqueue: Implement system-wide nr_active enforcement for
> > unbound workqueues") set the default nna->max to min_active because only #1
> > was being considered. For #2 and #3, using min_active means that the max
> > concurrency in normal operation is pushed down to min_active, which is
> > currently 8, and that can obviously lead to performance issues.
> >
> > #1 is very unlikely to happen to begin with, and even when it does, the
> > exact value nna->max is set to doesn't really matter. #2 can only happen if
> > the workqueue is intentionally configured to ignore NUMA boundaries, and
> > there's no good way to distribute max_active in this case. #3 is the default
> > behavior on single node machines.
> >
> > Let's set the default nna->max to max_active. This fixes the artificially
> > lowered concurrency problem on single node machines and shouldn't hurt
> > anything for other cases.
> >
> > Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> > Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
> > Fixes: 5797b1c18919 ("workqueue: Implement system-wide nr_active enforcement for unbound workqueues")
> > Link: http://lkml.kernel.org/r/20240410082822.2131994-1-shinichiro.kawasaki@xxxxxxx
>
> Applied to wq/for-6.9-fixes.

Hello Tejun, thanks for the fix. I confirmed that the number of in-flight
works becomes larger than 8 during the unmount operation.

                     total  infl  CPUtime  CPUitsv  CMW/RPR  mayday  rescued
dmz_cwq_dmz_dml_072    613    33      4.6        -        0       0        0

                     total  infl  CPUtime  CPUitsv  CMW/RPR  mayday  rescued
dmz_cwq_dmz_dml_072    617    33      4.7        -        0       0        0

                     total  infl  CPUtime  CPUitsv  CMW/RPR  mayday  rescued
dmz_cwq_dmz_dml_072    619    33      4.8        -        0       0        0

I also measured the xfs unmount time 10 times. The averages are as follows.

  Kernel               | Unmount time
  ---------------------+--------------
  v6.8                 | 29m 3s
  v6.9-rc2             | 34m 17s
  v6.9-rc2 + Tejun fix | 30m 55s

We can see that the fix reduced the unmount time, which is great! There is
still a gap from the v6.8 kernel. I think closing that gap can be left as
future work, and I hope the fix patch gets upstreamed.

BTW, the URL in the Link tag is not working. Instead, I suggest this:

https://lore.kernel.org/dm-devel/20240410084531.2134621-1-shinichiro.kawasaki@xxxxxxx/
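
P.S. To double-check my understanding of the commit message, I wrote a small
stand-alone sketch of why the default node_nr_active max becomes the effective
concurrency limit on a single node machine. This is only my illustration, not
the kernel code: the struct, function and the max_active value of 256 are all
made up for the example; only WQ_DFL_MIN_ACTIVE == 8 is taken from the text
above.

  /* Stand-alone sketch, not kernel code: every work item of an unbound
   * workqueue on a single node machine is charged to the default
   * node_nr_active, so its ->max caps the whole workqueue's concurrency. */
  #include <stdio.h>

  #define WQ_DFL_MIN_ACTIVE 8      /* the "currently 8" mentioned above */

  struct node_nr_active_sketch {
          int max;                 /* concurrency cap for this counter */
          int nr;                  /* in-flight work items charged to it */
  };

  /* May one more work item start, given the counter it is charged to? */
  static int may_start_work(struct node_nr_active_sketch *nna)
  {
          if (nna->nr >= nna->max)
                  return 0;        /* stays pending, cap reached */
          nna->nr++;
          return 1;
  }

  int main(void)
  {
          int max_active = 256;    /* arbitrary example max_active */
          struct node_nr_active_sketch before = { .max = WQ_DFL_MIN_ACTIVE };
          struct node_nr_active_sketch after  = { .max = max_active };
          int started_before = 0, started_after = 0;

          for (int i = 0; i < max_active; i++) {
                  started_before += may_start_work(&before);
                  started_after  += may_start_work(&after);
          }

          /* prints 8 vs 256: the artificially lowered concurrency */
          printf("in-flight before fix: %d, after fix: %d\n",
                 started_before, started_after);
          return 0;
  }

This matches what I observed above: before the fix the in-flight count was
pinned at 8, and with the fix it rises well above that during unmount.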