Re: dm-crypt performance regression due to workqueue changes

On 6/30/24 05:49, Mikulas Patocka wrote:

On Sat, 29 Jun 2024, Waiman Long wrote:

On 6/29/24 14:15, Mikulas Patocka wrote:
Hi

I'm reporting that the patch 63c5484e74952f60f5810256bd69814d167b8d22
("workqueue: Add multiple affinity scopes and interface to select them")
is causing a massive dm-crypt slowdown in virtual machines.

Steps to reproduce:
* Install a system in a virtual machine with 16 virtual CPUs
* Create a scratch file with "dd if=/dev/zero of=Scratch.img bs=1M
    count=2048 oflag=direct" - the file should be on a fast NVMe drive
* Attach the scratch file to the virtual machine as /dev/vdb; cache mode
    should be 'none'
* cryptsetup --force-password luksFormat /dev/vdb
* cryptsetup luksOpen /dev/vdb cr
* fio --direct=1 --bsrange=128k-128k --runtime=40 --numjobs=1
    --ioengine=libaio --iodepth=8 --group_reporting=1
    --filename=/dev/mapper/cr --name=job --rw=read

With 6.5, we get 3600MiB/s; with 6.6 we get 1400MiB/s.

The reason is that virt-manager by default sets up a topology with 16
sockets, 1 core per socket and 1 thread per core. With the new default
'cache' affinity scope, the workqueue code avoids moving work items
outside the issuing CPU's scope, and in this topology every scope
contains just one CPU, so all encryption work is processed on a single
virtual CPU.
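
This can be confirmed from inside the guest (the lscpu output will of
course depend on the virt-manager defaults in use):

  # show how vCPUs map to sockets/cores - with the default virt-manager
  # topology every vCPU sits in its own socket
  lscpu -e=CPU,SOCKET,CORE

  # show the affinity scope the workqueue code currently defaults to;
  # on 6.6 this reports the new 'cache' default
  cat /sys/module/workqueue/parameters/default_affinity_scope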

The performance degradation can be fixed with "echo 'system' >
/sys/module/workqueue/parameters/default_affinity_scope" - but it is a
regression anyway, as many users don't know about this option.

How should we fix it? There are several options:
1. revert to 'numa' affinity
2. revert to 'numa' affinity only if we are in a virtual machine
3. hack dm-crypt to set the 'numa' affinity for the affected workqueues
   (see the sketch after this list)
4. any other solution?
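
As a rough illustration of option 3's effect: the same patch, as far as I
can see, also added a per-workqueue affinity_scope file, but only for
workqueues created with WQ_SYSFS, which dm-crypt's queues currently are
not; the queue name below is made up:

  # hypothetical - assumes dm-crypt added WQ_SYSFS to its kcryptd
  # workqueue; 'kcryptd-example' is not a real queue name
  echo numa > /sys/devices/virtual/workqueue/kcryptd-example/affinity_scope
  cat /sys/devices/virtual/workqueue/kcryptd-example/affinity_scope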
Another alternative is to go back to the old "numa" default if the kernel is
running under a hypervisor, since the CPU topology information is likely
to be unreliable there anyway. The current default of "cache" will remain if not
under a hypervisor.

Cheers,
Longman
Yes. How could we portably detect that we are running under a hypervisor?
There's the X86_FEATURE_HYPERVISOR flag, but it's x86-only.

Right, that will be x86-only. There is also a kernel boot command line parameter "workqueue.default_affinity_scope=" that one can use to set the default. It will be a bit easier to use than changing the sysfs parameter at run time.
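
A rough sketch of setting it persistently on a GRUB-based guest (the
grub2-mkconfig step and file paths vary by distro):

  # add the parameter to the kernel command line in /etc/default/grub, e.g.
  #   GRUB_CMDLINE_LINUX="... workqueue.default_affinity_scope=numa"
  # then regenerate the GRUB config and reboot
  grub2-mkconfig -o /boot/grub2/grub.cfg

  # after reboot, verify the new default
  cat /sys/module/workqueue/parameters/default_affinity_scope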

Cheers,
Longman




