On Mon, 19 Apr 2021, Melvin Vermeeren wrote:

> Note: This was originally posted on cryptsetup GitLab.
> Note: Reposting here for better visibility as it appears to be a kernel bug.
> Ref: https://gitlab.com/cryptsetup/cryptsetup/-/issues/639
>
> Issue description
> -----------------
>
> With a Seagate FireCuda 520 2TB NVMe SSD running in PCIe 3.0 x4 mode (my
> motherboard does not have PCIe 4.0), discards through the `dm-integrity`
> layer are extremely slow, to the point of being almost unusable or in
> some cases fully unusable.
>
> This is so slow that having the `discard` option on swap is not
> possible, as it takes around 3 minutes to complete for 32GiB swap,
> causing timeouts during boot, which in turn cause various other services
> to fail, resulting in a drop to the emergency shell.
>
> `blkdiscard` directly to the NVMe device takes I think 10 sec or so for
> the entire 2TB, but through `dm-integrity` the rate is approx 10GiB per
> minute, meaning over 3 hours to discard the entire 2TB. Normal read and
> write operations are not affected and are high performance, easily
> reaching 2GiB/s through the entire stack: `disk dm-integrity mdadm luks
> lvm ext4`.
>
> Checking the kernel thread usage in htop, quite a few
> `dm-integrity-offload` threads are in the `D` state with `0.0` CPU usage
> when discarding, which is rather odd. No integrity threads are actually
> working, and read-write disk usage measured with `dstat` is not even
> 1MiB/s.
>
> To detail the above, `dstat` shows extremely clear timings: 2 seconds 0k
> write, 1 second 512k write, repeat. Possible timeout in locks somewhere
> or other problematic lock situation?
>
> Steps for reproducing the issue
> -------------------------------
>
> 1. Create two 10G partitions on SSD.
> 2. Set up `dm-integrity` on one of these and open the device with
>    `--allow-discards`.
> 3. `blkdiscard` both partitions.
>     * Raw partition is done instantly.
>     * Integrity partition takes around a minute.
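
As a quick cross-check of the figures quoted above, here is a minimal Python sketch of the arithmetic. It uses only numbers from the report; the helper name `discard_minutes` is hypothetical, and reading "2TB" as decimal terabytes is an assumption on my side.

```python
GIB = 1024 ** 3

def discard_minutes(size_bytes, rate_gib_per_min):
    """Minutes needed to discard size_bytes at the given rate (GiB/min)."""
    return size_bytes / GIB / rate_gib_per_min

# Figures taken from the report.
swap_bytes = 32 * GIB        # 32GiB swap
disk_bytes = 2 * 1000 ** 4   # "2TB" drive; decimal terabytes is an assumption
rate = 10.0                  # approx 10GiB per minute through dm-integrity

swap_min = discard_minutes(swap_bytes, rate)
disk_hours = discard_minutes(disk_bytes, rate) / 60

# dstat pattern: 2 seconds of 0k, then 1 second of 512k, repeating,
# i.e. 512 KiB written every 3 seconds on average.
avg_write_kib_s = 512 / 3

print(f"32GiB swap: {swap_min:.1f} min")                    # 3.2 min
print(f"2TB disk: {disk_hours:.1f} h")                      # 3.1 h
print(f"average write rate: {avg_write_kib_s:.0f} KiB/s")   # 171 KiB/s
```

The results agree with the report: about 3 minutes for the 32GiB swap, a little over 3 hours for the whole disk, and an average write rate far below 1MiB/s.
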
>
> Additional info
> ---------------
>
> The NVMe device is formatted to native 4096 byte sectors and the
> `dm-integrity` layer also uses 4096 byte sectors.
>
> Debian bullseye (testing), kernel 5.10.0-6-rt-amd64 5.10.28-1. Same issue
> occurred during testing with Arch Linux liveiso which is kernel 5.11.x.
> Cryptsetup package version 2.3.5.
>
> On another server system (IBM POWER9, ppc64le) with SAS 3.0 SSD discard is
> working properly at more than acceptable speeds, showing significant CPU usage
> while discarding. In this case it is a regular Intel amd64 desktop system.
>
> Debug log
> ---------
>
> Nothing really fails, dmesg and syslog show no issues/warnings at all, not
> sure what to include.
>
> Only appears to affect NVMe
> ---------------------------
>
> Further tests on the same machine show that SATA SSD is not affected by this
> issue and discards at high performance. Appears to be an NVMe-specific bug:
> Ref: https://gitlab.com/cryptsetup/cryptsetup/-/issues/639#note_555208783

I tried it on my nvme device (Samsung SSD 960 EVO 500GB) and I could
discard 32GB in 5 seconds. I assume that it is specific to the nvme device
you are using. The device is perhaps slow due to a mix of
discard+read+write requests that dm-integrity generates.

> If there is anything I can do to help feel free to let me know.
> Note that I am not subscribed to dm-devel, please CC me directly.
>
> Thanks,

Could you try it on other nvme disks?

Mikulas

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel