Dear maintainers of the nbd subsystem,

A user has come across an issue that causes the nbd module to hang after a
disconnect when a write has been made to a QEMU qcow2 image file, with
qemu-nbd acting as the server.

The issue is easily reproducible with the following:

Ubuntu 20.04, 22.04, or Fedora 36
Linux 5.18-rc2 or earlier (tested 5.18-rc2, 5.15, 5.4, 4.15)
QEMU 6.2 or earlier

Instructions to reproduce:
==========================

$ sudo apt install qemu-utils
$ cat << EOF >> reproducer.sh
#!/bin/bash
sudo modprobe nbd
while :
do
    qemu-img create -f qcow2 foo.img 500M
    sudo qemu-nbd --disconnect /dev/nbd15 || true
    sudo qemu-nbd --connect=/dev/nbd15 --cache=writeback --format=qcow2 foo.img
    sudo mkfs.ext4 -L root -O "^64bit" -E nodiscard /dev/nbd15
    sudo qemu-nbd --disconnect /dev/nbd15
done
EOF
$ chmod +x reproducer.sh
$ ./reproducer.sh

On Ubuntu, the terminal will pause within a minute or two, and dmesg will
report a lot of I/O errors, followed by hung task timeouts. On Fedora it
takes a little longer, but it breaks in the same way within 10 minutes.

An example kernel log is below:

https://paste.ubuntu.com/p/5ZjC5b8MR7/

Debugging done:
===============

Looking at syslog, it seems systemd-udevd gets stuck and enters the D state:

systemd-udevd[419]: nbd15: Worker [2004] processing SEQNUM=5661 is taking a long time

$ ps aux
...
419 1194 root D 0.1 systemd-udevd

I rebooted, and disabled systemd-udevd and its sockets with:

$ sudo systemctl stop systemd-udevd.service
$ sudo systemctl stop systemd-udevd-control.socket
$ sudo systemctl stop systemd-udevd-kernel.socket

When running the reproducer again, everything works fine and the nbd
subsystem does not hang.

I then compared the udev rules shipped in Ubuntu 18.04, where the issue does
not occur, with those in 20.04, where it does, and came across:

/usr/lib/udev/rules.d/60-block.rules

In 18.04:

# watch metadata changes, caused by tools closing the device node which was opened for writing
ACTION!="remove", SUBSYSTEM=="block", KERNEL=="loop*|nvme*|sd*|vd*|xvd*|pmem*|mmcblk*", OPTIONS+="watch"

In 20.04:

# watch metadata changes, caused by tools closing the device node which was opened for writing
ACTION!="remove", SUBSYSTEM=="block", \
KERNEL=="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*|ubi*|scm*|pmem*|nbd*|zd*", \
OPTIONS+="watch"

The difference is that OPTIONS+="watch" is now applied to nbd* devices for
every event other than remove. When I deleted nbd* from the rule and ran the
reproducer again, everything works smoothly.

Looking at the manpage for udev:

> watch
>     Watch the device node with inotify; when the node is closed after being
>     opened for writing, a change uevent is synthesized.
>
> nowatch
>     Disable the watching of a device node with inotify.

It appears that there is some sort of problem where systemd-udevd uses
inotify to watch for updates to the device metadata, and this blocks a
subsequent disconnect request, causing it to fail with:

block nbd15: Send disconnect failed -32

After which we start seeing stuck requests:

block nbd15: Possible stuck request 000000007fcf62ba: control (read@523915264,24576B). Runtime 30 seconds

All userspace calls to the nbd module hang, and the system has to be rebooted.
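For what it's worth, the watch behaviour itself can be observed from
userspace. A minimal sketch, assuming the inotify-tools package is installed
and reusing /dev/nbd15 and foo.img from the reproducer above: when mkfs
closes the device node, inotifywait should report a CLOSE_WRITE event and
udevadm monitor should show the synthesized change uevent, which I suspect is
what collides with the disconnect that follows.

# Terminal 1: watch the device node for the close-after-write that the
# udev "watch" option reacts to
$ sudo inotifywait -m -e close_write /dev/nbd15

# Terminal 2: watch for the synthesized "change" uevent on the block device
$ sudo udevadm monitor --udev --subsystem-match=block

# Terminal 3: run a single connect/format/disconnect cycle from the reproducer
$ sudo qemu-nbd --connect=/dev/nbd15 --cache=writeback --format=qcow2 foo.img
$ sudo mkfs.ext4 -L root -O "^64bit" -E nodiscard /dev/nbd15
$ sudo qemu-nbd --disconnect /dev/nbd15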
Workaround:
===========

We can work around the issue by adding a higher-priority udev rule that
stops watching nbd* devices:

$ cat << EOF | sudo tee /etc/udev/rules.d/97-nbd-device.rules
# Disable inotify watching of change events for NBD devices
ACTION=="add|change", KERNEL=="nbd*", OPTIONS:="nowatch"
EOF
$ sudo udevadm control --reload-rules
$ sudo udevadm trigger

Help on debugging the problem:
==============================

I need some help debugging the problem, as I am not quite sure how to trace
the interactions between inotify and nbd. I am happy to help debug the
issue, or to try any patches that gather debugging data or any potential
fixes.

Thanks,

Matthew