On Mon, Nov 27, 2023 at 9:29 AM Lennart Poettering <lennart@xxxxxxxxxxxxxx> wrote: > On So, 26.11.23 00:39, Richard Weinberger (richard.weinberger@xxxxxxxxx) wrote: > > > Hello! > > > > After upgrading my main test worker to a recent distribution, the UBI > > test suite [0] fails at various places with -EBUSY. > > The reason is that these tests create and remove UBI volumes rapidly. > > A typical test sequence is as follows: > > 1. creation of /dev/ubi0_0 > > 2. some exclusive operation, such as atomic update or volume resize on > > /dev/ubi0_0 > > 3. removal of /dev/ubi0_0 > > > > Both steps 2 and 3 can fail with -EBUSY because the udev worker still > > holds a file descriptor to /dev/ubi0_0. > > Hmm, I have no experience with UBI, but are you sure we open that? why > would we? are such devices analyzed by blkid? We generally don't open > device nodes unless we have a reason to, such as doing blkid on it or > so. I think it came via commit: dbbf424c8b77 ("rules: ubi mtd - add link to named partitions (#6750)") Here is the bpftrace output of a failed mkvol_basic run. The test created a new volume and tried to delete it via ioctl(). Right after creating the volume, udev started inspecting it and mkvol_basic was unable to delete it because the delete operation needs exclusive ownership. mkvol_basic(530): open() = /dev/ubi0 mkvol_basic(530): ioctl(cmd: 1074032385) (udev-worker)(531): open UBI volume 0 = 0xffff96644533ac80 mkvol_basic(530): open UBI volume 0 = 0xfffffffffffffff0 mkvol_basic(530): failed ioctl() = -16 (udev-worker)(531): closing UBI volume 0xffff96644533ac80 > What precisely fails for you? the open()? or some operation on the > opened fd? All of that. I depends on the test. Basically every test assumes that it has the full ownership of a volume it has created. > > > > FWIW, the problem can also get triggered using UBI's shell utilities > > if the system is fast enough, e.g. > > # ubimkvol -N testv -S 50 -n 0 /dev/ubi0 && ubirmvol -n 0 /dev/ubi0 > > Volume ID 0, size 50 LEBs (793600 bytes, 775.0 KiB), LEB size 15872 > > bytes (15.5 KiB), dynamic, name "testv", alignment 1 > > ubirmvol: error!: cannot UBI remove volume > > error 16 (Device or resource busy) > > > > Instead of adding a retry loop around -EBUSY, I believe the best > > solution is to add code to wait for udev. > > For example, having a udev barrier in ubi_mkvol() and ubi_rmvol() [1] > > seems like a good idea to me. > > For block devices we implement this: > > https://systemd.io/BLOCK_DEVICE_LOCKING > > I understand UBI aren't block devices though? Exactly, UBI volumes are character devices, just like MTDs. > If they conceptually should be considered block device equivalents, we > might want to extend the udev logic to such UBI devices too. Patches > welcome. > > We provide "udevadm lock" to lock a block device according to this > scheme from shell scripts. > > > What function from libsystemd do you suggest for waiting until udev is > > done with rule processing? > > My naive approach, using udev_queue_is_empty() and > > sd_device_get_is_initialized(), does not resolve all failures so far. > > Firstly, udev_queue_is_empty() doesn't seem to be exported by > > libsystemd. I have open-coded it as: > > static int udev_queue_is_empty(void) { > > return access("/run/udev/queue", F_OK) < 0 ? > > (errno == ENOENT ? true : -errno) : false; > > } > > This doesn't really work. udev might still process the device in the > background. I see. -- Thanks, //richard