When there are a huge number of paths (> 10000), the amount of time that the
checkerloop can hold the vecs lock while checking the paths can become large
enough that it starves other vecs lock users. If path checking takes long
enough, it's possible that the uxlsnr threads will never run. To deal with
this, this patchset makes it possible to drop the vecs lock while checking
the paths, and then reacquire it and continue with the next path to check.
(A rough illustrative sketch of this scheme appears after the diffstat below.)

My choices of checking for waiters only once every 128 paths, and of
interrupting only when path checking has taken more than a second, are
arbitrary. I didn't want to slow down path checking in the common case where
this isn't an issue, and I wanted to avoid path checking getting starved by
other vecs->lock users.

Having the checkerloop wait for 10000 nsec was based on my own testing with
a setup using 4K multipath devices with 4 paths each. This was almost always
long enough for the uevent or uxlsnr client to grab the vecs lock, but I'm
not sure how dependent this is on the details of the system. For instance,
with my setup it never took more than 20 seconds to check the paths, and
looping through all the paths usually took well under 10 seconds, most often
under 5. I would only occasionally run into situations where a uxlsnr client
would time out.

Benjamin Marzinski (6):
  multipathd: Use regular pthread_mutex_t for waiter_lock
  multipathd: track waiters for mutex_lock
  multipathd: Occasionally allow waiters to interrupt checking paths
  multipathd: allow uxlsnr clients to interrupt checking paths
  multipathd: fix uxlsnr timeout
  multipathd: Don't check if timespec.tv_sec is zero

 libmultipath/lock.h    |  16 +++++
 libmultipath/structs.h |   1 +
 multipathd/main.c      | 144 +++++++++++++++++++++++++----------------
 multipathd/uxlsnr.c    |  23 +++++--
 multipathd/uxlsnr.h    |   1 +
 multipathd/waiter.c    |  14 ++--
 6 files changed, 132 insertions(+), 67 deletions(-)

-- 
2.17.2
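
P.S. For anyone who wants the shape of the idea without reading the patches,
here is a minimal compilable sketch of the scheme described above: a lock
wrapper that counts waiters, and a checker loop that tests for waiters every
128 paths and, once checking has run for more than a second, drops the lock,
pauses for 10000 nsec so a waiter can grab it, and then reacquires it. Only
the constants (128 paths, 1 second, 10000 nsec) come from this cover letter;
the identifiers (tracked_lock, check_path, check_all_paths) are made up for
the sketch, restarting the elapsed-time clock after reacquiring the lock is
my own assumption, and the real patches differ in detail.

/* Sketch only -- hypothetical names, not the code from these patches. */
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

struct tracked_lock {
    pthread_mutex_t mutex;
    atomic_int waiters;     /* threads currently blocked in tracked_lock() */
};

static void tracked_lock(struct tracked_lock *l)
{
    atomic_fetch_add(&l->waiters, 1);
    pthread_mutex_lock(&l->mutex);
    atomic_fetch_sub(&l->waiters, 1);
}

static void tracked_unlock(struct tracked_lock *l)
{
    pthread_mutex_unlock(&l->mutex);
}

struct path;                        /* opaque here */
void check_path(struct path *pp);   /* stand-in for the real path checker */

static void check_all_paths(struct tracked_lock *vecs_lock,
                            struct path **paths, int n_paths)
{
    const struct timespec pause = { .tv_sec = 0, .tv_nsec = 10000 };
    struct timespec start, now;
    int i;

    clock_gettime(CLOCK_MONOTONIC, &start);
    tracked_lock(vecs_lock);
    for (i = 0; i < n_paths; i++) {
        check_path(paths[i]);

        /* Only consider yielding the lock every 128th path. */
        if ((i & 127) != 127)
            continue;
        clock_gettime(CLOCK_MONOTONIC, &now);
        /* Coarse, second-granularity elapsed-time test. */
        if (now.tv_sec - start.tv_sec < 1)
            continue;
        if (atomic_load(&vecs_lock->waiters) > 0) {
            tracked_unlock(vecs_lock);
            nanosleep(&pause, NULL);    /* let a waiter grab the mutex */
            tracked_lock(vecs_lock);
            start = now;    /* assumption: restart the one-second clock */
        }
    }
    tracked_unlock(vecs_lock);
}

The point of checking waiters only on every 128th path is that the common
case (no contention) adds nothing to the per-path cost beyond a counter
test, while the one-second floor keeps other vecs->lock users from starving
the checker in turn.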