On Tue, Jul 30, 2024 at 12:25 PM <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
>
> The patch below does not apply to the 6.10-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@xxxxxxxxxxxxxxx>.
>
> To reproduce the conflict and resubmit, you may use the following commands:
>
> git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
> git checkout FETCH_HEAD
> git cherry-pick -x 2237ceb71f89837ac47c5dce2aaa2c2b3a337a3c
> # <resolve conflicts, build, test, etc.>
> git commit -s
> git send-email --to '<stable@xxxxxxxxxxxxxxx>' --in-reply-to '2024073021-strut-specimen-8aad@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
>
> Possible dependencies:
>
> 2237ceb71f89 ("rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings")
>
> thanks,
>
> greg k-h
>
> ------------------ original commit in Linus's tree ------------------
>
> From 2237ceb71f89837ac47c5dce2aaa2c2b3a337a3c Mon Sep 17 00:00:00 2001
> From: Ilya Dryomov <idryomov@xxxxxxxxx>
> Date: Tue, 23 Jul 2024 18:07:59 +0200
> Subject: [PATCH] rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive
>  mappings
>
> Every time a watch is reestablished after getting lost, we need to
> update the cookie which involves quiescing exclusive lock.  For this,
> we transition from RBD_LOCK_STATE_LOCKED to RBD_LOCK_STATE_QUIESCING
> roughly for the duration of rbd_reacquire_lock() call.  If the mapping
> is exclusive and I/O happens to arrive in this time window, it's failed
> with EROFS (later translated to EIO) based on the wrong assumption in
> rbd_img_exclusive_lock() -- "lock got released?" check there stopped
> making sense with commit a2b1da09793d ("rbd: lock should be quiesced on
> reacquire").
>
> To make it worse, any such I/O is added to the acquiring list before
> EROFS is returned and this sets up for violating rbd_lock_del_request()
> precondition that the request is either on the running list or not on
> any list at all -- see commit ded080c86b3f ("rbd: don't move requests
> to the running list on errors").  rbd_lock_del_request() ends up
> processing these requests as if they were on the running list which
> screws up quiescing_wait completion counter and ultimately leads to
>
>     rbd_assert(!completion_done(&rbd_dev->quiescing_wait));
>
> being triggered on the next watch error.
>
> Cc: stable@xxxxxxxxxxxxxxx # 06ef84c4e9c4: rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait

Hi Greg,

Please grab commit f5c466a0fdb2 ("rbd: rename RBD_LOCK_STATE_RELEASING
and releasing_wait") as a prerequisite for this one.  I forgot to
adjust the SHA in the tag that specifies it after a rebase, sorry.
This applies to all stable kernels.

Thanks,

                Ilya
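
For anyone preparing the backport by hand instead, the pick order would presumably be the prerequisite first and then the fix. The following is only a sketch built from the commands in the quoted message. It assumes a clone that already contains Linus's tree so that both SHAs resolve, and that both commits then apply cleanly to linux-6.10.y, which has not been verified here. The function name backport_rbd_fix and the GIT override variable are hypothetical, added only to allow a dry run.

```shell
#!/bin/sh
# Sketch only: cherry-pick the prerequisite rename before the fix itself.
# backport_rbd_fix and the GIT variable are illustrative names, not part
# of any stable-tree tooling. Set GIT=echo to print the steps instead of
# running them.
backport_rbd_fix() {
    ${GIT:-git} fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y &&
    ${GIT:-git} checkout FETCH_HEAD &&
    # Prerequisite: "rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait"
    ${GIT:-git} cherry-pick -x f5c466a0fdb2 &&
    # Fix: "rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings"
    ${GIT:-git} cherry-pick -x 2237ceb71f89837ac47c5dce2aaa2c2b3a337a3c
}
```

Running "GIT=echo backport_rbd_fix" lists the four steps without touching the tree. Any conflict resolution, build and test steps from the quoted instructions still apply after each pick.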