On Tue, Jul 30, 2024 at 12:54:52PM +0200, Ilya Dryomov wrote:
> On Tue, Jul 30, 2024 at 12:25 PM <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> >
> > The patch below does not apply to the 6.10-stable tree.
> > If someone wants it applied there, or to any other stable or longterm
> > tree, then please email the backport, including the original git commit
> > id to <stable@xxxxxxxxxxxxxxx>.
> >
> > To reproduce the conflict and resubmit, you may use the following commands:
> >
> > git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
> > git checkout FETCH_HEAD
> > git cherry-pick -x 2237ceb71f89837ac47c5dce2aaa2c2b3a337a3c
> > # <resolve conflicts, build, test, etc.>
> > git commit -s
> > git send-email --to '<stable@xxxxxxxxxxxxxxx>' --in-reply-to '2024073021-strut-specimen-8aad@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
> >
> > Possible dependencies:
> >
> > 2237ceb71f89 ("rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings")
> >
> > thanks,
> >
> > greg k-h
> >
> > ------------------ original commit in Linus's tree ------------------
> >
> > From 2237ceb71f89837ac47c5dce2aaa2c2b3a337a3c Mon Sep 17 00:00:00 2001
> > From: Ilya Dryomov <idryomov@xxxxxxxxx>
> > Date: Tue, 23 Jul 2024 18:07:59 +0200
> > Subject: [PATCH] rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive
> >  mappings
> >
> > Every time a watch is reestablished after getting lost, we need to
> > update the cookie which involves quiescing exclusive lock. For this,
> > we transition from RBD_LOCK_STATE_LOCKED to RBD_LOCK_STATE_QUIESCING
> > roughly for the duration of rbd_reacquire_lock() call. If the mapping
> > is exclusive and I/O happens to arrive in this time window, it's failed
> > with EROFS (later translated to EIO) based on the wrong assumption in
> > rbd_img_exclusive_lock() -- "lock got released?" check there stopped
> > making sense with commit a2b1da09793d ("rbd: lock should be quiesced on
> > reacquire").
> >
> > To make it worse, any such I/O is added to the acquiring list before
> > EROFS is returned and this sets up for violating rbd_lock_del_request()
> > precondition that the request is either on the running list or not on
> > any list at all -- see commit ded080c86b3f ("rbd: don't move requests
> > to the running list on errors"). rbd_lock_del_request() ends up
> > processing these requests as if they were on the running list which
> > screws up quiescing_wait completion counter and ultimately leads to
> >
> >     rbd_assert(!completion_done(&rbd_dev->quiescing_wait));
> >
> > being triggered on the next watch error.
> >
> > Cc: stable@xxxxxxxxxxxxxxx # 06ef84c4e9c4: rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
>
> Hi Greg,
>
> Please grab commit f5c466a0fdb2 ("rbd: rename RBD_LOCK_STATE_RELEASING
> and releasing_wait") as a prerequisite for this one. I forgot to adjust
> the SHA in the tag that specifies it after a rebase, sorry.
>
> This applies to all stable kernels.

Now done, thanks. I was wondering about that invalid sha1, odd that the
linux-next scripts didn't catch it :(

greg k-h
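
The stale SHA in the prerequisite tag is the kind of thing that can be checked mechanically before sending. What follows is only a minimal sketch under stated assumptions, not the linux-next scripts or any existing kernel tooling: the helper names `prereq_tags` and `prereq_matches` are hypothetical, and the only tag form handled is the `Cc: stable@... # <sha>: <subject>` line quoted above.

```shell
#!/bin/sh
# Hypothetical helpers; not part of any kernel or linux-next script.

# Read a commit message on stdin and print "<sha> <subject>" for every
# prerequisite tag of the form:
#   Cc: stable@... # <sha>: <subject>
prereq_tags() {
    sed -n 's/^Cc: *stable@[^ ]* *# *\([0-9a-f]\{7,40\}\): *\(.*\)$/\1 \2/p'
}

# Return 0 only if $1 names a commit in the current repository and that
# commit's subject line equals $2. An unknown SHA makes git fail, so the
# function returns 1 in that case.
prereq_matches() {
    sha=$1
    want=$2
    got=$(git log -1 --format=%s "$sha^{commit}" 2>/dev/null) || return 1
    [ "$got" = "$want" ]
}
```

Run inside a kernel tree, something like `git log -1 --format=%B 2237ceb71f89 | prereq_tags | while read -r sha subject; do prereq_matches "$sha" "$subject" || echo "bad prerequisite: $sha"; done` would be expected to flag a tag whose SHA no longer resolves after a rebase, which appears to be the situation with 06ef84c4e9c4 here.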