> On Dec 16, 2022, at 13:31, Michael Trimarchi <michael@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi Neil
>
> On Tue, Apr 26, 2022 at 12:29:55PM +1000, NeilBrown wrote:
>> On Thu, 21 Apr 2022, Naresh Kamboju wrote:
>>> On Mon, 18 Apr 2022 at 14:09, Naresh Kamboju <naresh.kamboju@xxxxxxxxxx> wrote:
>>>>
>>>> On Thu, 14 Apr 2022 at 18:45, Greg Kroah-Hartman
>>>> <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>> This is the start of the stable review cycle for the 4.19.238 release.
>>>>> There are 338 patches in this series, all will be posted as a response
>>>>> to this one. If anyone has any issues with these being applied, please
>>>>> let me know.
>>>>>
>>>>> Responses should be made by Sat, 16 Apr 2022 11:07:54 +0000.
>>>>> Anything received after that time might be too late.
>>>>>
>>>>> The whole patch series can be found in one patch at:
>>>>> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.238-rc1.gz
>>>>> or in the git tree and branch at:
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
>>>>> and the diffstat can be found below.
>>>>>
>>>>> thanks,
>>>>>
>>>>> greg k-h
>>>>
>>>>
>>>> Following kernel warning noticed on arm64 Juno-r2 while booting
>>>> stable-rc 4.19.238. Here is the full test log link [1].
>>>>
>>>> [ 0.000000] Booting Linux on physical CPU 0x0000000100 [0x410fd033]
>>>> [ 0.000000] Linux version 4.19.238 (tuxmake@tuxmake) (gcc version
>>>> 11.2.0 (Debian 11.2.0-18)) #1 SMP PREEMPT @1650206156
>>>> [ 0.000000] Machine model: ARM Juno development board (r2)
>>>> <trim>
>>>> [ 18.499895] ================================
>>>> [ 18.504172] WARNING: inconsistent lock state
>>>> [ 18.508451] 4.19.238 #1 Not tainted
>>>> [ 18.511944] --------------------------------
>>>> [ 18.516222] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
>>>> [ 18.522242] kworker/u12:3/60 [HC0[0]:SC0[0]:HE1:SE1] takes:
>>>> [ 18.527826] (____ptrval____)
>>>> (&(&xprt->transport_lock)->rlock){+.?.}, at: xprt_destroy+0x70/0xe0
>>>> [ 18.536648] {IN-SOFTIRQ-W} state was registered at:
>>>> [ 18.541543] lock_acquire+0xc8/0x23c
>>
>> Prior to Linux 5.3, ->transport_lock needs spin_lock_bh() and
>> spin_unlock_bh().
>>
>
> We get the same deadlock or similar one and we think that
> can be connected to this thread on 4.19.243. For us is a bit
> difficult to hit but we are going to apply this change
>
> net: sunrpc: Fix deadlock in xprt_destroy
>
> Prior to Linux 5.3, ->transport_lock needs spin_lock_bh() and
> spin_unlock_bh().
>
> Signed-off-by: Michael Trimarchi <michael@xxxxxxxxxxxxxxxxxxxx>
> ---
>  net/sunrpc/xprt.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index d05fa7c36d00..b1abf4848bbc 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -1550,9 +1550,9 @@ static void xprt_destroy(struct rpc_xprt *xprt)
>  	 * is cleared. We use ->transport_lock to ensure the mod_timer()
>  	 * can only run *before* del_time_sync(), never after.
>  	 */
> -	spin_lock(&xprt->transport_lock);
> +	spin_lock_bh(&xprt->transport_lock);
>  	del_timer_sync(&xprt->timer);
> -	spin_unlock(&xprt->transport_lock);
> +	spin_unlock_bh(&xprt->transport_lock);
>
>  	/*
>  	 * Destroy sockets etc from the system workqueue so they can
>

Agreed. When backporting to kernels that are older than 5.3.x, the
transport lock needs to be taken using the bh-safe spin lock variants.

Reviewed-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>

_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx