On 4/8/22 17:10, Bob Pearson wrote:
> Bart,
>
> I finally was able to build a kernel with lockdep enabled correctly and saw the error that you and others reported.
> I am not familiar with lockdep output but I am guessing that it is reporting a mismatch between a _bh spinlock
> and a _irqsave spinlock (since those are the only two types used by the driver.)
>
> I went on campaign a while back to replace all the locks with _bh locks because I figured they would be
> faster than _irqsave locks and because the driver never touched a lock except from a verbs API call or from
> a tasklet (softirq.) As it turned out some code makes verbs API calls while in hardirq context which broke
> that assumption. So some of the locks were reverted back to irqsave locks which fixed those warnings.
>
> Now it is happening again. I did an experiment and went through the rxe driver and replaced all spinlocks
> with _irqsave locks. Now the lockdep splats have gone away and the srp/001 test reports success. BUT,
> it hangs and doesn't finish. If I try to run all the tests I get warnings about unable to remove the
> scsi_debug driver. I am able to remove the rdma_rxe driver and reload it. I am not seeing any errors in
> the rxe driver.
>
> Do you have any ideas what to look at next?
>
> Bob

Actually it doesn't hang forever but I get the following

......
[ 107.579576] sd 4:0:0:0: [sdb] Synchronizing SCSI cache
[ 291.970133] sd 4:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
[ 292.247547] rdma_rxe: unloaded

So it waits for about 3 minutes for something and then gives up.

Bob