On 2022-09-01 18:49, Guoqing Jiang wrote:
>
>
> On 9/2/22 2:41 AM, Logan Gunthorpe wrote:
>> Hi,
>>
>> On 2022-08-29 07:15, Yu Kuai wrote:
>>> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>>>
>>> Currently, wait_barrier() will hold 'resync_lock' to read
>>> 'conf->barrier', and io can't be dispatched until 'barrier' is dropped.
>>>
>>> Since holding the 'barrier' is not common, convert 'resync_lock' to use
>>> seqlock so that holding lock can be avoided in fast path.
>>>
>>> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
>> I've found some lockdep issues starting with this patch in md-next while
>> running mdadm tests (specifically 00raid10 when run about 10 times in a
>> row).
>>
>> I've seen a couple different lock dep errors. The first seems to be
>> reproducible on this patch, then it possibly changes to the second on
>> subsequent patches. Not sure exactly.
>
> That's why I said "try mdadm test suites too to avoid regression." ...

You may have to run it multiple times, a single run tends not to catch
all errors. I had to loop the noted test 10 times to be sure I hit this
every time when I did the simple bisect.

And ensure that all the debug options are on when you run it (take a
look at the Kernel Hacking section in menuconfig). You won't hit this
bug without at least CONFIG_PROVE_LOCKING=y.

Logan
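
The two steps above (confirm the lockdep option is enabled, then loop the
test) could be scripted roughly as follows. This is only a sketch, not
part of mdadm or the kernel tree: the helper names enabled_options(),
run_repeatedly() and check_and_loop() are hypothetical, the path to the
kernel .config and the test command are left to the caller, and it
assumes the test command exits non-zero on failure. A lockdep report
lands in the kernel log, so whether it changes the exit status depends
on the test harness; checking dmesg after the loop is still advisable.

```python
import subprocess
import sys


def enabled_options(config_text):
    """Return the set of CONFIG_* names set to 'y' in kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return enabled


def run_repeatedly(cmd, runs=10):
    """Run cmd `runs` times; return the 1-based indices of failing runs."""
    failures = []
    for i in range(1, runs + 1):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failures.append(i)
    return failures


def check_and_loop(config_path, cmd, runs=10):
    """Refuse to run without CONFIG_PROVE_LOCKING=y, then loop the test."""
    with open(config_path) as f:
        if "CONFIG_PROVE_LOCKING" not in enabled_options(f.read()):
            sys.exit("CONFIG_PROVE_LOCKING=y is not set in " + config_path)
    return run_repeatedly(cmd, runs)
```

Calling check_and_loop() with the .config of the running kernel and
whatever command runs the 00raid10 test returns the list of failing
iterations; an empty list after 10 runs only means the command itself
never failed.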