On Oct 04, 2022 / 20:10, Tetsuo Handa wrote:
> On 2022/10/04 19:44, Shinichiro Kawasaki wrote:
> > Any comment on this patch will be appreciated. If this action approach is ok,
> > I'll post as a formal patch for review.
>
> I don't want you to make such change.
>
> I saw a case where real deadlock was hidden by lockdep_set_novalidate_class().
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=62ebaf2f9261cd2367ae928a39343fcdbfe9f877
>   https://groups.google.com/g/syzkaller-bugs/c/Uj9LqEUCwac/m/BhdTjWhNAQAJ
>
> In general, this kind of deadlock possibility had better be addressed by bringing
> problematic locks out of cancel{,_delayed}_work_sync() section.
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=a91b750fd6629354460282bbf5146c01b05c4859

Thanks for the comment. Then, I think the question is how to move the
blk_sync_queue() call out of the ctrl->namespaces_rwsem critical section in
nvme_sync_io_queues():

void nvme_sync_io_queues(struct nvme_ctrl *ctrl)
{
	struct nvme_ns *ns;

	down_read(&ctrl->namespaces_rwsem);
	list_for_each_entry(ns, &ctrl->namespaces, list)
		blk_sync_queue(ns->queue);
	up_read(&ctrl->namespaces_rwsem);
}

I'm not yet sure how we can do it. I guess we need to copy the
ctrl->namespaces list to a temporary array so that the namespaces can be
referenced outside the critical section. We also need to hold a kref on each
ns so that it is not freed in the meantime. Will try to implement tomorrow.

-- 
Shin'ichiro Kawasaki