On 2022/7/18 15:35, Ming Lei wrote: > Hi Christoph, > > On Sun, Jul 17, 2022 at 11:38:25PM -0700, Christoph Hellwig wrote: >> Hi Ming, >> >> it seems like ublk uses schedule_work to stop the device, which >> includes a del_gendisk. I'm a little fearful this will gets us into >> lockdep chains of death once syzbot or Tetsu notice it. > > ublksrv has two built-in tests(generic/001, generic/002) for covering > heavy io with device removal and killing ubq_daemon, not see lockdep > warning when running the two tests with lockdep enabled. > > Could you or Tetsu provide a bit more info about the warning? > >> >> That being said, I don't reall understand the design of >> ublk_daemon_monitor_work, which is only used to kick off other >> work to start with. > > If the ubq daemon becomes dead, ublk_daemon_monitor_work will be > scheduled for handling the error: abort pending io requests, and > start to delete disk. > > It has to be triggered when del_gendisk() is in-progress for making > forward progress. > > > Thanks, > Ming Hi Ming, Just to make sure I understand usage of ublk_daemon_monitor_work correctly. 1) For a dying ubq daemon, ublk_daemon_monitor_work schedule stop work first. 2) The stop work calls del_gendisk() and it is blocked because there are pending blk-mq requests(maybe handling in ublksrv target). 3) Meanwhile, the monitor work aborts all pending blk-mq IOs (with UBLK_IO_FLAG_ACTIVE unset) by blk_mq_end_request(req, BLK_STS_IOERR). 4) After all pending blk-mq reqs are aborted, del_gendisk() in stop work returns and /dev/ublkbX is removed. No more blk-mq reqs. 5) In stop work, cancel all queued(with UBLK_IO_FLAG_ACTIVE set) ublk IOs by io_uring_cmd_done(io->cmd, UBLK_IO_RES_ABORT, 0) and ublksrv won't issue sqes again. Hope I am correct and all of these works looks good to me. Besides, the tests(generic/001, generic/002) run successfully for me. Regards, Ziyang Zhang