On 2022/5/17 20:55, Ming Lei wrote:
> On Tue, May 17, 2022 at 06:00:57PM +0800, Ziyang Zhang wrote:
>> On 2022/5/17 13:53, Ming Lei wrote:
>>
>>> +
>>> +static void ubd_cancel_queue(struct ubd_queue *ubq)
>>> +{
>>> +	int i;
>>> +
>>> +	for (i = 0; i < ubq->q_depth; i++) {
>>> +		struct ubd_io *io = &ubq->ios[i];
>>> +
>>> +		if (io->flags & UBD_IO_FLAG_ACTIVE) {
>>> +			io->flags &= ~UBD_IO_FLAG_ACTIVE;
>>> +			io_uring_cmd_done(io->cmd, UBD_IO_RES_ABORT, 0);
>>> +		}
>>> +	}
>>> +}
>>
>> Hi Ming,
>>
>> When ubdsrv sends STOP_DEV and all active IOs in ubd_drv are done(UBD_IO_RES_ABORT),
>> there may be still some IOs handled by ubdsrv(UBD_IO_FLAG_ACTIVE not set).
>> When these IOs complete and return to ubd_drv, how to handle them?
>
> Either UBD_IO_COMMIT_AND_FETCH_REQ or UBD_IO_COMMIT_REQ will be sent to ubd_drv
> for completing these IOs. And finally ubd_cancel_dev() in ubd driver will
> cancel all pending io commands, so io_uring can be exited. I guess
> UBD_IO_COMMIT_REQ can be removed too.

Yes, I think UBD_IO_COMMIT_REQ can be removed.

>
>> I find that UBD_IO_FETCH_REQ are still set,
>> so will these IOs be issued to ubdsrv again or canceled?
>> (I see ubd_drv fails IOs when the daemon is dying
>> but maybe here the daemon is still alive)
>
> If daemon is alive, ubd_drv will rely on ubq_daemon for completing
> all inflight IOs. Otherwise, the monitor work will be triggered for
> completing/failing inflight IOs.
>
> The mechanism is actually very simple:
>
> static void ubd_stop_dev(struct ubd_device *ub)
> {
> 	mutex_lock(&ub->mutex);
> 	if (!disk_live(ub->ub_disk))
> 		goto unlock;
>
> 	del_gendisk(ub->ub_disk);	// drain & wait in-flight IOs
> 	ub->dev_info.state = UBD_S_DEV_DEAD;
> 	ub->dev_info.ubdsrv_pid = -1;
> 	ubd_cancel_dev(ub);	//No IO is possible now, so cancel pending io commands
>  unlock:
> 	mutex_unlock(&ub->mutex);
> 	cancel_delayed_work_sync(&ub->monitor_work);
> }
>
> When waiting for IO completion in del_gendisk(), in case that ubq_daemon
> is exiting/dying, monitor work will be triggered to call ubd_abort_queue() to
> fail in-flight requests for making forward progress. ubd_abort_queue() may
> looks a bit tricky to try using task work for aborting request, that
> is just for sync with ubd_rq_task_work_fn().
>

Thanks for the explanation; this part really confused me. :)
But I am still concerned about the complexity of handling an exiting/dying
ubq_daemon and aborting queues, and I'm trying to find a simpler way...

Another question: using task_work functions requires UBD to be built into
the kernel. However, users may want to use an external UBD module.
Shall we discuss this now?

Regards,
Zhang
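
To make the ordering in the quoted ubd_stop_dev() concrete, here is a
minimal userspace sketch of it. This is not driver code: every model_* and
MODEL_* name is hypothetical, the numeric values are placeholders, and the
blocking drain in del_gendisk() is replaced by an assumed "-1 means it would
still have to wait" return. Only the cancel loop follows the quoted
ubd_cancel_queue().

```c
#include <assert.h>

/*
 * Userspace model only. Every model_* / MODEL_* name is hypothetical,
 * and the numeric values are placeholders, not the driver's.
 */
#define MODEL_Q_DEPTH		4
#define MODEL_IO_FLAG_ACTIVE	0x01	/* io command is parked in the driver */
#define MODEL_IO_RES_NONE	0
#define MODEL_IO_RES_ABORT	(-1)

struct model_io {
	unsigned int flags;
	int res;	/* MODEL_IO_RES_ABORT once the command was cancelled */
};

struct model_queue {
	int q_depth;
	struct model_io ios[MODEL_Q_DEPTH];
};

/* All slots start parked, as if the daemon had issued a fetch per tag. */
static void model_queue_init(struct model_queue *q)
{
	int i;

	q->q_depth = MODEL_Q_DEPTH;
	for (i = 0; i < q->q_depth; i++) {
		q->ios[i].flags = MODEL_IO_FLAG_ACTIVE;
		q->ios[i].res = MODEL_IO_RES_NONE;
	}
}

/* A request is handed to the daemon: the slot is no longer ACTIVE. */
static int model_dispatch(struct model_queue *q, int tag)
{
	if (tag < 0 || tag >= q->q_depth ||
	    !(q->ios[tag].flags & MODEL_IO_FLAG_ACTIVE))
		return -1;
	q->ios[tag].flags &= ~MODEL_IO_FLAG_ACTIVE;
	return 0;
}

/* The daemon commits its result and parks the command again. */
static int model_commit_and_fetch(struct model_queue *q, int tag)
{
	if (tag < 0 || tag >= q->q_depth ||
	    (q->ios[tag].flags & MODEL_IO_FLAG_ACTIVE) ||
	    q->ios[tag].res == MODEL_IO_RES_ABORT)
		return -1;
	q->ios[tag].flags |= MODEL_IO_FLAG_ACTIVE;
	return 0;
}

/* Slots currently owned by the daemon (not ACTIVE, not cancelled). */
static int model_inflight(const struct model_queue *q)
{
	int i, n = 0;

	for (i = 0; i < q->q_depth; i++)
		if (!(q->ios[i].flags & MODEL_IO_FLAG_ACTIVE) &&
		    q->ios[i].res != MODEL_IO_RES_ABORT)
			n++;
	return n;
}

/* Same loop shape as the quoted ubd_cancel_queue(); returns the count. */
static int model_cancel_queue(struct model_queue *q)
{
	int i, cancelled = 0;

	for (i = 0; i < q->q_depth; i++) {
		struct model_io *io = &q->ios[i];

		if (io->flags & MODEL_IO_FLAG_ACTIVE) {
			io->flags &= ~MODEL_IO_FLAG_ACTIVE;
			io->res = MODEL_IO_RES_ABORT;
			cancelled++;
		}
	}
	return cancelled;
}

/* Drain first, then cancel. -1 stands in for "the drain would still wait". */
static int model_stop_dev(struct model_queue *q)
{
	int cancelled;

	if (model_inflight(q) > 0)
		return -1;
	cancelled = model_cancel_queue(q);
	assert(model_inflight(q) == 0);
	return cancelled;
}
```

In this model, a slot that has been dispatched to the daemon keeps
model_stop_dev() from cancelling anything until model_commit_and_fetch()
parks the command again. That is the case asked about above: an IO still
handled by ubdsrv is not touched by the cancel loop, it comes back through a
commit first and is cancelled afterwards. The dying-daemon path through the
monitor work is deliberately left out of the sketch.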