Re: [PATCH] nvme: mark ctrl as DEAD if removing from error recovery

Sagi Grimberg <sagi@xxxxxxxxxxx> · Mon, 10 Jul 2023 17:00:41 +0300

OK, I get your idea now, but it is basically not doable with the
current nvme error recovery approach.

Even if starting the freeze is moved to the reconnect stage, the queue
is still quiesced, so requests are kept in the block layer's internal
queue and can't enter the nvme fabrics .queue_rq().

See the error recovery path in nvme_[tcp|rdma]_error_recovery(): the
queue is unquiesced there for fast failover.

OK, sorry, I missed the nvme_unquiesce_io_queues() call in
nvme_tcp_error_recovery_work().
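
To make the quiesce/unquiesce point concrete, here is a minimal
userspace sketch. It is a toy model, not kernel code: struct toy_queue
and every toy_* function are hypothetical names made up for
illustration. It only shows that requests submitted while the queue is
quiesced are parked and never reach the driver callback, and that
unquiescing hands them to the driver so it can decide what to do with
them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_DEPTH 8

/* Hypothetical toy type; this is not a kernel structure. */
struct toy_queue {
	bool quiesced;                /* true: park requests, do not dispatch */
	int pending[TOY_QUEUE_DEPTH]; /* request ids parked while quiesced */
	size_t nr_pending;
	size_t nr_dispatched;         /* requests that reached the driver */
};

/* Stand-in for the driver's .queue_rq(): only counts the call. */
static void toy_queue_rq(struct toy_queue *q, int rq_id)
{
	(void)rq_id;
	q->nr_dispatched++;
}

/* Submit a request: parked if the queue is quiesced, else dispatched. */
static bool toy_submit(struct toy_queue *q, int rq_id)
{
	if (q->quiesced) {
		if (q->nr_pending == TOY_QUEUE_DEPTH)
			return false; /* toy limit reached */
		q->pending[q->nr_pending++] = rq_id;
		return true;
	}
	toy_queue_rq(q, rq_id);
	return true;
}

static void toy_quiesce(struct toy_queue *q)
{
	q->quiesced = true;
}

/* Unquiesce: everything that was parked now reaches the driver. */
static void toy_unquiesce(struct toy_queue *q)
{
	size_t i;

	assert(q->nr_pending <= TOY_QUEUE_DEPTH);
	q->quiesced = false;
	for (i = 0; i < q->nr_pending; i++)
		toy_queue_rq(q, q->pending[i]);
	q->nr_pending = 0;
}
```

In this model, two requests submitted after toy_quiesce() leave
nr_dispatched at 0, and toy_unquiesce() brings it to 2. What the
driver does with them at that point (fail over, requeue) is a separate
decision that the sketch deliberately leaves out.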

After moving start_freeze to nvme_tcp_reconnect_ctrl_work(), new
requests can enter the queue quickly, and these requests may not be
handled after reconnection is done because the queue topology may
change. That does not look like an issue for mpath, but it could be
trouble for !mpath, just like nvme-pci. I guess you don't care about
!mpath?

First, !mpath is less of a concern for fabrics, although it should still
work.

In the !mpath case, if the cpu topology changes, then this needs to be
addressed specifically, and it is a secondary issue, far less important
than not failing over quickly.

nvme_tcp_queue_rq() depends heavily on the ctrl state when handling
requests during error recovery, and this approach is actually fragile,
for example:

1) nvme_unquiesce_io_queues() has to be done after ctrl state is changed
to NVME_CTRL_CONNECTING, in nvme_tcp_error_recovery_work().

At the very least, we should be careful with this change.

queue_rq() depends on ctrl->state but also queue state, and the latter
is guaranteed to be stable across quiesce/unquiesce. This is already
the case today.
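
As a rough illustration of a queue_rq() decision that looks at both
the controller state and the per-queue state, here is a simplified
userspace sketch. All names (toy_ctrl_state, toy_status, toy_request,
toy_decide) are hypothetical, and the rules are an assumption chosen
to show the shape of the check; it does not reproduce the actual logic
in the nvme host driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
enum toy_ctrl_state {
	TOY_LIVE,
	TOY_RESETTING,
	TOY_CONNECTING,
	TOY_DELETING,
	TOY_DEAD,
};

enum toy_status {
	TOY_DISPATCH,  /* send the command on the wire */
	TOY_REQUEUE,   /* ask the block layer to retry later */
	TOY_FAIL_FAST, /* complete with a path error so mpath can fail over */
};

struct toy_request {
	bool mpath; /* request came through the multipath layer */
};

/*
 * Assumed rules for this sketch only:
 *  - dispatch when both the controller and the queue are live;
 *  - fail fast when the controller is going away or the request is
 *    multipath, so another path can be tried;
 *  - otherwise requeue and wait for the reconnect to finish.
 */
static enum toy_status toy_decide(enum toy_ctrl_state state, bool queue_live,
				  const struct toy_request *rq)
{
	assert(rq);

	if (state == TOY_LIVE && queue_live)
		return TOY_DISPATCH;
	if (state == TOY_DELETING || state == TOY_DEAD || rq->mpath)
		return TOY_FAIL_FAST;
	return TOY_REQUEUE;
}
```

With these assumed rules, an mpath request seen while the controller
is connecting is failed fast, while a !mpath request in the same
situation is requeued.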