On Fri, Nov 16 2018 at 2:25am -0500, Hannes Reinecke <hare@xxxxxxx> wrote: > On 11/15/18 6:46 PM, Mike Snitzer wrote: > >Whether or not ANA is present is a choice of the target implementation; > >the host (and whether it supports multipathing) has _zero_ influence on > >this. If the target declares a path as 'inaccessible' the path _is_ > >inaccessible to the host. As such, ANA support should be functional > >even if native multipathing is not. > > > >Introduce ability to always re-read ANA log page as required due to ANA > >error and make current ANA state available via sysfs -- even if native > >multipathing is disabled on the host (e.g. nvme_core.multipath=N). > > > >This affords userspace access to the current ANA state independent of > >which layer might be doing multipathing. It also allows multipath-tools > >to rely on the NVMe driver for ANA support while dm-multipath takes care > >of multipathing. > > > >While implementing these changes care was taken to preserve the exact > >ANA functionality and code sequence native multipathing has provided. > >This manifests as native multipathing's nvme_failover_req() being > >tweaked to call __nvme_update_ana() which was factored out to allow > >nvme_update_ana() to be called independent of nvme_failover_req(). > > > >And as always, if embedded NVMe users do not want any performance > >overhead associated with ANA or native NVMe multipathing they can > >disable CONFIG_NVME_MULTIPATH. > > > >Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx> > >--- > > drivers/nvme/host/core.c | 10 +++++---- > > drivers/nvme/host/multipath.c | 49 +++++++++++++++++++++++++++++++++---------- > > drivers/nvme/host/nvme.h | 4 ++++ > > 3 files changed, 48 insertions(+), 15 deletions(-) > > > >diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > >index fe957166c4a9..3df607905628 100644 > >--- a/drivers/nvme/host/core.c > >+++ b/drivers/nvme/host/core.c > >@@ -255,10 +255,12 @@ void nvme_complete_rq(struct request *req) > > nvme_req(req)->ctrl->comp_seen = true; > > if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) { > >- if ((req->cmd_flags & REQ_NVME_MPATH) && > >- blk_path_error(status)) { > >- nvme_failover_req(req); > >- return; > >+ if (blk_path_error(status)) { > >+ if (req->cmd_flags & REQ_NVME_MPATH) { > >+ nvme_failover_req(req); > >+ return; > >+ } > >+ nvme_update_ana(req); > > } > > if (!blk_queue_dying(req->q)) { ... > >diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c > >index 8e03cda770c5..0adbcff5fba2 100644 > >--- a/drivers/nvme/host/multipath.c > >+++ b/drivers/nvme/host/multipath.c > >@@ -58,25 +87,22 @@ void nvme_failover_req(struct request *req) > > spin_unlock_irqrestore(&ns->head->requeue_lock, flags); > > blk_mq_end_request(req, 0); > >- switch (status & 0x7ff) { > >- case NVME_SC_ANA_TRANSITION: > >- case NVME_SC_ANA_INACCESSIBLE: > >- case NVME_SC_ANA_PERSISTENT_LOSS: > >+ if (nvme_ana_error(status)) { > > /* > > * If we got back an ANA error we know the controller is alive, > > * but not ready to serve this namespaces. The spec suggests > > * we should update our general state here, but due to the fact > > * that the admin and I/O queues are not serialized that is > > * fundamentally racy. So instead just clear the current path, > >- * mark the the path as pending and kick of a re-read of the ANA > >+ * mark the path as pending and kick off a re-read of the ANA > > * log page ASAP. > > */ > > nvme_mpath_clear_current_path(ns); > >- if (ns->ctrl->ana_log_buf) { > >- set_bit(NVME_NS_ANA_PENDING, &ns->flags); > >- queue_work(nvme_wq, &ns->ctrl->ana_work); > >- } > >- break; > >+ __nvme_update_ana(ns); > >+ goto kick_requeue; > >+ } > >+ > >+ switch (status & 0x7ff) { > > case NVME_SC_HOST_PATH_ERROR: > > /* > > * Temporary transport disruption in talking to the controller. > >@@ -93,6 +119,7 @@ void nvme_failover_req(struct request *req) > > break; > > } > >+kick_requeue: > > kblockd_schedule_work(&ns->head->requeue_work); > > } > Doesn't the need to be protected by 'if (ns->head->disk)' or somesuch? No. nvme_failover_req() is only ever called by native multipathing; see nvme_complete_rq()'s check for req->cmd_flags & REQ_NVME_MPATH as the condition for calling nvme_complete_rq(). The previos RFC-style patch I posted muddled ANA and multipathing in nvme_update_ana() but this final patch submission was fixed because I saw a cleaner way forward by having nvme_failover_req() also do ANA work just like it always has -- albeit with new helpers that nvme_update_ana() also calls. Mike