On 12/09/2017 02:18 AM, James Smart wrote:
> NVME targets appear to randomly disconnect from the initiator
> when running heavy IO.
>
> The error is due to the host aggregate (across all controllers)
> io load was beyond the maximum exchange count for nvme on the
> adapter. The driver was properly returning a resource busy status,
> but the io load was so great heartbeat commands would be bounced
> and not have a successful retry within the fuzz amount for the
> nvme heartbeat (yes, a very high io load!). Thus the target was
> terminating the controller due to a keep alive failure.
>
> Resolve by reserving a few exchanges (by counters) which can be
> used when the adapter is out of normal exchanges and the command
> is a NVME heartbeat command. As counters are used, while the
> reserved command is outstanding, as soon as any other exchange
> completes, the counters are adjusted and the reserved count is
> replenished. The heartbeat completes execution in a normal fashion.
>
> Signed-off-by: Dick Kennedy <dick.kennedy@xxxxxxxxxxxx>
> Signed-off-by: James Smart <james.smart@xxxxxxxxxxxx>
> ---
>  drivers/scsi/lpfc/lpfc.h      |  2 ++
>  drivers/scsi/lpfc/lpfc_init.c | 16 ++++++++++-
>  drivers/scsi/lpfc/lpfc_nvme.c | 66 +++++++++++++++++++++++++++++--------------
>  drivers/scsi/lpfc/lpfc_nvme.h |  1 +
>  4 files changed, 63 insertions(+), 22 deletions(-)
>
Reviewed-by: Hannes Reinecke <hare@xxxxxxxx>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke
Teamlead Storage & Networking
hare@xxxxxxx
+49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
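
For readers following the thread, the counter-based reservation described in the quoted commit message can be illustrated roughly as below. This is only a sketch under stated assumptions, not the lpfc code from the patch: the struct, its fields, and the function names are hypothetical, and locking is omitted (a real driver would presumably use atomics or a lock around the counters). Normal commands may use exchanges only up to the total minus the reserve, while a keep-alive (heartbeat) command may dip into the reserve. Because a single in-use counter is tracked, the reserve is effectively replenished as soon as any outstanding exchange completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical exchange accounting; names are illustrative and are
 * not the fields or functions used by the lpfc driver.
 * Assumes reserved <= total and single-threaded access.
 */
struct xchg_pool {
	uint32_t total;     /* exchanges the adapter supports for NVMe */
	uint32_t reserved;  /* exchanges held back for keep-alive commands */
	uint32_t in_use;    /* exchanges currently outstanding */
};

/*
 * Try to take one exchange. Normal IO is limited to
 * (total - reserved); a keep-alive may use the full total.
 * Returns false when the caller should report "resource busy".
 */
static bool xchg_get(struct xchg_pool *p, bool is_keep_alive)
{
	uint32_t limit = is_keep_alive ? p->total : p->total - p->reserved;

	if (p->in_use >= limit)
		return false;
	p->in_use++;
	return true;
}

/*
 * Release one exchange on command completion. Lowering in_use is
 * what makes the reserved slot available again.
 */
static void xchg_put(struct xchg_pool *p)
{
	assert(p->in_use > 0);
	p->in_use--;
}
```

With a pool of 4 exchanges and 1 reserved, a fourth normal command is refused while a keep-alive still succeeds; once any command completes, the next keep-alive can again be admitted even though normal IO remains at its limit.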