----- Original Message -----
> From: "Bart Van Assche" <bart.vanassche@xxxxxxxxxxx>
> To: "Christoph Hellwig" <hch@xxxxxx>
> Cc: "Doug Ledford" <dledford@xxxxxxxxxx>, "Sagi Grimberg" <sagi@xxxxxxxxxxx>, "Laurence Oberman"
> <loberman@xxxxxxxxxx>, linux-rdma@xxxxxxxxxxxxxxx
> Sent: Tuesday, May 3, 2016 5:13:32 PM
> Subject: Re: [PATCH 11/11] IB/srp: Prevent mapping failures
>
> On 05/03/2016 02:33 AM, Christoph Hellwig wrote:
> > On Fri, Apr 22, 2016 at 02:16:31PM -0700, Bart Van Assche wrote:
> >> If both max_sectors and the queue_depth are high enough it can
> >> happen that the MR pool is depleted temporarily. This causes
> >> the SRP initiator to report mapping failures. Although the SRP
> >> initiator recovers from such mapping failures, prevent that
> >> this can happen by limiting max_sectors.
> >
> > FYI, even with this patch I see tons of errors like:
> >
> > [ 2237.161106] scsi host7: ib_srp: Failed to map data (-12)
>
> That's unintended. I can reproduce this and will analyze this further.
>
> >> +	/*
> >> +	 * FR and FMR can only map one HCA page per entry. If the
> >> +	 * start address is not aligned on a HCA page boundary two
> >> +	 * entries will be used for the head and the tail although
> >> +	 * these two entries combined contain at most one HCA page of
> >> +	 * data. Hence the "- 1" in the calculation below.
> >> +	 */
> >> +	max_max_sectors = (srp_dev->max_pages_per_mr - 1) <<
> >> +		(ilog2(srp_dev->mr_page_size) - 9);
> >> +	if (target->scsi_host->max_sectors > max_max_sectors) {
> >> +		shost_printk(KERN_WARNING, target->scsi_host,
> >> +			     PFX "Reducing max_sectors from %d to %d\n",
> >> +			     target->scsi_host->max_sectors,
> >> +			     max_max_sectors);
> >> +		target->scsi_host->max_sectors = max_max_sectors;
> >> +	}
> >
> > I don't think there is any good reason to printk a warning here -
> > limited hardware is a totally normal thing. E.g. if we merge
> > your RDMA/CM support and someone runs SRP on chelsio hardware they'd
> > probably hit this all the time..
>
> Are you sure? What I see in the v4.6-rc6 tree seems to indicate that Chelsio
> hardware supports large page lists:
>
> $ git grep -nHw T[34]_MAX_MR_SIZE
> drivers/infiniband/hw/cxgb3/cxio_hal.h:58:#define T3_MAX_MR_SIZE
> 0x100000000ULL
> drivers/infiniband/hw/cxgb3/iwch.c:125: rnicp->attr.max_mr_size =
> T3_MAX_MR_SIZE;
> drivers/infiniband/hw/cxgb4/provider.c:328: props->max_mr_size =
> T4_MAX_MR_SIZE;
> drivers/infiniband/hw/cxgb4/t4.h:41:#define T4_MAX_MR_SIZE (~0ULL)
>
> Bart.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

With the latest patches from Bart, if I keep the max_sectors_kb throttled per
the patch, I seem to avoid the "ib_srp: Failed to map data (-12)" messages.

Without that patch I can set it to 4M and then immediately will see the
failures.

Granted I did not leave it running all of last weekend. I will go back to it
as I already gave a tested by:

Christoph: I think I fixed my mailer per your liking now, I hope so. :)

Thanks
Laurence
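
For anyone who wants to check the numbers, here is a rough userspace sketch of the limit computed in the quoted hunk. It is not the driver code: ilog2_u32() is a hypothetical stand-in for the kernel's ilog2(), srp_max_max_sectors() is a made-up helper name, and the example values used below (512 page-list entries, 4096-byte HCA pages) are assumptions, not values taken from any particular HCA.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's ilog2(): floor(log2(v)), v > 0. */
static inline unsigned int ilog2_u32(uint32_t v)
{
	unsigned int r = 0;

	assert(v != 0);
	while (v >>= 1)
		r++;
	return r;
}

/*
 * Largest transfer, in 512-byte sectors, that fits in one MR when one
 * page-list entry is held back for an unaligned start address (the
 * "- 1" from the patch comment). The helper name is illustrative only.
 */
static inline uint32_t srp_max_max_sectors(uint32_t max_pages_per_mr,
					   uint32_t mr_page_size)
{
	assert(max_pages_per_mr >= 1);
	assert(mr_page_size >= 512);	/* shift below must not go negative */
	return (max_pages_per_mr - 1) << (ilog2_u32(mr_page_size) - 9);
}
```

With the assumed 512 entries and 4096-byte pages, srp_max_max_sectors(512, 4096) gives 511 << 3 = 4088 sectors, i.e. 2044 KiB, so under those assumptions a 4M max_sectors_kb setting would exceed what a single MR can map.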