Re: [PATCH 11/11] IB/srp: Prevent mapping failures

----- Original Message -----
> From: "Bart Van Assche" <bart.vanassche@xxxxxxxxxxx>
> To: "Christoph Hellwig" <hch@xxxxxx>
> Cc: "Doug Ledford" <dledford@xxxxxxxxxx>, "Sagi Grimberg" <sagi@xxxxxxxxxxx>, "Laurence Oberman"
> <loberman@xxxxxxxxxx>, linux-rdma@xxxxxxxxxxxxxxx
> Sent: Tuesday, May 3, 2016 5:13:32 PM
> Subject: Re: [PATCH 11/11] IB/srp: Prevent mapping failures
> 
> On 05/03/2016 02:33 AM, Christoph Hellwig wrote:
> > On Fri, Apr 22, 2016 at 02:16:31PM -0700, Bart Van Assche wrote:
> >> If both max_sectors and the queue_depth are high enough it can
> >> happen that the MR pool is depleted temporarily. This causes
> >> the SRP initiator to report mapping failures. Although the SRP
> >> initiator recovers from such mapping failures, prevent that
> >> this can happen by limiting max_sectors.
> > 
> > FYI, even with this patch I see tons of errors like:
> > 
> > [ 2237.161106] scsi host7: ib_srp: Failed to map data (-12)
> 
> That's unintended. I can reproduce this and will analyze this further.
>  
> >> +		/*
> >> +		 * FR and FMR can only map one HCA page per entry. If the
> >> +		 * start address is not aligned on a HCA page boundary two
> >> +		 * entries will be used for the head and the tail although
> >> +		 * these two entries combined contain at most one HCA page of
> >> +		 * data. Hence the "- 1" in the calculation below.
> >> +		 */
> >> +		max_max_sectors = (srp_dev->max_pages_per_mr - 1) <<
> >> +				  (ilog2(srp_dev->mr_page_size) - 9);
> >> +		if (target->scsi_host->max_sectors > max_max_sectors) {
> >> +			shost_printk(KERN_WARNING, target->scsi_host,
> >> +				     PFX "Reducing max_sectors from %d to %d\n",
> >> +				     target->scsi_host->max_sectors,
> >> +				     max_max_sectors);
> >> +			target->scsi_host->max_sectors = max_max_sectors;
> >> +		}
> > 
> > I don't think there is any good reason to printk a warning here -
> > limited hardware is a totally normal thing.  E.g. if we merge
> > your RDMA/CM support and someone runs SRP on chelsio hardware they'd
> > probably hit this all the time..
> 
> Are you sure? What I see in the v4.6-rc6 tree seems to indicate that Chelsio
> hardware supports large page lists:
> 
> $ git grep -nHw T[34]_MAX_MR_SIZE
> drivers/infiniband/hw/cxgb3/cxio_hal.h:58:#define T3_MAX_MR_SIZE 0x100000000ULL
> drivers/infiniband/hw/cxgb3/iwch.c:125:	rnicp->attr.max_mr_size = T3_MAX_MR_SIZE;
> drivers/infiniband/hw/cxgb4/provider.c:328:	props->max_mr_size = T4_MAX_MR_SIZE;
> drivers/infiniband/hw/cxgb4/t4.h:41:#define T4_MAX_MR_SIZE (~0ULL)
> 
> Bart.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

With the latest patches from Bart, if I keep max_sectors_kb throttled as the patch sets it,
I seem to avoid the "ib_srp: Failed to map data (-12)" messages.
Without that patch I can set max_sectors_kb to 4M and then immediately see the failures.
Granted, I did not leave it running all of last weekend.
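
For anyone following the arithmetic in the quoted hunk, here is a rough
userspace sketch of the limit calculation. It only mirrors the expression
from the patch; ilog2_u32() and srp_max_sectors_limit() are hypothetical
helper names of mine (the kernel uses ilog2() and computes this inline), and
the input values mentioned below are purely illustrative, not read from any HCA.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's ilog2(): floor(log2(v)) for v > 0. */
static unsigned int ilog2_u32(uint32_t v)
{
	unsigned int log = 0;

	while (v >>= 1)
		log++;
	return log;
}

/*
 * Mirrors the expression in the quoted patch:
 *   (max_pages_per_mr - 1) << (ilog2(mr_page_size) - 9)
 * One page is subtracted because an unaligned start address can spend two
 * entries on the head and the tail. The result is in 512-byte sectors.
 */
static uint32_t srp_max_sectors_limit(uint32_t max_pages_per_mr,
				      uint32_t mr_page_size)
{
	/* The shift requires a power-of-two page size of at least 512 bytes. */
	assert(max_pages_per_mr >= 1);
	assert(mr_page_size >= 512 &&
	       (mr_page_size & (mr_page_size - 1)) == 0);

	return (max_pages_per_mr - 1) << (ilog2_u32(mr_page_size) - 9);
}
```

Assuming, just as an example, 512 pages per MR and a 4096-byte HCA page,
srp_max_sectors_limit(512, 4096) gives 511 << 3 = 4088 sectors, i.e. 2044 KiB.
Under those assumed values a 4M setting would exceed the limit, but the real
numbers depend on what the HCA reports.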

I will go back and retest, since I already gave a Tested-by.

Christoph: I think I have fixed my mailer to your liking now. I hope so. :)

Thanks
Laurence


