Re: Large ib_srp configuration cannot allocate FMR pool space and fails to map SCSI host

On 06/29/2017 01:10 PM, Laurence Oberman wrote:
Hello

This issue is seen on RHEL and on upstream 4.12-rc5 (the version that was tested).

The customer has a large configuration that I cannot reproduce, so I am asking whether anybody else is aware of this.

FMR pool allocation fails, and the SCSI host numbers then keep incrementing:

Jun 20 14:01:42 xxxxxx kernel: fmr_pool: fmr_create failed for FMR 3809
Jun 20 14:01:42 xxxxxx kernel: fmr_pool: fmr_create failed for FMR 2005
Jun 20 14:01:42 xxxxxx kernel: scsi host7: ib_srp: FMR pool allocation failed (-12)
Jun 20 14:01:42 xxxxxx kernel: scsi host8: ib_srp: FMR pool allocation failed (-12)

This repeats over and over.

* 7 pairs of enclosures / each with two controllers
* Each with an expansion tray
* Each side on the controller is going to its own IB switch
* Each node is connected to both switches, each running its own subnet manager

Prior versions of RHEL, e.g. 7.2, which lack the newer code that also exists upstream, work.
Both RHEL 7.3 and upstream are affected.

The configuration in place is:

for srp.conf
a max_sect=65535,max_cmd_per_lun=254,queue_size=254

for ib_srp
options ib_srp cmd_sg_entries=255 indirect_sg_entries=2048 prefer_fr=N

It fails here, in:

struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd             *pd,
                                       struct ib_fmr_pool_param *params)
{
..
..

                        if (IS_ERR(fmr->fmr)) {
                                pr_warn(PFX "fmr_create failed for FMR %d\n",
                                        i);
                                kfree(fmr);
                                goto out_fail;
                        }

Many Thanks
Laurence


Replying to my own post:

Here is the upstream messaging:

Jun 30 11:50:49 xxxxx kernel: ib_srp: mlx4_0: ib_alloc_mr() failed. Try to reduce max_cmd_per_lun, max_sect or ch_count
Jun 30 11:50:49 xxxxx kernel: ib_srp: mlx4_0: ib_alloc_mr() failed. Try to reduce max_cmd_per_lun, max_sect or ch_count
Jun 30 11:50:49 xxxxx kernel: scsi host8: ib_srp: FR pool allocation failed (-12)
Jun 30 11:50:49 xxxxx kernel: ib_srp: mlx4_0: ib_alloc_mr() failed. Try to reduce max_cmd_per_lun, max_sect or ch_count

I have asked them to reduce the current settings and will see whether that prevents the allocation failures.

I have to say this is a large configuration of rports, and I have never had to reduce these settings when testing with a smaller configuration.
The 14 available remote storage ports are many more than I have ever seen.

Thanks
Laurence