Re: mpt3sas fails to allocate budget_map and detects no devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 06, 2022 at 10:26:01AM +0000, Martin Wilck wrote:
> On Thu, 2022-01-06 at 11:03 +0800, Ming Lei wrote:
> > On Wed, Jan 05, 2022 at 06:00:41PM +0000, Martin Wilck wrote:
> > > Hello Ming, Sreekanth,
> > > 
> > > I'm observing a problem where mpt3sas fails to allocate the
> > > budget_map
> > > for any SCSI device, because attempted allocation is larger than
> > > the
> > > maximum possible. The issue is caused by the logic used in
> > > 020b0f0a3192
> > > ("scsi: core: Replace sdev->device_busy with sbitmap") 
> > > to calculate the bitmap size. This is observed with 5.16-rc8.
> > > 
> > > The controller at hand has properties can_queue=29865 and
> > > cmd_per_lun=7. The way these parameters are used in
> > > scsi_alloc_sdev()->
> > 
> > That two parameter looks bad, can_queue is too big, however
> > cmd_per_lun
> > is so small.
> 
> Right. @Sreekanth, can you comment on that?
> 
> > > sbitmap_init_node(), this results in an sbitmap with 29865 maps,
> > > where
> > > only a single bit is used per map. On x86_64, this results in an
> > > attempt to allocate 29865 * 192 =  5734080 bytes for the sbitmap,
> > > which
> > > is larger than  PAGE_SIZE * (1 << (MAX_ORDER - 1)), and fails.
> > 
> > Bart has posted one patch for fixing the issue:
> > 
> > https://lore.kernel.org/linux-scsi/20211203231950.193369-2-bvanassche@xxxxxxx/
> > 
> > but it isn't merged yet.
> 
> Thanks a lot for pointing that out. I had a faint remembrance of it but
> failed to locate it yesterday.
> 
> This fixes the allocation failure, but not the fact that for
> cmd_per_lun = 7 (hard-coded in mpt3sas) only a single bit per sbitmap
> is used and the resulting relation between allocated and used memory is
> ridiculous. We'd still allocate 200kiB or 48 contiguous pages, out of
> which no more than 2048 bits / 256 bytes would be used (*); iow at
> least 99.87% of the allocated memory would be wasted. In the default
> case (queue depth left unchanged), we'd use only 2 bytes, and waste
> 99.9989%.

The problem is that mpt3sas uses very weird .can_queue and .cmd_per_lun,
which shouldn't be common.

If .cmd_per_lun is too small, we have to spread the bits among as much
as possible cache lines for avoiding cache line ping-pong, and this kind
of sbitmap principle isn't wrong.

Also we may reduce max queue depth of 1024 too, usually we don't need so
big queue depth to saturate one SSD, and the number should be based on
nvme's setting.

> 
> When calculating the sbitmap shift, would you agree that it makes sense
> to start with the desired number of separate cache lines, as my
> proposed patch did? The core sbitmap code assumes that 4-8 separate
> cache lines are a reasonable value for moderate (4 <= d <= 512) bitmap
> depth. I believe we should aim for that in the SCSI code, too.

Here the actual depth is .cmd_per_lun(7) which is too small, so each
bit may have to consume one standalone cache line.

> Admittedly, the backside of that is that in the default case (queue
> depth unchaged), only a single cache line would be used in the mpt3sas
> scenario.
> 
> Alternatively, we could inhibit increasing the device queue depth above
> a certain multiple of cmd_per_lun, and size the sbitmap by that limit.
> My gut feeling says that if cmd_per_lun == 7, it makes sense to use a
> limit of 32. That way the bitmap would fit into 2 pages; we'd still
> waste a lot, but it wouldn't matter much in absolute numbers. 
> Thus we could forbid increasing the queue depth to more than the power
> of 2 above 4*cmd_per_lun. Does this make sense?

I'd suggest to fix mpt3sas for avoiding this memory waste.

> 
> Regards
> Martin
> 
> (*) this calculation ignores the use of sb->map[i].depth. Taking it
> into account wouldn't change much.

Yeah, I have actually one patch to remove sb->map[].depth, which can
reduce each map's size by 1/3.



Thanks,
Ming




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux