Re: [PATCH V4 3/3] scsi: core: avoid to pre-allocate big chunk for sg list

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Apr 28, 2019 at 03:39:32PM +0800, Ming Lei wrote:
> Now scsi_mq_setup_tags() pre-allocates a big buffer for IO sg list,
> and the buffer size is scsi_mq_sgl_size() which depends on smaller
> value between shost->sg_tablesize and SG_CHUNK_SIZE.
> 
> Modern HBA's DMA is often capable of deadling with very big segment
> number, so scsi_mq_sgl_size() is often big. Suppose the max sg number
> of SG_CHUNK_SIZE is taken, scsi_mq_sgl_size() will be 4KB.
> 
> Then if one HBA has lots of queues, and each hw queue's depth is
> high, pre-allocation for sg list can consume huge memory.
> For example of lpfc, nr_hw_queues can be 70, each queue's depth
> can be 3781, so the pre-allocation for data sg list is 70*3781*2k
> =517MB for single HBA.
> 
> There is Red Hat internal report that scsi_debug based tests can't
> be run any more since legacy io path is killed because too big
> pre-allocation.
> 
> So switch to runtime allocation for sg list, meantime pre-allocate 2
> inline sg entries. This way has been applied to NVMe PCI for a while,
> so it should be fine for SCSI too. Also runtime sg entries allocation
> has verified and run always in the original legacy io path.
> 
> Not see performance effect in my big BS test on scsi_debug.
> 

This patch causes a variety of boot failures in -next. Typical failure
pattern is scsi hangs or failure to find a root file system. For example,
on alpha, trying to boot from usb:

scsi 0:0:0:0: Direct-Access     QEMU     QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
sd 0:0:0:0: Power-on or device reset occurred
sd 0:0:0:0: [sda] 20480 512-byte logical blocks: (10.5 MB/10.0 MiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] Attached SCSI disk
usb 1-1: reset full-speed USB device number 2 using ohci-pci
sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=0x00
sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 00 58 00 00 42 00
print_req_error: I/O error, dev sda, sector 88 flags 80000
EXT4-fs error (device sda): __ext4_get_inode_loc:4703: inode #2: block 44: comm
swapper: unable to read itable block
EXT4-fs (sda): get root inode failed
EXT4-fs (sda): mount failed
VFS: Cannot open root device "sda" or unknown-block(8,0): error -5

Similar problems are seen with other architectures.
Reverting the patch fixes the problem. Bisect log attached.

Guenter

---
# bad: [ae3cad8f39ccf8d31775d9737488bccf0e44d370] Add linux-next specific files for 20190603
# good: [f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a] Linux 5.2-rc3
git bisect start 'HEAD' 'v5.2-rc3'
# good: [8ff6f4c6e067a9d3f3bbacf02c4ea5eb81fe2c6a] Merge remote-tracking branch 'crypto/master'
git bisect good 8ff6f4c6e067a9d3f3bbacf02c4ea5eb81fe2c6a
# good: [6c93755861ce6a6dd904df9cdae9f08671132dbe] Merge remote-tracking branch 'iommu/next'
git bisect good 6c93755861ce6a6dd904df9cdae9f08671132dbe
# good: [1a567956cb3be5754d94ce9380a2151e57e204a7] Merge remote-tracking branch 'cgroup/for-next'
git bisect good 1a567956cb3be5754d94ce9380a2151e57e204a7
# bad: [a6878ca73cf30b83efbdfb1ecc443d7cfb2d8193] Merge remote-tracking branch 'rtc/rtc-next'
git bisect bad a6878ca73cf30b83efbdfb1ecc443d7cfb2d8193
# bad: [25e7f57cf9f86076704b543628a0f02d3d733726] Merge remote-tracking branch 'gpio/for-next'
git bisect bad 25e7f57cf9f86076704b543628a0f02d3d733726
# bad: [28e85db94534c92fa00d787264e3ea1843d1dc42] scsi: lpfc: Fix nvmet handling of received ABTS for unmapped frames
git bisect bad 28e85db94534c92fa00d787264e3ea1843d1dc42
# bad: [cf9eddf616bb03387e597629142b0be34111e8d0] scsi: hpsa: check for tag collision
git bisect bad cf9eddf616bb03387e597629142b0be34111e8d0
# good: [4e74166c52a836f05d4bd8270835703908b34d3e] scsi: libsas: switch sas_ata.[ch] to SPDX tags
git bisect good 4e74166c52a836f05d4bd8270835703908b34d3e
# good: [f186090846c29fd9760917fb3d01f095c39262e0] scsi: lib/sg_pool.c: improve APIs for allocating sg pool
git bisect good f186090846c29fd9760917fb3d01f095c39262e0
# bad: [12b6b55806928690359bb21de3a19199412289fd] scsi: sd: Inline sd_probe_part2()
git bisect bad 12b6b55806928690359bb21de3a19199412289fd
# bad: [c3288dd8c2328a74cb2157dff18d13f6a75cedd2] scsi: core: avoid pre-allocating big SGL for data
git bisect bad c3288dd8c2328a74cb2157dff18d13f6a75cedd2
# good: [0f0e744eae6c8af361d227d3a2286973351e5a2a] scsi: core: avoid pre-allocating big SGL for protection information
git bisect good 0f0e744eae6c8af361d227d3a2286973351e5a2a
# first bad commit: [c3288dd8c2328a74cb2157dff18d13f6a75cedd2] scsi: core: avoid pre-allocating big SGL for data



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux