[Bug 79901] New: Extremely slow boot on Promise VTrak E610f due to sd_mod RSOC usage

https://bugzilla.kernel.org/show_bug.cgi?id=79901

            Bug ID: 79901
           Summary: Extremely slow boot on Promise VTrak E610f due to
                    sd_mod RSOC usage
           Product: IO/Storage
           Version: 2.5
    Kernel Version: 3.14.7
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: SCSI
          Assignee: linux-scsi@xxxxxxxxxxxxxxx
          Reporter: rraptorr@xxxxxxxxxxxx
        Regression: No

Recently I've started upgrading all of my machines to kernel 3.14 (from Debian
wheezy backports, to be precise). Mostly there were no problems, but I've
stumbled upon weird behavior on Fibre Channel servers (QLogic cards inside HP
blades) using Promise VTrak E610f arrays.

As soon as the SCSI subsystem tries to detect partitions, a lot of SCSI errors
are reported. The system stalls (though the initramfs stays responsive) for
about 20-30 minutes, depending on the number of arrays and LUNs. After that,
disk detection finishes and the system continues booting as usual. Everything
works perfectly afterwards.

I've spent some time fiddling with qla2xxx driver versions, SCSI scanning
options and anything else I could think of. Finally, I was able to find the
culprit: sd_mod's use of scsi_report_opcode(). This function is used to
determine whether the disk supports the WRITE SAME command. It does so by
issuing the REPORT SUPPORTED OPERATION CODES (RSOC) command. Unfortunately, it
seems the Promise VTrak E610f really, really does not like RSOC. As soon as
RSOC is issued, the array stalls for a while, then the kernel tries to abort
the command, and finally it has to reset the port. Fortunately, the array
starts working again after the reset. I've also verified this behavior with
the sg_opcodes utility.
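
A minimal sketch of the probe described above, for illustration only. This is
not the kernel code: the helper names are hypothetical, and the CDB layout
(MAINTENANCE IN 0xA3, service action 0x0C, "one command" reporting option) and
the SUPPORT field encoding are assumptions based on the SPC command
description and should be checked against the spec.

```python
import struct

MAINTENANCE_IN = 0xA3               # opcode that carries RSOC as a service action
SA_REPORT_SUPPORTED_OPCODES = 0x0C  # RSOC service action
WRITE_SAME_10 = 0x41
WRITE_SAME_16 = 0x93


def build_rsoc_cdb(opcode, alloc_len, reporting_options=1):
    """Build a 12-byte RSOC CDB asking about one opcode (hypothetical helper)."""
    return struct.pack(
        ">BBBBHIBB",
        MAINTENANCE_IN,
        SA_REPORT_SUPPORTED_OPCODES,
        reporting_options & 0x07,  # 1 = report a single command format
        opcode,                    # requested operation code
        0,                         # requested service action (unused here)
        alloc_len,                 # allocation length, big-endian
        0,                         # reserved
        0,                         # control
    )


def opcode_supported(response):
    """Return True if a one-command RSOC response reports the opcode as supported."""
    # Assumption: SUPPORT is the low 3 bits of byte 1, and 0b011 means
    # "supported in conformance with the standard".
    return len(response) >= 2 and (response[1] & 0x07) == 0x03
```

A well-behaved target answers such a CDB quickly, either with parameter data
or with a check condition; the stall described above is what makes this probe
a problem on this array.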

Commit that introduced RSOC usage in sd_mod:
http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=98dcc2946adbe4349ef1ef9b99873b912831edd4
Removing it fixes the issue.

I'm not sure what the correct way to fix this is, as I'm not very familiar
with the SCSI spec. If RSOC support cannot be reliably determined, then some
kind of blacklist should probably be introduced.
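
One way to picture the blacklist idea is a lookup on the device's INQUIRY
vendor and model strings before RSOC is ever sent. The sketch below is
hypothetical throughout: the table name, the function name and the exact
vendor/model strings are assumptions, not values taken from this report or
from the kernel.

```python
# Hypothetical table of (vendor prefix, model prefix) pairs that must not be
# sent RSOC. The strings the array actually reports are assumed here.
NO_RSOC_DEVICES = [
    ("Promise", "VTrak E610f"),
]


def skip_rsoc(vendor, model, blacklist=NO_RSOC_DEVICES):
    """Return True if the device matches a blacklist entry (prefix match)."""
    # INQUIRY strings are space-padded to a fixed width, so strip them first.
    vendor = vendor.strip()
    model = model.strip()
    return any(vendor.startswith(v) and model.startswith(m)
               for v, m in blacklist)
```

With such a check, a matching device would be treated as if it did not support
RSOC, and the WRITE SAME decision would fall back to whatever the driver does
without that information.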

As a workaround, I've modified the qla2xxx driver to set the 'no_write_same'
flag. While not directly related, it forces sd_mod not to issue RSOC, and it
is easier for me to ship a Debian package with a single modified driver (I'd
prefer not to manage my own kernel packages).
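
The effect of the workaround can be modelled as a simple gate: when the host
driver declares that WRITE SAME must not be used, the probe is skipped
entirely. This is a toy model under that assumption, not the sd_mod source;
the function and parameter names are hypothetical.

```python
def probe_write_same(host_no_write_same, issue_rsoc):
    """Decide whether WRITE SAME may be used (toy model of the gating).

    issue_rsoc is a callable standing in for the RSOC query; it is never
    called when the host has opted out of WRITE SAME.
    """
    if host_no_write_same:
        return False       # no RSOC is sent, so the array never stalls
    return bool(issue_rsoc())
```

The cost of this approach is that WRITE SAME is disabled for every device
behind the modified driver, including devices that would handle it correctly.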




