Hi,

I have seen this issue on several machines with QLogic HBAs (error code 0x20000 is DID_BUS_BUSY). In my case, using device mapper multipath, the path that got the error was failed by dm-multipath and then reactivated, because the path checker reported it as up (the error was transient).

It looks like incorrect qla2xxx behavior, as reported in this knowledge base article:
http://kbase.redhat.com/faq/FAQ_46_9001.shtm
and also in this bug, where there is a proposed fix for RHEL4 U6:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=231319

I tested the workaround proposed in the kbase in a test environment. Unfortunately the issue was not present there, so I simulated it by forcing an HBA LIP through sysfs, but in that test the problem did not disappear.

Maybe your issue is the same.

Bye!

On Fri, 2007-08-10 at 09:58 -0400, FM wrote:
> Hello,
> All servers are RHEL 4.5
> SAN is HP EVA 4000
> we are using linux qla modules and multipathd
> cluster server have only one FC Card
>
>
> In the dmesg of servers connected to GFS we have a lot of :
> SCSI error : <0 0 1 1> return code = 0x20000
> end_request: I/O error, dev sdd, sector 37807111
>
> The cluster seems to work fine but I'd like to know if we can avoid this
> error.
>
> here is a multipathd -ll output :
>
> [root@como ~]# multipath -ll
> mpath1 (3600508b4001051e40000900000310000)
> [size=500 GB][features="1 queue_if_no_path"][hwhandler="0"]
> \_ round-robin 0 [prio=50][active]
>  \_ 0:0:0:1 sda 8:0 [active][ready]
> \_ round-robin 0 [prio=10][enabled]
>  \_ 0:0:1:1 sdd 8:48 [active][ready]
>
> mpath3 (3600508b4001051e400009000009e0000)
> [size=150 GB][features="1 queue_if_no_path"][hwhandler="0"]
> \_ round-robin 0 [prio=50][active]
>  \_ 0:0:1:2 sde 8:64 [active][ready]
> \_ round-robin 0 [prio=10][enabled]
>  \_ 0:0:0:2 sdb 8:16 [active][ready]
>
>
>
> and the device in the multipath.conf
>
> devices {
>         device {
>                 vendor "HP "
>                 product "HSV200 "
>                 path_grouping_policy group_by_prio
>                 getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
>                 path_checker tur
>                 path_selector "round-robin 0"
>                 prio_callout "/sbin/mpath_prio_alua %d"
>                 failback immediate
>                 no_path_retry 60
>         }
> }
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster

--
Simone Gotti
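
P.S. For anyone wondering how 0x20000 maps to DID_BUS_BUSY: as far as I know, the SCSI mid-layer packs four bytes into the return code, and the host byte sits in bits 16-23. The following is only a rough sketch to illustrate the decoding. The function names are made up for this example, and the table lists only the host-byte values I am sure of from the kernel's scsi.h, so treat it as an aid for reading dmesg and not as a complete reference.

```python
# Sketch: split a Linux SCSI "return code" into its four bytes and name
# the host byte. The helper names are hypothetical; the host-byte values
# are the DID_* constants from the kernel's include/scsi/scsi.h.

HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}


def decode_scsi_result(result):
    """Return the four component bytes of a 32-bit SCSI result value."""
    return {
        "status_byte": result & 0xFF,          # SCSI status from the device
        "msg_byte": (result >> 8) & 0xFF,      # message byte
        "host_byte": (result >> 16) & 0xFF,    # set by the HBA driver (DID_*)
        "driver_byte": (result >> 24) & 0xFF,  # set by the mid-layer
    }


def host_byte_name(result):
    """Name the host byte, or report it as unknown if not in the table."""
    host = (result >> 16) & 0xFF
    return HOST_BYTE_NAMES.get(host, "unknown (0x%02x)" % host)


if __name__ == "__main__":
    code = 0x20000  # value from "SCSI error : <0 0 1 1> return code = 0x20000"
    print(decode_scsi_result(code))
    print(host_byte_name(code))
```

With the value from the dmesg line above, only the host byte is non-zero, which suggests the error was raised by the HBA driver and not reported by the storage device itself.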