> Hi all,
>
> I have a two-node cluster. Every time I shut down one of the cluster nodes, the console of the other node prints these errors:
>
> SCSI error : <1 0 1 1> return code = 0x20000
> end_request: I/O error, dev sde, sector 33433224
> device-mapper: dm-multipath: Failing path 8:64.
> SCSI error : <1 0 1 1> return code = 0x20000
> end_request: I/O error, dev sde, sector 33886548
> SCSI error : <1 0 1 1> return code = 0x20000
> [...]
>
> and the services on that node restart.
>
> The topology is as follows: two SunFire X4200 servers, each equipped with two QLogic (Sun) HBAs, which lspci shows as:
>
> Fibre Channel: QLogic Corp. QLA6322 Fibre Channel Adapter (rev 03)
>
> They are connected through two SANbox2 FC switches (also from QLogic) to a Sun StorEdge 3510 RAID array.
>
> The cluster configuration consists of a mail service that mounts three GFS filesystems and then starts postfix and courier-imap.
>
> The problem seems to appear when the QLogic driver (qla6312) gets loaded or unloaded. I managed to reproduce it by running "modprobe -r qla6312" followed by "modprobe qla6312": the other node immediately starts spewing SCSI errors until the GFS filesystems hang and are withdrawn.
>
> Any idea whether this could be a GFS fault or only a driver problem? And if the latter, which mailing list should I post to?
>
> Thanks in advance
>
> --
> Claudio Tassini

Hi Claudio,

Have you set up any kind of zoning in the FC switches? Sometimes a driver may issue target resets to every single FC port it sees in the SAN and create this kind of trouble.

The current recommendation from most vendors is to set up a zone for each HBA + storage port combination, i.e. server1_hba1_arrayport1, server1_hba1_arrayport2, and so on. If each HBA is isolated in its own zone, it should never affect the other HBAs (a concrete layout is sketched below).

And if you still have the problem after setting up the zoning, it is very likely an FC switch issue.

Hope this helps,

Javier Peña

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
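To make the single-initiator zoning Javier recommends concrete, here is an illustrative layout for one of the two servers. All WWPNs below are invented for the example, and the actual commands for creating zones depend on the SANbox2 management interface, so treat this as a layout sketch rather than switch syntax:

    zone server1_hba1_arrayport1:
        21:00:00:e0:8b:00:00:01    (WWPN of server1 HBA1 - hypothetical)
        21:60:00:c0:ff:00:00:01    (WWPN of 3510 host port 1 - hypothetical)

    zone server1_hba1_arrayport2:
        21:00:00:e0:8b:00:00:01    (WWPN of server1 HBA1 - hypothetical)
        21:60:00:c0:ff:00:00:02    (WWPN of 3510 host port 2 - hypothetical)

The same pattern repeats for server1's second HBA and for both of server2's HBAs. With this scheme, a driver reload on one server can only cause resets inside that server's own zones, so the other server's paths to the 3510 should stay untouched.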
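Since the trigger is already known, one way to check whether zoning helps is to repeat Claudio's reproduction while watching the multipath state on the surviving node. A minimal sketch, assuming the stock dm-multipath tools are installed; device names and output format will differ per system:

    # On the surviving node: list multipath maps and per-path states
    # (all paths should remain active/ready throughout the test)
    multipath -ll

    # On the node under test: reload the HBA driver, as in the original report
    modprobe -r qla6312
    modprobe qla6312

    # Back on the surviving node: re-check path states and the kernel log
    multipath -ll
    dmesg | tail

If the paths on the surviving node stay up across the driver reload, the zoning change has isolated the HBAs as intended; if they still get failed, that points back at the switches or the array rather than GFS.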