Hi! I've run into a weird problem using multipath-tools with an IBM DS4300 turbo storage system.

              | Ctrl A |--ptp fc--| qla2400 HBA-->IBM x460 (first brick)  |
|DS4300(turbo)|                   |   IBM x460 (dual brick configuration) |
              | Ctrl B |--ptp fc--| qla2400 HBA-->IBM x460 (second brick) |

Operating system:   SLES9 SP3 x86 (32-bit)
HBA drivers:        native SUSE kernel driver (qla2400)
DS4300 target type: Linux (AVT is enabled)

The problem: unpredictable path failures detected by the TUR path checker, which result in I/O to the corresponding filesystem being suspended. The failed path is reinstated by multipathd the next time the tur checker runs (10 seconds later). Increasing the path-checking frequency (by reducing polling_interval to 2, as shown in the config below) doesn't help. In fact, it makes things worse: I hit a situation where all paths to a LUN were failed by the TUR checker at the same time. Without the "queue_if_no_path" feature, this leads to an I/O error being reported to the upper layer (the filesystem). Everything can work quite well for a day or so, and then *bum*.

In the DS4300 controller logs I see numerous AVT events happening on various LUNs from time to time. The interesting thing is that, according to the Linux logs, the majority of the volume transfers were not initiated by multipathd (in fact, they were not even detected by multipathd). Another strange thing is that many of the AVT transfers ended up on the same controller they started on (as far as I can tell). I have the full DS4300 controller log, but it is too big to paste here.

What I have already tried:
1) Replaced the 4G HBAs (qla2400) with 2G HBAs (qla2300). The problem remains.
2) IOZONE tests. They work great; no path failures were detected during the tests.
3) Played with polling_interval. It didn't help.

I have a similar configuration working well at another site. The differences between the two installations:
1) Single-brick configuration of the IBM x460.
2) Different HBA type (qla2300) installed in the host.
3) One HBA instead of two.
4) There is an FC switch between the DS4300 controllers and the HBA.
5) RHEL4 U4 x86_64 instead of SLES9 SP3 x86.

Questions:
1) What else can I try to resolve this problem?
2) Is it true that AVT mode cannot be used in a cluster environment (where two or more nodes access the same LUNs and can thus trigger AVT)?
3) Is there any hope (or need) to add an RDAC hardware handler to dm-multipath? It seems part of the work was already done by Mike Christie (http://www.redhat.com/archives/dm-devel/2005-October/msg00020.html). Do you have any plans to include this code? Is it in a usable state?
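Regarding question 3: if an RDAC handler does get merged, I imagine I would simply add a hardware_handler line to the device section I already use (shown in full below). This is only a rough, untested sketch -- the handler name "1 rdac" is just my guess at what such a handler would register as:

        device {
                vendor                  "IBM "
                product                 "1722-600 "
                # guessed handler name; not available in my multipath-tools build
                hardware_handler        "1 rdac"
                path_grouping_policy    group_by_prio
                path_checker            tur
                prio_callout            "/sbin/mpath_prio_tpc /dev/%n"
                failback                immediate
                no_path_retry           300
        }

My understanding (which may well be wrong) is that with a real handler doing the ownership transfers, the storage host type could be switched away from AVT, so LUNs would only move between controllers when dm-multipath explicitly asks for it.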
/var/log/messages
-------------
Dec 27 07:30:08 tpc1 multipathd: 8:48: tur checker reports path is down
Dec 27 07:30:08 tpc1 multipathd: checker failed path 8:48 in map oradata1
Dec 27 07:30:08 tpc1 kernel: device-mapper: dm-multipath: Failing path 8:48
Dec 27 07:30:08 tpc1 multipathd: 8:96: tur checker reports path is down
Dec 27 07:30:08 tpc1 multipathd: checker failed path 8:96 in map oradata1
Dec 27 07:30:08 tpc1 kernel: device-mapper: dm-multipath: Failing path 8:96
Dec 27 07:30:09 tpc1 multipathd: 8:112: tur checker reports path is down
Dec 27 07:30:09 tpc1 multipathd: checker failed path 8:112 in map oraredo
Dec 27 07:30:09 tpc1 kernel: device-mapper: dm-multipath: Failing path 8:112
Dec 27 07:30:09 tpc1 kernel: Buffer I/O error on device dm-9, logical block 27696
Dec 27 07:30:09 tpc1 kernel: lost page write due to I/O error on dm-9
Dec 27 07:30:09 tpc1 kernel: Aborting journal on device dm-9.
Dec 27 07:30:11 tpc1 kernel: ext3_abort called.
Dec 27 07:30:11 tpc1 kernel: EXT3-fs abort (device dm-9): ext3_journal_start: Detected aborted journal
Dec 27 07:30:11 tpc1 kernel: Remounting filesystem read-only
Dec 27 07:30:12 tpc1 multipathd: 8:64: tur checker reports path is down
Dec 27 07:30:12 tpc1 kernel: device-mapper: dm-multipath: Failing path 8:64
Dec 27 07:30:12 tpc1 multipathd: checker failed path 8:64 in map oraredo
Dec 27 07:30:13 tpc1 multipathd: 8:48: tur checker reports path is up
Dec 27 07:30:13 tpc1 multipathd: 8:48: reinstated
Dec 27 07:30:13 tpc1 multipathd: oradata1: switch to path group #2
Dec 27 07:30:13 tpc1 multipathd: oradata1: switch to path group #2
Dec 27 07:30:13 tpc1 multipathd: 8:96: tur checker reports path is up
Dec 27 07:30:13 tpc1 multipathd: 8:96: reinstated
Dec 27 07:30:13 tpc1 multipathd: oradata1: switch to path group #1
Dec 27 07:30:14 tpc1 multipathd: oradata1: switch to path group #1
-------------

multipath.conf:
---------------
defaults {
        udev_dir                        /dev
        multipath_tool                  "/sbin/multipath -v 0 -S"
        polling_interval                2
        default_path_grouping_policy    multibus
        default_getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
        rr_min_io                       100
        failback                        immediate
        no_path_retry                   fail
}

devnode_blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z][[0-9]*]"
        devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
        devnode sda
        devnode fd
        devnode hd
        devnode md
        devnode dm
        devnode sr
        devnode scd
        devnode st
        devnode ram
        devnode raw
        devnode loop
}

devices {
        device {
                vendor                  "IBM "
                product                 "1722-600 "
                path_grouping_policy    group_by_prio
                path_checker            tur
                path_selector           "round-robin 0"
                prio_callout            "/sbin/mpath_prio_tpc /dev/%n"
                failback                immediate
                rr_min_io               1000
                features                "1 queue_if_no_path"
                no_path_retry           300
        }
}

multipaths {
        multipath {
                wwid    3600a0b80001ff32a000020c2456bf8a0
                alias   oradata1
        }
        multipath {
                wwid    3600a0b80001ff3de000042ba456bfcbc
                alias   oradata2
        }
        multipath {
                wwid    3600a0b80001ff32a000020c5456bf952
                alias   oraredo
        }
        multipath {
                wwid    3600a0b80001ff32a000020c7456bf980
                alias   oraarch1
        }
        multipath {
                wwid    3600a0b80001ff3de000042bc456bfcf0
                alias   oraarch2
        }
}
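One more experiment I plan to run the next time multipathd fails a path: issue TEST UNIT READY against the underlying sd device by hand with sg_turs (from sg3_utils), to see whether the failure is reproducible outside the checker. A minimal sketch of what I have in mind (sdg is just an example path taken from the multipath -ll output below):

        # single TUR; exit status 0 means the device reported ready
        sg_turs /dev/sdg; echo "TUR exit status: $?"

        # poll the same path every 2 seconds and timestamp any failure
        while true; do
                sg_turs /dev/sdg > /dev/null 2>&1 || echo "sdg TUR failed at $(date)"
                sleep 2
        done

If this never fails while multipathd still marks the path down, I would start suspecting the checker or its timing rather than the transport.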
multipath -ll output (with no "queue_if_no_path" feature)
---------------------------------------------------------
dm names N
dm table oraarch2 N
dm table oraarch2 N
dm status oraarch2 N
dm info oraarch2 O
dm table oraredo N
dm table oraredo N
dm status oraredo N
dm info oraredo O
dm table oraarch1 N
dm table oraarch1 N
dm status oraarch1 N
dm info oraarch1 O
dm table oradata2 N
dm table oradata2 N
dm status oradata2 N
dm info oradata2 O
dm table oraarch1p1 N
dm table oradata1 N
dm table oradata1 N
dm status oradata1 N
dm info oradata1 O
dm table oraarch2p1 N
dm table oradata1p1 N
dm table oradata2p1 N
dm table oraredo1 N
oraarch2 (3600a0b80001ff3de000042bc456bfcf0)
[size=136 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][active]
 \_ 1:0:0:4 sdc 8:32  [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 4:0:0:4 sdk 8:160 [active][ready]

oraredo (3600a0b80001ff32a000020c5456bf952)
[size=136 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][active]
 \_ 3:0:0:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 2:0:0:1 sde 8:64  [active][ready]

oraarch1 (3600a0b80001ff32a000020c7456bf980)
[size=136 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][active]
 \_ 2:0:0:2 sdf 8:80  [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 3:0:0:2 sdi 8:128 [active][ready]

oradata2 (3600a0b80001ff3de000042ba456bfcbc)
[size=817 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][active]
 \_ 4:0:0:3 sdj 8:144 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 1:0:0:3 sdb 8:16  [active][ready]

oradata1 (3600a0b80001ff32a000020c2456bf8a0)
[size=681 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][enabled]
 \_ 3:0:0:0 sdg 8:96  [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 2:0:0:0 sdd 8:48  [active][ready]

Best Regards,
Yury.