I've got a Sun StorageTek 6140 (an Engenio 3994 with a Sun sticker), which is an RDAC active/passive array. When I configure it to use RDAC mode and set up my host (2.6.23-rc3, x86_64) to use the rdac hardware handler, it doesn't really seem to work. I yank the two paths to the active controller using "echo 1 > /sys/block/$dev/device/delete" while generating heavy I/O and wait for a pg switch to happen, but all I get in the logs is the following: Aug 27 16:17:45 atalanta multipathd: sdj: remove path (uevent) Aug 27 16:17:45 atalanta kernel: [264107.296729] sd 4:0:1:1: [sdj] Synchronizing SCSI cache Aug 27 16:17:45 atalanta kernel: [264107.296894] device-mapper: multipath rdac: using RDAC command with timeout 6000 Aug 27 16:17:45 atalanta kernel: [264107.297271] device-mapper: multipath: Failing path 8:144. Aug 27 16:17:47 atalanta multipathd: mysql: load table [0 41943040 multipath 0 1 rdac 2 1 round-robin 0 1 1 8:96 1000 round-robin 0 2 1 8:64 1000 8:112 1000] Aug 27 16:18:10 atalanta multipathd: sdg: remove path (uevent) Aug 27 16:18:10 atalanta kernel: [264131.733301] sd 3:0:1:1: [sdg] Synchronizing SCSI cache Aug 27 16:18:10 atalanta kernel: [264131.733435] device-mapper: multipath rdac: using RDAC command with timeout 6000 Aug 27 16:18:10 atalanta kernel: [264131.759010] device-mapper: multipath: Failing path 8:96. Aug 27 16:18:10 atalanta kernel: [264131.829770] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:64 Aug 27 16:19:12 atalanta kernel: [264193.753980] device-mapper: multipath: Failing path 8:64. Aug 27 16:19:12 atalanta kernel: [264193.754944] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:112 Aug 27 16:20:14 atalanta kernel: [264255.632988] device-mapper: multipath: Failing path 8:112. Aug 27 16:20:14 atalanta kernel: [264255.633032] printk: 12 messages suppressed. Aug 27 16:20:14 atalanta kernel: [264255.633035] Buffer I/O error on device dm-4, logical block 1821127 Aug 27 16:20:14 atalanta kernel: [264255.670535] lost page write due to I/O error on dm-4 Aug 27 16:20:14 atalanta kernel: [264255.670545] Buffer I/O error on device dm-4, logical block 1821128 Aug 27 16:20:14 atalanta kernel: [264255.708064] lost page write due to I/O error on dm-4 Aug 27 16:20:14 atalanta kernel: [264255.708070] Buffer I/O error on device dm-4, logical block 1821129 Aug 27 16:20:14 atalanta kernel: [264255.745599] lost page write due to I/O error on dm-4 Aug 27 16:20:14 atalanta kernel: [264255.745608] Buffer I/O error on device dm-4, logical block 1821130 Aug 27 16:20:14 atalanta kernel: [264255.783080] lost page write due to I/O error on dm-4 Aug 27 16:20:14 atalanta kernel: [264255.783089] Buffer I/O error on device dm-4, logical block 1821131 Aug 27 16:20:14 atalanta kernel: [264255.820614] lost page write due to I/O error on dm-4 Aug 27 16:20:14 atalanta kernel: [264255.820622] Buffer I/O error on device dm-4, logical block 1821132 Aug 27 16:20:14 atalanta kernel: [264255.858158] lost page write due to I/O error on dm-4 Aug 27 16:20:14 atalanta kernel: [264255.858165] Buffer I/O error on device dm-4, logical block 1821133 Aug 27 16:20:14 atalanta kernel: [264255.895633] lost page write due to I/O error on dm-4 Aug 27 16:20:14 atalanta kernel: [264255.895639] Buffer I/O error on device dm-4, logical block 1821134 Aug 27 16:20:14 atalanta kernel: [264255.933099] lost page write due to I/O error on dm-4 Aug 27 16:20:14 atalanta kernel: [264255.933105] Buffer I/O error on device dm-4, logical block 1821135 Aug 27 16:20:14 atalanta kernel: [264255.970556] lost page write due to I/O error on dm-4 Aug 27 16:20:14 atalanta kernel: [264255.970562] Buffer I/O error on device dm-4, logical block 1821136 Aug 27 16:20:14 atalanta kernel: [264256.008010] lost page write due to I/O error on dm-4 Aug 27 16:20:14 atalanta kernel: [264256.061324] Aborting journal on device dm-4. Aug 27 16:20:14 atalanta kernel: [264256.088223] journal commit I/O error Aug 27 16:20:14 atalanta kernel: [264256.113352] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113383] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113401] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113405] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113409] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113413] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113418] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113422] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113425] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113477] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113541] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113547] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113557] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113565] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113583] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.113771] __journal_remove_journal_head: freeing b_committed_data Aug 27 16:20:14 atalanta kernel: [264256.114328] WARNING: at fs/buffer.c:1154 mark_buffer_dirty() Aug 27 16:20:14 atalanta kernel: [264256.114329] Aug 27 16:20:14 atalanta kernel: [264256.114330] Call Trace: Aug 27 16:20:14 atalanta kernel: [264256.114347] [mark_buffer_dirty+117/144] mark_buffer_dirty+0x75/0x90 Aug 27 16:20:14 atalanta kernel: [264256.114359] [_end+128966711/2129771920] :ext3:ext3_commit_super+0x57/0xa0 Aug 27 16:20:14 atalanta kernel: [264256.114364] [freeze_bdev+138/176] freeze_bdev+0x8a/0xb0 Aug 27 16:20:14 atalanta kernel: [264256.114374] [_end+128731225/2129771920] :dm_mod:dm_suspend+0x119/0x450 Aug 27 16:20:14 atalanta kernel: [264256.114379] [default_wake_function+0/16] default_wake_function+0x0/0x10 Aug 27 16:20:14 atalanta kernel: [264256.114383] [__up_write+49/368] __up_write+0x31/0x170 Aug 27 16:20:14 atalanta kernel: [264256.114390] [_end+128744272/2129771920] :dm_mod:dev_suspend+0x0/0x210 Aug 27 16:20:14 atalanta kernel: [264256.114396] [_end+128744602/2129771920] :dm_mod:dev_suspend+0x14a/0x210 Aug 27 16:20:14 atalanta kernel: [264256.114402] [_end+128742063/2129771920] :dm_mod:ctl_ioctl+0x1df/0x2e0 Aug 27 16:20:14 atalanta kernel: [264256.114406] [do_wp_page+964/1280] do_wp_page+0x3c4/0x500 Aug 27 16:20:14 atalanta kernel: [264256.114416] [do_ioctl+125/192] do_ioctl+0x7d/0xc0 Aug 27 16:20:14 atalanta kernel: [264256.114418] [vfs_ioctl+116/720] vfs_ioctl+0x74/0x2d0 Aug 27 16:20:14 atalanta kernel: [264256.114421] [sys_ioctl+149/176] sys_ioctl+0x95/0xb0 Aug 27 16:20:14 atalanta kernel: [264256.114425] [system_call+126/131] system_call+0x7e/0x83 Aug 27 16:20:14 atalanta kernel: [264256.114428] Aug 27 16:20:14 atalanta kernel: [264256.128587] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:64 Aug 27 16:21:16 atalanta kernel: [264318.014646] device-mapper: multipath: Failing path 8:64. Aug 27 16:21:16 atalanta kernel: [264318.014987] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:112 Aug 27 16:22:18 atalanta kernel: [264379.906125] device-mapper: multipath: Failing path 8:112. Aug 27 16:22:18 atalanta kernel: [264379.906159] printk: 22947 messages suppressed. Aug 27 16:22:18 atalanta kernel: [264379.906163] Buffer I/O error on device dm-4, logical block 0 Aug 27 16:22:18 atalanta kernel: [264379.940530] lost page write due to I/O error on dm-4 Aug 27 16:22:18 atalanta kernel: [264379.940562] ext3_abort called. Aug 27 16:22:18 atalanta multipathd: mysql: load table [0 41943040 multipath 0 1 rdac 1 1 round-robin 0 2 1 8:64 1000 8:112 1000] Aug 27 16:22:18 atalanta kernel: [264379.959340] EXT3-fs error (device dm-4): ext3_journal_start_sb: Detected aborted journal Aug 27 16:22:18 atalanta kernel: [264380.008425] Remounting filesystem read-only Aug 27 16:22:18 atalanta kernel: [264380.035388] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:112 Aug 27 16:23:20 atalanta kernel: [264441.923876] device-mapper: multipath: Failing path 8:112. Aug 27 16:23:20 atalanta multipathd: 8:112: mark as failed Aug 27 16:23:20 atalanta multipathd: mysql: remaining active paths: 1 Aug 27 16:23:20 atalanta kernel: [264441.924715] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:64 Aug 27 16:23:30 atalanta multipathd: 8:112: tur checker reports path is up Aug 27 16:23:30 atalanta multipathd: 8:112: reinstated Aug 27 16:23:30 atalanta multipathd: mysql: remaining active paths: 2 Aug 27 16:24:22 atalanta kernel: [264503.821431] device-mapper: multipath: Failing path 8:64. Aug 27 16:24:22 atalanta multipathd: 8:64: mark as failed Aug 27 16:24:22 atalanta multipathd: mysql: remaining active paths: 1 Aug 27 16:24:22 atalanta kernel: [264503.822237] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:112 Aug 27 16:24:32 atalanta multipathd: 8:64: tur checker reports path is up Aug 27 16:24:32 atalanta multipathd: 8:64: reinstated Aug 27 16:24:32 atalanta multipathd: mysql: remaining active paths: 2 Aug 27 16:25:24 atalanta kernel: [264565.701064] device-mapper: multipath: Failing path 8:112. Aug 27 16:25:24 atalanta kernel: [264565.701395] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:64 Aug 27 16:25:24 atalanta multipathd: 8:112: mark as failed Aug 27 16:25:24 atalanta multipathd: mysql: remaining active paths: 1 Aug 27 16:25:35 atalanta multipathd: 8:112: tur checker reports path is up Aug 27 16:25:35 atalanta multipathd: 8:112: reinstated Aug 27 16:25:35 atalanta multipathd: mysql: remaining active paths: 2 Aug 27 16:26:26 atalanta kernel: [264627.593359] device-mapper: multipath: Failing path 8:64. Aug 27 16:26:26 atalanta kernel: [264627.594446] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:112 Aug 27 16:26:26 atalanta multipathd: 8:64: mark as failed Aug 27 16:26:26 atalanta multipathd: mysql: remaining active paths: 1 Aug 27 16:26:37 atalanta multipathd: 8:64: tur checker reports path is up Aug 27 16:26:37 atalanta multipathd: 8:64: reinstated Aug 27 16:26:37 atalanta multipathd: mysql: remaining active paths: 2 Aug 27 16:27:28 atalanta kernel: [264689.478050] device-mapper: multipath: Failing path 8:112. Aug 27 16:27:28 atalanta kernel: [264689.478771] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:64 Aug 27 16:27:28 atalanta multipathd: 8:112: mark as failed Aug 27 16:27:28 atalanta multipathd: mysql: remaining active paths: 1 Aug 27 16:27:39 atalanta multipathd: 8:112: tur checker reports path is up Aug 27 16:27:39 atalanta multipathd: 8:112: reinstated Aug 27 16:27:39 atalanta multipathd: mysql: remaining active paths: 2 Aug 27 16:28:30 atalanta kernel: [264751.365417] device-mapper: multipath: Failing path 8:64. Aug 27 16:28:30 atalanta kernel: [264751.366649] device-mapper: multipath rdac: queueing MODE_SELECT command on 8:112 Aug 27 16:28:30 atalanta multipathd: 8:64: mark as failed Aug 27 16:28:30 atalanta multipathd: mysql: remaining active paths: 1 Aug 27 16:28:41 atalanta multipathd: 8:64: tur checker reports path is up Aug 27 16:28:41 atalanta multipathd: 8:64: reinstated Aug 27 16:28:41 atalanta multipathd: mysql: remaining active paths: 2 The last messages appear to be looping forever. The passive controller never goes active, and there's nothing in the logs on the array that indicates any attempt to move the volume. It seems like the hardware handler is simply broken... Another quite nasty thing is that even though I'm using queue_if_no_path, I/O errors still made it up to the file system layer, making it read-only. Isn't that exactly what queue_if_no_path is supposed to prevent? Regards -- Tore Anderson -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel