multipath bug

Hello,

I have been fighting a RHEL 6.2 failover problem that I hit during rolling upgrades, and I was wondering whether anyone else has seen it. When IO paths are lost, the initiator locks up (ssh sessions hang, etc.), and I see the following in syslog:

Aug  1 15:10:15 ashe kernel: INFO: task simpled:15450 blocked for more than 120 seconds.
Aug  1 15:10:15 ashe kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  1 15:10:15 ashe kernel: simpled D 0000000000000007 0 15450 15424 0x00000080
Aug  1 15:10:15 ashe kernel: ffff880405589a98 0000000000000082 0000000000000000 ffffffffa00041fc
Aug  1 15:10:15 ashe kernel: ffff880406696378 ffff880409fb4400 0000000000000001 000000000000000c
Aug  1 15:10:15 ashe kernel: ffff880405f93058 ffff880405589fd8 000000000000fb88 ffff880405f93058
Aug  1 15:10:15 ashe kernel: Call Trace:
Aug  1 15:10:15 ashe kernel: [<ffffffffa00041fc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Aug  1 15:10:15 ashe kernel: [<ffffffff814fe0f3>] io_schedule+0x73/0xc0
Aug  1 15:10:15 ashe kernel: [<ffffffff811b676e>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
Aug  1 15:10:15 ashe kernel: [<ffffffff811b6c5e>] __blockdev_direct_IO+0x5e/0xd0
Aug  1 15:10:15 ashe kernel: [<ffffffff811b3510>] ? blkdev_get_blocks+0x0/0xc0
Aug  1 15:10:15 ashe kernel: [<ffffffff811b4377>] blkdev_direct_IO+0x57/0x60
Aug  1 15:10:15 ashe kernel: [<ffffffff811b3510>] ? blkdev_get_blocks+0x0/0xc0
Aug  1 15:10:15 ashe kernel: [<ffffffff81114e62>] generic_file_direct_write+0xc2/0x190
Aug  1 15:10:15 ashe kernel: [<ffffffff81116675>] __generic_file_aio_write+0x345/0x480
Aug  1 15:10:15 ashe kernel: [<ffffffff811b4e00>] ? blkdev_open+0x0/0xc0
Aug  1 15:10:15 ashe kernel: [<ffffffff811b3b0c>] blkdev_aio_write+0x3c/0xa0
Aug  1 15:10:15 ashe kernel: [<ffffffff8117ae9a>] do_sync_write+0xfa/0x140
Aug  1 15:10:15 ashe kernel: [<ffffffff8118c2f0>] ? do_filp_open+0x780/0xd60
Aug  1 15:10:15 ashe kernel: [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
Aug  1 15:10:15 ashe kernel: [<ffffffff81213266>] ? security_file_permission+0x16/0x20
Aug  1 15:10:15 ashe kernel: [<ffffffff8117b198>] vfs_write+0xb8/0x1a0
Aug  1 15:10:15 ashe kernel: [<ffffffff810d6b12>] ? audit_syscall_entry+0x272/0x2a0
Aug  1 15:10:15 ashe kernel: [<ffffffff8117bbb1>] sys_write+0x51/0x90
Aug  1 15:10:15 ashe kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Aug  1 15:12:02 ashe init: tty (/dev/tty1) main process (2774) killed by TERM signal
Aug  1 15:12:03 ashe avahi-daemon[2291]: Got SIGTERM, quitting.
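
For anyone who wants to check their own logs for the same symptom, here is a rough sketch that pulls hung-task reports out of syslog lines. It is only an illustration: the regular expression is derived from the "blocked for more than 120 seconds" message shown above, and the function name find_hung_tasks is made up for this example, not part of any existing tool.

```python
import re

# Pattern based on the kernel hung-task report line in the log above, e.g.
# "... kernel: INFO: task simpled:15450 blocked for more than 120 seconds."
# (Assumption: other kernels may word this message differently.)
HUNG_TASK = re.compile(
    r"kernel: INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return a list of (command, pid, seconds) for each hung-task report."""
    hits = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            hits.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits
```

It accepts any iterable of lines, so an open file object for the syslog file (for example /var/log/messages on RHEL) can be passed in directly.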

I have updated the following packages to the latest versions available from Red Hat, but the problem still persists:

device-mapper-1.02.74-10.el6.x86_64
device-mapper-multipath-0.4.9-56.el6_3.1.x86_64
kernel-2.6.32-279.2.1.el6.x86_64
lvm2-2.02.95-10.el6.x86_64

Does anyone have any suggestions/workarounds? I am looking at the source myself but I am not familiar with dm.

Please advise.
Karan


--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel

