Hello,

I have encountered a strange bug when using an ext4 partition on top of an mdadm RAID5 array. The array has an internal bitmap, and ext4 is mounted with default options, which means barriers are enabled. When writing a large enough number of files, the system locks up indefinitely and only a hard reset helps. The lockup happens at seemingly random moments, yet consistently after several minutes of use. I have tried different configuration options, and it appears to happen only when both ext4 barriers and mdadm's internal bitmap are enabled. After disabling ext4 barriers for good (it seemed like the lesser evil), no lockups have occurred for 3 months. mdadm is version 3.1.3, the kernel is 2.6.32 (RHEL6).

Initially it happened on this configuration: 1 RAID0 of 2x1TB drives + 1 RAID0 of 2x1TB drives + 1x2TB drive. Each RAID0 array had one HDD on a PCI SATA (Silicon Image) controller and one on the internal ICH7 (Intel G41 chipset); the 2TB drive was on the PCI SATA controller.

Later I reproduced the same bug in a test setup with all partitions on a single drive. The following steps were taken (a rough command sketch is included at the end of this message):

1. Created 6 identical blank partitions (6 x 1 GB partitions on a single HDD).
2. Created 3 RAID0 arrays from these partitions: 0&1, 2&3, 4&5.
3. Created an MBR and a blank primary partition on each of these arrays using fdisk (this step is probably optional).
4. Created 1 RAID5 array from these 3 partitions with --bitmap=internal, everything else default.
5. Created an ext4 filesystem on the RAID5 with default options and mounted it with default options (barriers are enabled by default).
6. Rsynced several hundred ~1-20 MB files to the mounted directory.

/var/log/messages at the moment of the lockup:

kernel: INFO: task md90_raid5:13736 blocked for more than 120 seconds.
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: md90_raid5 D 0000000000000000 0 13736 2 0x00000080
kernel: ffff88003e0b7ab0 0000000000000046 ffff88006d273e60 ffff8800716b8240
kernel: ffff88003e0b7a50 ffffffff8123a553 ffff880072ea4800 0000000000000810
kernel: ffff88006f50ba98 ffff88003e0b7fd8 0000000000010518 ffff88006f50ba98
kernel: Call Trace:
kernel: [<ffffffff8123a553>] ? elv_insert+0x133/0x1f0
kernel: [<ffffffff810920ce>] ? prepare_to_wait+0x4e/0x80
kernel: [<ffffffff813d0535>] md_make_request+0x85/0x230
kernel: [<ffffffff81091de0>] ? autoremove_wake_function+0x0/0x40
kernel: [<ffffffff81241652>] ? generic_make_request+0x1b2/0x4f0
kernel: [<ffffffff81241652>] generic_make_request+0x1b2/0x4f0
kernel: [<ffffffff8106333a>] ? find_busiest_group+0x96a/0xb40
kernel: [<ffffffffa03d8d9d>] ops_run_io+0x22d/0x330 [raid456]
kernel: [<ffffffff813d1ef6>] ? md_super_write+0xd6/0xe0
kernel: [<ffffffffa03db9f5>] handle_stripe+0x4d5/0x22e0 [raid456]
kernel: [<ffffffff81059db2>] ? finish_task_switch+0x42/0xd0
kernel: [<ffffffffa03ddc9f>] raid5d+0x49f/0x690 [raid456]
kernel: [<ffffffff813d182c>] md_thread+0x5c/0x130
kernel: [<ffffffff81091de0>] ? autoremove_wake_function+0x0/0x40
kernel: [<ffffffff813d17d0>] ? md_thread+0x0/0x130
kernel: [<ffffffff81091a76>] kthread+0x96/0xa0
kernel: [<ffffffff810141ca>] child_rip+0xa/0x20
kernel: [<ffffffff810919e0>] ? kthread+0x0/0xa0
kernel: [<ffffffff810141c0>] ? child_rip+0x0/0x20

I have included all additional info and logs about the test setup in separate attachments. According to the logs it seems that the bug is in mdadm, but I'm not sure, since I haven't found any similar reports anywhere. It would be great if someone could try to reproduce it on their machine; it shouldn't take long, maybe 20 minutes or so...
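For convenience, here is roughly the command sequence for the test setup described above. Device names (/dev/sdb1-6 for the six 1 GB partitions, /dev/md1-3 for the RAID0 arrays, /dev/md90 for the RAID5) are placeholders, and the exact invocations may differ slightly from what I actually ran:

  # three 2-partition RAID0 arrays
  mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdb1 /dev/sdb2
  mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/sdb3 /dev/sdb4
  mdadm --create /dev/md3 --level=0 --raid-devices=2 /dev/sdb5 /dev/sdb6

  # (probably optional) MBR + one primary partition on each RAID0 via fdisk;
  # the partitions should then appear as /dev/md1p1, /dev/md2p1, /dev/md3p1

  # RAID5 with internal bitmap on top of those partitions
  mdadm --create /dev/md90 --level=5 --raid-devices=3 --bitmap=internal \
        /dev/md1p1 /dev/md2p1 /dev/md3p1

  # default ext4, default mount options (barriers enabled)
  mkfs.ext4 /dev/md90
  mount /dev/md90 /mnt/test

  # write load that triggers the hang within minutes
  rsync -a /path/to/many/1-20MB/files/ /mnt/test/

  # the workaround I have been running with for 3 months:
  # mount -o barrier=0 /dev/md90 /mnt/test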
Attachments:
  fdisk-info
  messages
  raid-conf
  raid-details
  raid-mdstat