On Tue, 9 Apr 2013 03:20:44 -1000 NightStrike <nightstrike@xxxxxxxxx> wrote:

> Neil,
>
> I have been talking to a lot of irc types, and ultimately, they all
> want me to just contact you directly (even people on
> oftc/#kernelnewbies).

The correct advice is to email linux-raid@xxxxxxxxxxxxxxx (you don't need to
be subscribed) and to optionally Cc me.
I've Cc:ed this reply to linux-raid.

>
> Basically, using mdadm with the Intel Matrix fakeraid doesn't work
> right anymore.  I can't tell you with what kernel it stopped, but I
> can tell you that it is broken with archlinux 3.8.5, which is a
> vanilla kernel plus a console logging patch.  Here's what happens:
>
> 1) Create a new raid in the Option Rom
> 2) Boot up and see md126 and md127, respectively ArchRaid_0 and imsm0.
> 3) Run all sorts of read only things, like mdadm --detail-platform, all good
> 4) Run cfdisk /dev/md/ArchRaid_0 (which points to md126), set up my partitions
> 5) Hit "Write" to write the partition table
> 6) Everything hangs

I've had this reported on openSUSE too.  I haven't yet had a chance to look
into it properly.

It sounds like "mdmon" is not running, or is not working correctly.
On the first attempt to write to the array, md signals mdmon and waits for
the array to be marked "active".
mdmon should notice this, update the metadata on the array to record that
the array is active (so that if a crash happens a resync will be forced), and
then tell md that the array is active.
md notices and allows the write.

Something in this sequence is not working.

Does everything "unhang" if you run

   mdmon md127 &

??

>
> At this point, any program that tries to do anything to the raid, or
> to even mount other volumes, will hang.  Here's a dmesg output of the
> blocked stack traces:
>
> http://sprunge.us/BBMM
>
> derRichard on #kernelnewbies looked at that and said that everything
> is stuck in __enqueue_entity, and that I have to contact the author of
> mdadm to find out why. :(
>
>
> Is there any way you can help?  What other info do you need?  And do
> you know how I can reboot gracefully?  Because right now, I'll only be
> able to do a hard reset or an REISUB, because those threads are
> forever blocked.

Hard reset (or "echo b > /proc/sysrq-trigger") is fine.  Nothing has been
written to those devices so on restart it will look just like it did before.

NeilBrown
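
The md/mdmon handshake described above can usually be observed from user space through the array's sysfs "array_state" file. What follows is only a minimal sketch in Python, not a tested diagnostic. It assumes the volume is md126 as in the report and that sysfs is mounted at /sys. The helper names read_array_state and diagnose are made up for illustration. The state names ("write-pending", "active", "active-idle") are the ones listed in the kernel's md documentation, where "write-pending" means writes are blocked until "active" is written.

```python
from pathlib import Path


def read_array_state(md_name, sysfs_root="/sys/block"):
    """Return the text of <sysfs_root>/<md_name>/md/array_state, or None if unreadable."""
    path = Path(sysfs_root) / md_name / "md" / "array_state"
    try:
        return path.read_text().strip()
    except OSError:
        # Missing device, missing file, or no permission to read it.
        return None


def diagnose(state):
    """Map an array_state value to a short human-readable hint."""
    if state is None:
        return "no array_state file found; is this an md device?"
    if state == "write-pending":
        # md is waiting for 'active' to be written, which is mdmon's job
        # for arrays with externally managed metadata.
        return "writes are blocked until 'active' is written; check that mdmon is running"
    if state in ("active", "active-idle"):
        return "array is marked active; writes should proceed"
    return f"array_state is {state!r}"


if __name__ == "__main__":
    # md126 is the volume name from the report; adjust for your system.
    print(diagnose(read_array_state("md126")))
```

If this reports "write-pending" while the writer is hung, that would be consistent with mdmon not completing its side of the sequence. The sysfs_root parameter exists only so the helper can be pointed at a scratch directory for testing.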