On Apr 23, 2009, at 1:51 AM, Luca Berra wrote:
> On Wed, Apr 22, 2009 at 09:20:49PM -0400, Doug Ledford wrote:
>> On Apr 22, 2009, at 12:06 PM, Andrew Burgess wrote:
>>> On Tue, 2009-04-21 at 20:15 +0200, Piergiorgio Sartor wrote:
>>>> This might be a Fedora 10 issue, so maybe Doug would like to comment. After reboot, someone, I guess udev, tries to automagically start a RAID, so it assembles /dev/md_d127 with one of the two components of /dev/md/boot (randomly, it seems). Later, when /dev/md/boot is assembled, one drive is "busy", because it belongs to /dev/md_d127, and the array is put together degraded, i.e. with the other disk only.
>>>
>>> Just a "me too". I also started seeing this after upgrading to fedora10. I had to create a startup script to stop md_d0 and reassemble everything else.
>>
>> Yeah, I found the cause for this while working on F11. The problem is a race condition between udev and a call to mdadm -As in the rc.sysinit. For F11, I solved this by making udev not process devices using incremental mode if we are still in the rc.sysinit script. You can change /etc/udev/rules.d/70-mdadm.rules (I think that's the right name, it might be slightly off) to read something like this:
>>
>> # This file causes block devices with Linux RAID (mdadm) signatures to
>> # automatically cause mdadm to be run.
>> # See udev(8) for syntax
>>
>> SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", \
>>     IMPORT{program}="/sbin/mdadm --examine --export $tempnode", \
>>     RUN+="/bin/bash -c '[ ! -f /dev/.in_sysinit ] && mdadm -I $env{DEVNAME}'"
>
> i believe i saw this as well, but not at startup, it was when i manually run mdadm -As, so your hack to prevent udev from assembling devices while in sysinit may not be a full solution.
No, it is. In your situation, the rules line must have read ACTION=="add|change". Because that version of the incremental assembly rule also watches change events, it fires whenever mdadm opens a device to scan for a superblock and then closes it (yes, just opening and closing the device special file will trigger a change event), and the resulting incremental assembly then races with the mdadm that is using the device for its own purposes.

The rule above does not watch change events, only add events. Those happen only once, when the device is added, not when mdadm scans the devices looking for superblocks. Any time mdadm races with itself, with one instance trying to assemble and another trying to do incremental assembly, you get split arrays with neither one started.
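For readers following along, the decision the rule above makes can be sketched as a small shell function. This is only an illustration under stated assumptions, not part of udev or mdadm: the helper name handle_block_event and the overridable IN_SYSINIT_FLAG and MDADM variables are made up here so the logic can be tried without root. Only the /dev/.in_sysinit path and the mdadm -I call come from the rule itself.

```shell
#!/bin/sh
# Sketch only: mirrors the two checks made by the udev rule quoted in this thread.
# IN_SYSINIT_FLAG and MDADM are hypothetical overrides for experimenting safely.
IN_SYSINIT_FLAG="${IN_SYSINIT_FLAG:-/dev/.in_sysinit}"
MDADM="${MDADM:-mdadm}"

# handle_block_event ACTION DEVNAME   (hypothetical helper, not a udev API)
handle_block_event() {
    action="$1"
    devname="$2"

    # Only "add" events qualify. A rule matching "add|change" would also fire
    # when another mdadm opens and closes the device node while scanning.
    if [ "$action" != "add" ]; then
        echo "ignored $action event for $devname"
        return 0
    fi

    # Same test as the rule: stay out of the way while rc.sysinit is running.
    if [ -f "$IN_SYSINIT_FLAG" ]; then
        echo "skipped $devname: rc.sysinit still running"
        return 0
    fi

    # Incremental assembly, as in the rule's RUN+= command.
    "$MDADM" -I "$devname"
}
```

With the flag file present, or with a change event, the function never reaches mdadm -I, which is the behaviour the add-only rule is meant to give.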
> my solution was "rm -f /etc/udev/rules.d/70-mdadm.rules", works like a charm :P
> probably the best solution is preventing concurrent mdadm rules with a lock.
>
> Regards,
> L.
>
> --
> Luca Berra -- bluca@xxxxxxxxxx
> Communication Media & Services S.r.l.
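The lock idea mentioned above could look roughly like the following, using flock(1) from util-linux. This is a sketch under assumptions, not something mdadm or the Fedora scripts are known to do: the lock path and the with_mdadm_lock helper name are hypothetical, and a blocking lock inside a udev RUN command may run into udev's own timeouts.

```shell
#!/bin/sh
# Sketch only: serialise mdadm invocations behind a single lock file.
# MDADM_LOCK is a hypothetical path; mdadm itself is not known to use it.
MDADM_LOCK="${MDADM_LOCK:-/var/lock/mdadm-assemble.lock}"

# with_mdadm_lock COMMAND [ARGS...]   (hypothetical helper)
# Waits until the lock is free, runs COMMAND while holding it, and releases
# the lock when COMMAND exits. COMMAND must be an executable, not a shell
# function, because flock(1) executes it directly.
with_mdadm_lock() {
    flock "$MDADM_LOCK" "$@"
}

# Intended use (hypothetical):
#   with_mdadm_lock mdadm -As             # from the init script
#   with_mdadm_lock mdadm -I "$DEVNAME"   # from the udev RUN command
```

If both the init script and the udev rule went through the same wrapper, the two mdadm instances would run one after the other instead of racing for the same member devices.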
--
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: CFBFF194
    http://people.redhat.com/dledford

InfiniBand Specific RPMS
    http://people.redhat.com/dledford/Infiniband