Race condition with mdadm at boot [still mystifying]

This is a bit long-winded, but I wanted to share some info ....

Regarding my earlier message about a possible race condition with mdadm, 
I have been doing all sorts of poking around with the boot process. 
Thanks to a tip from Steven Yellin at Stanford, I found where to add a 
delay in the rc.sysinit script, which invokes mdadm to assemble the arrays.

Unfortunately it didn't help, so it likely wasn't a race condition after 
all.

However, on close examination of dmesg, I found something very
interesting.  The 'bind<sd??>' messages were missing for one or the
other hot spare drive (or sometimes both).  These drives are connected
to the last PHYs in each SATA controller ... in other words, they are
the last devices probed by the driver for a particular controller.  It
would appear that the drivers are bailing out before managing to
enumerate all of the partitions on the last drive in a group, and which
partitions go missing appears to be quite random.
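
For anyone who wants to run the same check on their own box, here is a
minimal Python sketch of it.  It assumes the kernel logs array members
as "md: bind<sdXN>", and the sample log and partition names are made up
for illustration; substitute the members listed in your own mdadm.conf.

```python
import re

# Assumed message format: the md driver logs "md: bind<sdb1>" per member.
BIND_RE = re.compile(r"md:\s*bind<(\w+)>")


def bound_devices(dmesg_text):
    """Return the set of device names that appear in md bind messages."""
    return set(BIND_RE.findall(dmesg_text))


def missing_devices(dmesg_text, expected):
    """Return the expected members that have no bind message, sorted."""
    return sorted(set(expected) - bound_devices(dmesg_text))


if __name__ == "__main__":
    # Hypothetical sample log and member list, not taken from a real system.
    sample_log = "md: bind<sda1>\nmd: bind<sdb1>\nmd: bind<sdc2>\n"
    expected_members = ["sda1", "sdb1", "sdc1", "sdc2"]
    print("no bind message for:", missing_devices(sample_log, expected_members))
```

To use it on a live system, feed it the saved output of dmesg in place
of the sample string.  Comparing the result across several boots shows
whether the same partitions go missing each time.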

So it may or may not be a timing issue between the WD Caviar Black 
drives and both the LSI and Marvell SAS/SATA controller chips.

So, I replaced the two drives (SATA-300) with two faster drives 
(SATA-600) on the off chance they might respond fast enough before the 
drivers move on to other duties.  That didn't help either.

Each group of arrays uses a completely different driver (mptsas and
sata_mv), but both exhibit the same problem, so I'm mystified as to
where the real issue lies.  Anyone care to offer suggestions?

Chuck
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos

