Re: Some md/mdadm bugs

Hello Neil
thanks for the reply

The version is:
mdadm - v3.1.4 - 31st August 2010
so it is indeed older than 3.1.5.
That's what ships in Ubuntu 11.10, the latest stable release; they are lagging behind.

I'll break the quotes to add a few comments --->

On 02/02/12 22:17, NeilBrown wrote:
.....
>> I am wondering (and this would be very serious) what happens if a new
>> drive is inserted and it takes the /dev/sda identifier!? Would MD start
>> writing or do any operation THERE!?
> Wouldn't happen.  As long as md holds onto the shell of the old sda nothing
> else will get the name 'sda'.

Great!
Indeed, this was what I *suspected*, based on the fact that newly added
drives got higher identifiers. It's good to hear it from a reliable source
though.

>> And here goes also a feature request:
>>
>> if a device is detached from the system (echo 1 > device/delete, or
>> removal via hardware hot-swap + AHCI), MD should detect this situation
>> and mark the device (and all its partitions) as failed in all arrays, or
>> even remove the device completely from the RAID.
> This needs to be done via a udev rule.
> That is why --remove understands names like "sda6" (no /dev).
>
> When a device is removed, udev processes the remove notification.
> The rule
>
>   ACTION=="remove", RUN+="/sbin/mdadm -If $name"
>
> in /etc/udev/rules.d/something.rules
>
> will make that happen.
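
For reference, a minimal sketch of installing that rule from a shell. The function name write_md_remove_rule, the file name 65-md-remove.rules used in the note below, and the extra SUBSYSTEM=="block" match are assumptions for illustration and are not part of the rule quoted above; the match is only there so that mdadm is not run for non-block devices.

```shell
#!/bin/sh
# Sketch: write the "remove" rule into a udev rules file.
# Usage: write_md_remove_rule /path/to/file.rules

write_md_remove_rule() {
    rules_file="$1"
    # The quoted here-document keeps $name literal, so udev (not the shell)
    # substitutes the kernel device name, e.g. "sda6".
    # SUBSYSTEM=="block" is an added assumption, see the text above.
    cat > "$rules_file" <<'EOF'
SUBSYSTEM=="block", ACTION=="remove", RUN+="/sbin/mdadm -If $name"
EOF
}
```

Run as root, something like `write_md_remove_rule /etc/udev/rules.d/65-md-remove.rules && udevadm control --reload-rules` should make udev pick the rule up.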

Oh great!

Will use that.

--incremental --fail ! I would never have thought of combining those.


>> In my case I have verified that MD did not realize the device was
>> removed from the system, and only much later when an I/O was issued to
>> the disk, it would mark the device as failed in the RAID.
>>
>> After the above is implemented, it could be an idea to actually allow a
>> new disk to take the place of a failed disk automatically if that would
>> be a "re-add" (probably the same failed disk is being reinserted by the
>> operator) and this even if the array is running, and especially if there
>> is a bitmap.
> It should do that, providing you have a udev rule like:
>   ACTION=="add", RUN+="/sbin/mdadm -I $tempnode"

I think I have this rule.
But it doesn't work, even from the command line, if the array is running,
as I wrote below --->

> You can even get it to add other devices as spares with e.g.
>    policy action=force-spare
>
> though you almost certainly don't want that general a policy.  You would
> want to restrict that to certain ports (device paths).
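
To illustrate the restriction to certain ports, here is a sketch that prints such a POLICY line for mdadm.conf. The function name md_spare_policy, the domain name "hotswap" and the by-path glob are hypothetical; path= is matched against names under /dev/disk/by-path, and this assumes an mdadm new enough to understand POLICY lines.

```shell
#!/bin/sh
# Sketch: print a POLICY line that limits force-spare to one set of ports.
# Usage: md_spare_policy DOMAIN PATH_GLOB

md_spare_policy() {
    domain="$1"
    path_glob="$2"
    # path= takes a glob over /dev/disk/by-path names, so only devices
    # plugged into matching ports are turned into spares.
    printf 'POLICY domain=%s path=%s action=force-spare\n' "$domain" "$path_glob"
}
```

For example, `md_spare_policy hotswap 'pci-0000:00:1f.2-scsi-*'` prints a line that could be appended to mdadm.conf (on Debian and Ubuntu that is usually /etc/mdadm/mdadm.conf).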

sure, I understand

>> Now it doesn't happen:
>> When I reinserted the disk, udev triggered the --incremental, to
>> reinsert the device, but mdadm refused to do anything because the old
>> slot was still occupied with a failed+detached device. I manually
>> removed the device from the raid then I ran --incremental, but mdadm
>> still refused to re-add the device to the RAID because the array was
>> running. If it is a re-add, and especially if the bitmap is active,
>> I can't think of a situation in which the user would *not* want
>> to do an incremental re-add even if the array is running.
> Hmmm.. that doesn't seem right.  What version of mdadm are you running?

3.1.4

Maybe a newer one would get this right.
I need to try...
I think I need that.
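
In the meantime, the manual sequence I have in mind, as a sketch only. It uses the "detached" keyword that mdadm accepts for --fail and --remove, followed by --re-add. The function name readd_member and the MDADM override (handy for a dry run with MDADM=echo) are my own additions, and /dev/md0 and /dev/sda6 below are placeholders.

```shell
#!/bin/sh
# Sketch: free the slot held by a vanished device, then re-add the disk.
# Usage: readd_member ARRAY MEMBER

MDADM="${MDADM:-mdadm}"

readd_member() {
    array="$1"
    member="$2"
    # Fail and drop any member whose underlying device has disappeared,
    # which frees the slot the stale entry was still holding.
    "$MDADM" "$array" --fail detached --remove detached
    # Put the reinserted device back; with a write-intent bitmap only the
    # blocks changed in the meantime should need resyncing.
    "$MDADM" "$array" --re-add "$member"
}
```

Calling `readd_member /dev/md0 /dev/sda6` as root would run both steps; with MDADM=echo set beforehand it only prints the two command lines.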

> Thanks for the reports.
Thank you for your reply.

Asdo