On 04/13/2010 05:28 AM, Tomáš Dulík wrote:
> Hi Doug,
>
> first of all: thanks for your work on hot-unplug!
> I am new to Linux RAID, have been using HW RAID before but after my LSI
> controller burned to ashes I decided I don't want to see HW RAID ... ever.
>
> First thing I found weird on Linux RAID was the missing support for dead
> device removal.
> I spent last 3 weeks trying to write various scripts for UDEV "remove"
> and mdadm "Fail" events handling, but finally I found the same thing
> like you - it is not possible to remove dead device from an array,
> because the events are issued too late. The only way to remove dead
> device is reboot, which is not what I would expect as solution in Linux
> world.
>
> So I downloaded your code from Neil's git
> (http://neil.brown.name/git?p=mdadm;a=shortlog;h=refs/heads/hotunplug)
> and also applied the "Minor incremental fixup" mentioned in your message
> below.
>
> The compiled mdadm works OK for normal operations (--fail, --remove,
> --add), but crashes with Segmentation fault for the "--incremental
> --fail" operation if I use it for a disk that I have just disconnected.
> Here is what I've got:
>
> # gdb --args ./mdadm -If sda3
> GNU gdb 6.8-debian
> This GDB was configured as "x86_64-linux-gnu"...
> (gdb) run
> Starting program: /root/mdadm-git/mdadm/mdadm -If sda3
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000040a796 in mdstat_by_component (name=0x7fff0d0aee83 "sda3")
> at mdstat.c:351
> 351 if (ent->metadata_version &&
> (gdb) where
> #0 0x000000000040a796 in mdstat_by_component (name=0x7fff0d0aee83
> "sda3") at mdstat.c:351
> #1 0x000000000042411c in IncrementalRemove (devname=0x7fff0d0aee83
> "sda3", verbose=0) at Incremental.c:867
> #2 0x00000000004075a7 in main (argc=3, argv=0x7fff0d0ad698) at
> mdadm.c:1545
>
> It does not matter if I use sda3 or sda, the result is the same.
> What am I doing wrong?

There was a thinko in Neil's patch that is fixed with the attached patch.

-- 
Doug Ledford <dledford@xxxxxxxxxx>
GPG KeyID: CFBFF194
http://people.redhat.com/dledford

Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband

From b937950110190ce00f16d91a3423a66fde080a95 Mon Sep 17 00:00:00 2001
From: Doug Ledford <dledford@xxxxxxxxxx>
Date: Tue, 13 Apr 2010 13:12:59 -0400
Subject: [PATCH 1/4] [hotunplug] we are testing mdstat, not ent which is undefined at this point

Signed-off-by: Doug Ledford <dledford@xxxxxxxxxx>
---
 mdstat.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mdstat.c b/mdstat.c
index 58d349d..3bb74fa 100644
--- a/mdstat.c
+++ b/mdstat.c
@@ -348,9 +348,9 @@ struct mdstat_ent *mdstat_by_component(char *name)
 	while (mdstat) {
 		struct dev_member *m;
 		struct mdstat_ent *ent;
-		if (ent->metadata_version &&
-		    strncmp(ent->metadata_version, "external:", 9) == 0 &&
-		    is_subarray(ent->metadata_version+9))
+		if (mdstat->metadata_version &&
+		    strncmp(mdstat->metadata_version, "external:", 9) == 0 &&
+		    is_subarray(mdstat->metadata_version+9))
 			/* don't return subarrays, only containers */
 			;
 		else for (m = mdstat->members; m; m = m->next) {
-- 
1.6.6.1
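
For anyone who wants to see the corrected check in isolation, here is a rough standalone sketch of the lookup, not the real mdadm code. The struct layouts are cut-down stand-ins, find_by_component() and is_subarray_stub() are hypothetical names, and the stub simply assumes a subarray's metadata string starts with "/" after the "external:" prefix. Unlike mdstat_by_component(), it does not free the entries it skips. The point is only that the test must dereference the list cursor (mdstat), never the not-yet-assigned ent.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-ins for mdadm's structures. */
struct dev_member {
	const char *name;
	struct dev_member *next;
};

struct mdstat_ent {
	const char *dev;
	const char *metadata_version;	/* may be NULL */
	struct dev_member *members;
	struct mdstat_ent *next;
};

/* Assumption for illustration: a subarray is written as "/container/index". */
static int is_subarray_stub(const char *vers)
{
	return vers[0] == '/';
}

/*
 * Return the first entry that has "name" as a member, skipping
 * external-metadata subarrays so that only containers (or native
 * arrays) are reported.  Every test uses the cursor "mdstat" itself.
 */
static struct mdstat_ent *find_by_component(struct mdstat_ent *mdstat,
					    const char *name)
{
	for (; mdstat; mdstat = mdstat->next) {
		struct dev_member *m;

		if (mdstat->metadata_version &&
		    strncmp(mdstat->metadata_version, "external:", 9) == 0 &&
		    is_subarray_stub(mdstat->metadata_version + 9))
			continue;	/* don't return subarrays, only containers */

		for (m = mdstat->members; m; m = m->next)
			if (strcmp(m->name, name) == 0)
				return mdstat;
	}
	return NULL;
}
```

With a list holding a subarray and its container that both list sda3, the sketch returns the container entry; a name that is in no array gives NULL.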