On Fri, 02 Aug 2013 21:32:59 +0200 Martin Wilck <mwilck@xxxxxxxx> wrote: > Neil, > > This calls free() from monitor context - I am not certain if that's > allowed and if no, what alternative there might be. Yes, I don't really like that. And there is an 'fd' attached to the 'dl' which would need to be closed too. This relates to the "ddf: remove failed devices that are no longer in use ?!?" discussion we had earlier. That patch isn't really right ... as you noticed by trying to fix it. I think this fix is safer and more in the spirit of the original. It appears to fix the problem for me, and feels "right" as well:-) It only removed pd entries for devices that really have disappeared. If a device still exists in ->dlist, we don't remove the pde. I applied the other patches in the series - thank especially for the added test! Thanks, NeilBrown From 92939eb29175a0dc7c9c46ff70f95b76b693b796 Mon Sep 17 00:00:00 2001 From: NeilBrown <neilb@xxxxxxx> Date: Mon, 5 Aug 2013 14:56:23 +1000 Subject: [PATCH] DDF: fix removal of failed devices. Commit c7079c84 arrange for DDF to forget about any device that is failed and not still marked as part of any array. However such devices could still be part of the container and this removal and updating of 'pdnum' can result in multiple devices having the same pdnum. This in turn easily leads to confusion and corruption. So only discard pd entries for devices which are failed, not listed in any virtual device, and for which we don't have a handle on the device. pd entries will not get removed until a new device is added after the device has been removed from the container, either by "mdadm --remove" or by assembling without the failed devices. Reported-by: Albert Pauw <albert.pauw@xxxxxxxxx> Analysed-by: Martin Wilck <mwilck@xxxxxxxx> Signed-off-by: NeilBrown <neilb@xxxxxxx> diff --git a/super-ddf.c b/super-ddf.c index 20f4f25..b352a52 100644 --- a/super-ddf.c +++ b/super-ddf.c @@ -4635,13 +4635,19 @@ static void ddf_process_update(struct supertype *st, */ pd2 = 0; for (pdnum = 0; pdnum < be16_to_cpu(ddf->phys->used_pdes); - pdnum++) + pdnum++) { if (be16_and(ddf->phys->entries[pdnum].state, cpu_to_be16(DDF_Failed)) && be16_and(ddf->phys->entries[pdnum].state, - cpu_to_be16(DDF_Transition))) - /* skip this one */; - else if (pdnum == pd2) + cpu_to_be16(DDF_Transition))) { + /* skip this one unless in dlist*/ + for (dl = ddf->dlist; dl; dl = dl->next) + if (dl->pdnum == (int)pdnum) + break; + if (!dl) + continue; + } + if (pdnum == pd2) pd2++; else { ddf->phys->entries[pd2] = @@ -4651,6 +4657,7 @@ static void ddf_process_update(struct supertype *st, dl->pdnum = pd2; pd2++; } + } ddf->phys->used_pdes = cpu_to_be16(pd2); while (pd2 < pdnum) { memset(ddf->phys->entries[pd2].guid, 0xff,
Attachment:
signature.asc
Description: PGP signature