On Sun, 2011-03-27 at 18:35 -0700, NeilBrown wrote:
> On Thu, 24 Mar 2011 19:40:46 -0700 Dan Williams <dan.j.williams@xxxxxxxxx>
> wrote:
>
> > <context switch out of isci driver review mode>
>
> :-)
>
[..]
> > -	disk = get_imsm_disk(super, ord_to_idx(ord));
> > +	dl = get_imsm_dl_disk(super, ord_to_idx(ord));
>
> This sometimes return NULL, leading to bad stuff and mdmon crashing....
>
> So there is more to this than meets the eye...

Yes, (and I chalk this up to context switch latency), setting the index
to -2 is not correct as other paths need to be able to reference a valid
disk index until the failed device is removed via a rebuild.

> I'll stop trying this patch.

Ok, here is a proposed v2 on top of the latest devel-3.2, but I need to
play with it a bit more, and figure out what the spare migration test is
complaining about.

diff --git a/super-intel.c b/super-intel.c
index 6e12af2..e2f66aa 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -3993,7 +3993,7 @@ static int write_super_imsm(struct supertype *st, int doclose)
 
 	/* write the mpb for disks that compose raid devices */
 	for (d = super->disks; d ; d = d->next) {
-		if (d->index < 0)
+		if (d->index < 0 || is_failed(&d->disk))
 			continue;
 		if (store_imsm_mpb(d->fd, mpb))
 			fprintf(stderr, "%s: failed for device %d:%d %s\n",
@@ -5218,6 +5218,8 @@ static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
 	__u32 ord;
 	int slot;
 	struct imsm_map *map;
+	char buf[MAX_RAID_SERIAL_LEN+3];
+	unsigned int len, shift = 0;
 
 	/* new failures are always set in map[0] */
 	map = get_imsm_map(dev, 0);
@@ -5230,6 +5232,11 @@ static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
 	if (is_failed(disk) && (ord & IMSM_ORD_REBUILD))
 		return 0;
 
+	sprintf(buf, "%s:0", disk->serial);
+	if ((len = strlen(buf)) >= MAX_RAID_SERIAL_LEN)
+		shift = len - MAX_RAID_SERIAL_LEN + 1;
+	strncpy((char *)disk->serial, &buf[shift], MAX_RAID_SERIAL_LEN);
+
 	disk->status |= FAILED_DISK;
 	set_imsm_ord_tbl_ent(map, slot, idx | IMSM_ORD_REBUILD);
 	if (map->failed_disk_num == 0xff)
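
For anyone who wants to poke at the serial rewrite in the mark_failure()
hunk outside of mdadm, here is a minimal standalone sketch of the same
arithmetic: append ":0" to the serial and, if the result no longer fits,
drop characters from the front so the tail (including the ":0" tag)
survives. The helper name tag_failed_serial() is hypothetical, and
MAX_RAID_SERIAL_LEN is assumed to be 16 here; check super-intel.c for the
real value. Unlike the patch, the sketch bounds the read with "%.*s", on
the assumption that a serial filling the whole field has no terminating
NUL.

```c
#include <stdio.h>
#include <string.h>

/* Assumed value for this sketch; the real definition lives in mdadm. */
#define MAX_RAID_SERIAL_LEN 16

/*
 * Hypothetical helper (not part of mdadm): tag a serial as belonging to a
 * failed disk by appending ":0". When the tagged string would not fit,
 * leading characters are dropped so the final NUL still fits in the field.
 */
static void tag_failed_serial(unsigned char serial[MAX_RAID_SERIAL_LEN])
{
	char buf[MAX_RAID_SERIAL_LEN + 3];	/* serial + ":0" + NUL */
	size_t len, shift = 0;

	/* Read at most MAX_RAID_SERIAL_LEN bytes; the field may lack a NUL. */
	snprintf(buf, sizeof(buf), "%.*s:0",
		 MAX_RAID_SERIAL_LEN, (const char *)serial);

	len = strlen(buf);
	if (len >= MAX_RAID_SERIAL_LEN)
		shift = len - MAX_RAID_SERIAL_LEN + 1;

	/* At most MAX_RAID_SERIAL_LEN - 1 characters remain; strncpy pads with NULs. */
	strncpy((char *)serial, &buf[shift], MAX_RAID_SERIAL_LEN);
}
```

With a full 16-byte serial "ABCDEFGHIJKLMNOP" the field ends up holding
"DEFGHIJKLMNOP:0", and a short serial such as "WD-1234" simply becomes
"WD-1234:0".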