On Tue, 7 Dec 2010 16:07:35 +0000 "Labun, Marcin" <Marcin.Labun@xxxxxxxxx>
wrote:

> >From 4bd19fb7b8a4258bf6cf34288be635bdb9af3dbe Mon Sep 17 00:00:00 2001
> From: Marcin Labun <marcin.labun@xxxxxxxxx>
> Date: Wed, 30 Nov 2010 03:55:18 +0100
> Subject: [PATCH 1/2] IMSM: Fix problem in mdmon monitor of using removed disk from in imsm container.
>
> Manager thread shall pass the information to monitor thread (mdmon)
> that some devices are removed from container. Otherwise, monitor (mdmon)
> might use such devices (spares) to rebuild the array that has gone degraded.
>
> This problem happens for imsm containers, since a list of the container disks
> is maintained in intel_super structure. When array goes degraded, the list is
> searched to find a spare disks to start rebuild.
> Without this fix the rebuild could be stared on the spare device that was
> a member of the container, but has been removed from it.
>
> New super type function handler has been introduced to prepare metadata
> format specific information about removed devices.
> int (*remove_from_super)(struct supertype *st, mdu_disk_info_t *dinfo,
> int fd);
> The message prepared in remove_from_super is later processed
> by proceess_update handler in monitor thread.

I don't like this.  There is unnecessary complexity.

Adding a disk and removing a disk are very different sorts of operations.
When adding a disk, you need to pass extra information about how the disk
might be used - whether it is already part of the array, or if it is a fresh
spare or whatever.
When removing a device there is none of that.  Just remove the device.

So when mdadm removes a device from a container it should
 - get a lock so mdmon won't assign the device as spare
 - check that the device is still a spare
 - remove the device from the container
 - unlock
 - ping mdmon

mdmon should notice that the device has gone and should update the metadata
accordingly.
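
The sequence above could be sketched roughly as follows.  This is only an
illustration under stated assumptions, not mdadm or mdmon code: struct
container, container_init(), mdadm_remove_spare() and mdmon_handle_ping() are
all hypothetical names, a pthread mutex stands in for whatever lock mdadm and
mdmon would really share between processes, and the "ping" is modelled as a
direct function call.  Pinging only after a successful removal is also an
assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are real mdadm/mdmon types. */
enum disk_state { DISK_ABSENT, DISK_SPARE, DISK_ACTIVE };

#define MAX_DISKS 4

struct container {
    pthread_mutex_t lock;            /* keeps mdmon from assigning spares */
    enum disk_state disk[MAX_DISKS]; /* what is actually in the container */
    bool in_metadata[MAX_DISKS];     /* what the metadata still records   */
};

void container_init(struct container *c)
{
    pthread_mutex_init(&c->lock, NULL);
    for (int i = 0; i < MAX_DISKS; i++) {
        c->disk[i] = DISK_ABSENT;
        c->in_metadata[i] = false;
    }
}

/*
 * mdmon side: run when pinged.  Notices disks that have gone and drops
 * them from the metadata.  Returns the number of entries dropped.
 */
int mdmon_handle_ping(struct container *c)
{
    int dropped = 0;

    pthread_mutex_lock(&c->lock);
    for (int i = 0; i < MAX_DISKS; i++) {
        if (c->in_metadata[i] && c->disk[i] == DISK_ABSENT) {
            c->in_metadata[i] = false;  /* metadata update for a gone disk */
            dropped++;
        }
    }
    pthread_mutex_unlock(&c->lock);
    return dropped;
}

/*
 * mdadm side: lock, check the disk is still a spare, remove it, unlock,
 * then ping mdmon.  Returns 0 on success, -1 if nothing was removed.
 */
int mdadm_remove_spare(struct container *c, int num)
{
    int rv = -1;

    if (num < 0 || num >= MAX_DISKS)
        return -1;

    pthread_mutex_lock(&c->lock);
    if (c->disk[num] == DISK_SPARE) {   /* still a spare? */
        c->disk[num] = DISK_ABSENT;     /* remove from the container */
        rv = 0;
    }
    pthread_mutex_unlock(&c->lock);

    if (rv == 0)
        (void)mdmon_handle_ping(c);     /* "ping mdmon" (assumed: only on success) */
    return rv;
}
```

Note that no update message travels from mdadm to mdmon in this sketch; the
removal path only changes the container and wakes the monitor, which then
works out for itself which disks have disappeared.
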
So you may still need a 'remove_from_super' method, but it will not send a
metadata update request to mdmon.  Rather it will be run by mdmon when it
notices the device is gone.
It is probably appropriate to pass an mdu_disk_info_t or maybe just a device
number.  I don't think there is any need to pass an 'fd'.

Does that approach seem OK to you?

Thanks,
NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html