Re: More ddf container woes

On Mon, 14 Mar 2011 10:00:17 +0100 Albert Pauw <albert.pauw@xxxxxxxxx> wrote:

>   Hi Neil,
> 
> thanks, yes, I noticed that with the new git code some of the problems are
> fixed now.
> 
> I noticed one more thing:
> 
> When I look at the end of the "mdadm -E /dev/md127" output I see that it
> lists the number of physical disks. When I fail a disk it is marked as
> "active/Offline, Failed", which is good. When I remove it, however, the
> number of physical disks reported by "mdadm -E" stays the same: the RefNo
> is still there, the Size is still there, the Device file is removed, and
> the state is still "active/Offline, Failed". The whole entry should be
> removed and the number of physical disks lowered by one.

Well... maybe.  Probably.

The DDF spec "requires" that there be an entry in the "physical disks"
table for every disk that is connected to the controller - whether failed
or not.
That makes some sense when you think about a hardware-RAID controller.
But how does that apply when DDF is running on a host system rather than
a RAID controller??
Maybe we should only remove them when they are physically unplugged??

There would probably be value in thinking through all of this a lot more
but for now I have arranged to remove any failed device that is not
part of an array (not even a failed part).
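
In concrete terms, the sequence in question is roughly the following
(device names are purely illustrative here, with md126 standing in for a
member RAID set and md127 for the container):

  # fail the disk in the member RAID set, then remove it from the container
  mdadm /dev/md126 --fail /dev/sdc
  mdadm /dev/md127 --remove /dev/sdc

  # the entry for the removed disk should now be gone from the
  # physical-disk table at the end of this output
  mdadm -E /dev/md127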

You can find all of this in my git tree.  I decided to back-port the
code from devel-3.2 which deletes devices from the DDF when you remove
them from a container - so you should find the code in the 'master'
branch works as well as that in 'devel-3.2'.

I would appreciate any more testing results that you come up with.
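
If anyone else wants to reproduce this, a throw-away DDF setup on loop
devices is enough - something along these lines, with file names and sizes
chosen arbitrarily:

  # create small backing files and attach them to loop devices
  for i in 0 1 2 3; do
      dd if=/dev/zero of=/var/tmp/ddf$i.img bs=1M count=128
      losetup /dev/loop$i /var/tmp/ddf$i.img
  done

  # create a DDF container, then a RAID5 set inside it
  mdadm --create /dev/md/ddf0 --metadata=ddf --raid-devices=4 /dev/loop[0-3]
  mdadm --create /dev/md/vol0 --level=5 --raid-devices=4 /dev/md/ddf0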



> 
> When I re-add the failed disk (without zeroing the superblock first) the
> state is still "active/Offline, Failed", but the disk is reused for
> resyncing a failed RAID set.
> 
> Assuming that the failed state of a disk is also recorded in the
> superblock on the disk itself, three different possibilities arise when
> adding a disk:
> 
> - A clean new disk is added: a new superblock is created with a new RefNo.
> - A failed disk is added: keep the failed state and RefNo.
> - A good disk is added, possibly from a good RAID set: keep its superblock
>   with the RefNo and status, and make it possible to reassemble that RAID
>   set when all of its disks have been added.

It currently seems to preserve the 'failed' state.  While that may
not be ideal, it is not clearly 'wrong' and can be worked around
by zeroing the metadata.
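
The workaround amounts to something like this before re-adding the disk
(again, the device name is only an example):

  # wipe the old DDF metadata so the disk is treated as a clean new disk
  mdadm --zero-superblock /dev/sdc
  mdadm /dev/md127 --add /dev/sdc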

So I plan to leave it as it is for the moment.

I hope to put a bit of time into sorting out some of these more subtle
issues next week - so you could well see progress in the future ...
especially if you have a brilliant idea about how it *should* work and manage
to convince me :-)



> 
> Thanks for the fixes so far,

And thank you for your testing.

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

