Re: "Enhanced" MD code available for review

On Friday March 19, gibbs@scsiguy.com wrote:
> [ CC trimmed since all those on the CC line appear to be on the lists ... ]
> 
> Lets take a step back and focus on a few of the points to which we can
> hopefully all agree:
> 
> o Any successful solution will have to have "meta-data modules" for
>   active arrays "core resident" in order to be robust.  This
>   requirement stems from the need to avoid deadlock during error
>   recovery scenarios that must block "normal I/O" to the array while
>   meta-data operations take place.

I agree.
'Linear' and 'raid0' arrays don't really need metadata support in the
kernel as their metadata is essentially read-only.
There are interesting applications for raid1 without metadata, but I
think that for all raid personalities where metadata might need to be
updated in an error condition to preserve data integrity, the kernel
should know enough about the metadata to perform that update.

It would be nice to keep the in-kernel knowledge to a minimum, though
some metadata formats probably make that hard.

> 
> o It is desirable for arrays to auto-assemble based on recorded
>   meta-data.  This includes the ability to have a user hot-insert
>   a "cold spare", have the system recognize it as a spare (based
>   on the meta-data resident on it) and activate it if necessary to
>   restore a degraded array.

Certainly.  It doesn't follow that the auto-assembly has to happen
within the kernel.  Having it all done in user-space makes it much
easier to control/configure.

I think the best way to describe my attitude to auto-assembly is that
it could be needs-driven rather than availability-driven.

needs-driven means: if the user asks to access an array that doesn't
  exist, then try to find the bits and assemble it.
availability-driven means: find all the devices that could be part of
  an array, and combine as many of them as possible together into
  arrays.
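
To make the distinction concrete, here is a minimal user-space sketch.
Everything in it is hypothetical and for illustration only: the Member
record, the device names and the array identifiers are not md or mdadm
structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    device: str      # hypothetical device name
    array_id: str    # array identity as recorded in the member's metadata


def assemble_needed(members, wanted_id):
    """Needs-driven: collect only the members of the array that was asked for."""
    return sorted(m.device for m in members if m.array_id == wanted_id)


def assemble_available(members):
    """Availability-driven: group every member found into as many arrays as possible."""
    arrays = {}
    for m in members:
        arrays.setdefault(m.array_id, []).append(m.device)
    return {array_id: sorted(devs) for array_id, devs in arrays.items()}


# Hypothetical scan result: one local array and one foreign array.
FOUND = [
    Member("/dev/sda1", "local-0"),
    Member("/dev/sdb1", "local-0"),
    Member("/dev/sdc1", "foreign-7"),
    Member("/dev/sdd1", "foreign-7"),
]

needed = assemble_needed(FOUND, "local-0")
everything = assemble_available(FOUND)
```

With this data the needs-driven call touches only the two members of
"local-0", while the availability-driven call also builds "foreign-7",
which nobody asked for.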

Currently filesystems are needs-driven.  At boot time, only the root
filesystem, which has been identified somehow, gets mounted.
Then the init scripts mount any others that are needed.
We don't have any hunting around for filesystem superblocks and
mounting the filesystems just in case they are needed.

Currently partitions are (sufficiently) needs-driven.  It is true that
any partitionable device has its partitions presented.  However, the
existence of partitions does not affect access to the whole device at
all.  Only once the partitions are claimed is the whole device
blocked.

Provided that auto-assembly of arrays works the same way
(needs-driven), I am happy for arrays to auto-assemble.
I happen to think this is most easily done in user-space.

With DDF format metadata, there is a concept of 'imported' arrays,
which basically means arrays from some other controller that have been
attached to the current controller.

Part of my desire for needs-driven assembly is that I don't want to
inadvertently assemble 'imported' arrays.
A DDF controller has NVRAM or a hardcoded serial number to help avoid
this.  A generic Linux machine doesn't.

I could possibly be happy with auto-assembly where a kernel parameter
of DDF=xx.yy.zz was taken to mean that we "need" to assemble all DDF
arrays that have a controller-id (or whatever it is) of xx.yy.zz.

This is probably simple enough to live entirely in the kernel.
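
The matching rule itself is small.  A rough sketch follows, written in
user-space Python purely to illustrate the rule, not as kernel code.
The DDF= parameter is only the proposal above, not an existing option,
and the array names and controller ids are made up.

```python
def ddf_controller_from_cmdline(cmdline):
    """Return the value of a 'DDF=' word on the command line, or None."""
    for word in cmdline.split():
        key, sep, value = word.partition("=")
        if key == "DDF" and sep and value:
            return value
    return None


def arrays_to_assemble(arrays, cmdline):
    """Pick only the arrays whose recorded controller id was asked for.

    'arrays' is a list of (array_name, controller_id) pairs, a
    hypothetical stand-in for whatever a metadata scan would return.
    """
    wanted = ddf_controller_from_cmdline(cmdline)
    if wanted is None:
        return []  # nothing was requested, so nothing is "needed"
    return [name for name, controller_id in arrays if controller_id == wanted]


# Hypothetical scan: two native arrays and one imported from elsewhere.
SCANNED = [("vol0", "01.02.03"), ("vol1", "01.02.03"), ("guest", "aa.bb.cc")]

chosen = arrays_to_assemble(SCANNED, "root=/dev/md0 DDF=01.02.03 ro")
```

Without a DDF= word nothing is assembled, and the imported "guest"
array is never picked up by accident.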

> 
> o Child devices of an array should only be accessible through the
>   array while the array is in a configured state (bd_claim'ed).
>   This avoids situations where a user can subvert the integrity of
>   the array by performing "rogue I/O" to an array member.

bd_claim doesn't and (I believe) shouldn't stop access from
user-space.
It does stop several kinds of access that expect to be exclusive.
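
As a toy model of that behaviour (this is not the kernel interface;
the class and method names are invented for the illustration): a
second exclusive claim is refused, while a plain open still goes
through.

```python
class ToyBlockDevice:
    """Toy model: one exclusive holder at a time, plain opens never refused."""

    def __init__(self, name):
        self.name = name
        self.holder = None  # current exclusive holder, if any

    def claim(self, holder):
        """Exclusive claim: fails if a different holder already has the device."""
        if self.holder is not None and self.holder != holder:
            raise PermissionError(f"{self.name} is already claimed by {self.holder}")
        self.holder = holder

    def release(self, holder):
        """Drop the claim, but only if 'holder' is the one that holds it."""
        if self.holder == holder:
            self.holder = None

    def open_plain(self):
        """Non-exclusive open: succeeds whether or not the device is claimed."""
        return f"opened {self.name}"


dev = ToyBlockDevice("member0")
dev.claim("md")                  # the array claims its member
plain_result = dev.open_plain()  # a plain open is still allowed

try:
    dev.claim("filesystem")      # a second exclusive user is refused
    second_claim_refused = False
except PermissionError:
    second_claim_refused = True
```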


But back to your original post:  I suspect there is lots of valuable
stuff in your emd patch, but as you have probably gathered, big
patches are not the way we work around here, and with good reason.

If you would like to identify isolated pieces of functionality, create
patches to implement them, and submit them for review I will be quite
happy to review them and, when appropriate, forward them to
Andrew/Linus.
I suggest you start with less controversial changes and work your way
forward.

NeilBrown