Re: [PATCH v2] DM RAID: Add support for MD RAID10

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 17 Jul 2012 11:15:56 -0500 Brassow Jonathan <jbrassow@xxxxxxxxxx>
wrote:

> 
> On Jul 16, 2012, at 9:34 PM, NeilBrown wrote:
> 
> >>>> 
> >>>> @@ -536,9 +605,25 @@ static int parse_raid_params(struct raid
> >>>> 	if (dm_set_target_max_io_len(rs->ti, max_io_len))
> >>>> 		return -EINVAL;
> >>>> 
> >>>> -	if ((rs->raid_type->level > 1) &&
> >>>> -	    sector_div(sectors_per_dev, (rs->md.raid_disks - rs->raid_type->parity_devs))) {
> >>>> +	if (rs->raid_type->level == 10) {
> >>>> +		/* (Len * Stripes) / Mirrors */
> >>>> +		sectors_per_dev *= rs->md.raid_disks;
> >>>> +		if (sector_div(sectors_per_dev, raid10_copies)) {
> >>>> +			rs->ti->error = "Target length not divisible by number of data devices";
> >>>> +			return -EINVAL;
> >>>> +		}
> >>> 
> >>> I'm not entirely sure what you are trying to do here, but I don't think it
> >>> works.
> >>> 
> >>> At the very least you would need to convert the "sectors_per_dev" number to a
> >>> 'chunks_per_dev' before multiplying by raid_disks and dividing by copies.
> >>> 
> >>> But even that isn't necessary.  If you have a 3-device near=2 array with an
> >>> odd number of chunks per device, that will still work.  The last chunk won't
> >>> be mirrored, so won't be used.
> >>> Until a couple of weeks ago a recovery of the last device would trigger an
> >>> error when we try to recover that last chunk, but that is fixed now.
> >>> 
> >>> So if you want to impose this limitation (and there is some merit in making
> >>> sure people don't confuse themselves), I suggest it get imposed in the
> >>> user-space tools which create the RAID10.
> >> 
> >> I have seen what you do in calc_sectors(), I suppose I could bring that in.  However, I'm simply trying to find the value that will be preliminarily assigned to mddev->dev_sectors.  It is an enforced (perhaps unnecessary) constraint that the array length be divisible by the number of data devices.  I'm not accounting for chunk boundaries - I leave that to userspace to configure and MD to catch.
> > 
> > In that case I suggest you keep it out of dmraid.
> > 
> > It might make sense to check that the resulting array size matches what
> > user-space said to expect - or is at-least-as-big-as.  However I don't see
> > much point in other size changes there.
> 
> I'm not changing any sizes.  I'm finding a value for mddev->dev_sectors (what should I set it to if not the above?), which is the size of each device as computed from the total array size.  If the value can't be computed evenly, then an error is given.  I'm not sure what you are suggesting my alternative is when you say, "keep it out of dm-raid"...  I think this is the least I can do in dm-raid.

Sorry, I mean 'checks', not 'changes'.

I was confused a bit by the code.  It might be clearer to have

   sectors_per_dev = ti->len * rs->md.raid_disks;
   remainder = sector_div(sectors_per_dev, raid10_copies);

In almost all cases raid10_copies will be 2 and ti->len will be a multiple of
8 (as we don't support chunk sizes less than 4K), so remainder will be 0.
However this doesn't really prove anything.  It is still entirely possible
that the resulting 'sectors_per_dev' will not result in a RAID10 which has
the full ti->len sectors required.
So your test is not sufficient.

Also your calculation is incorrect as it doesn't allow for the possibility of
unused space in the array.
A 3-drive 2-copy RAID10 with an odd number of chunks will not use the last
chunk on the last device. In that case the above calculation will result in
a sectors_per_dev that it too small.

I think your sectors_per_dev calculation should round up to a whole number of
chunks.

   sectors_per_dev = DIV_ROUND_UP_SECTOR_T(ti->len * rs->md.raid_disks,
                                           raid10_copies);
   sectors_per_dev = round_up(sectors_per_dev, rs->md.chunk_sectors);

And then after calling md_run(), you should check that
 pers->size(mddev, 0, 0) == ti->len
and abort if not.

Note that getting the sectors_per_dev right is really really important for
RAID10-far.  For other layouts and levels a smaller sectors_per_dev will just
result in less data being available.
For raid10-far, sectors_per_dev is included in the layout calculations so
changing it will change the location of data and could result in corruption.

NeilBrown

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux