Re: Problem re-shaping RAID6

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 18 Jun 2010 14:43:58 -0400
Bill Davidsen <davidsen@xxxxxxx> wrote:

> Neil Brown wrote:
> > On Sun, 13 Jun 2010 15:15:08 -0400
> > Jérôme Poulin <jeromepoulin@xxxxxxxxx> wrote:
> >
> >   
> >> I had problems reshaping my RAID6 down 1 disk today and found a
> >> problem in Grow.c:
> >>
> >> diff -udpr mdadm-3.1.2/Grow.c mdadm-3.1.2-critical-section/Grow.c
> >> --- mdadm-3.1.2/Grow.c  2010-03-09 23:31:39.000000000 -0500
> >> +++ mdadm-3.1.2-critical-section/Grow.c 2010-06-13 14:57:44.000000000 -0400
> >> @@ -497,7 +497,7 @@ int Grow_reshape(char *devname, int fd,
> >>         int rv = 0;
> >>         struct supertype *st;
> >>
> >> -       int nchunk, ochunk;
> >> +       unsigned long nchunk, ochunk;
> >>         int nlayout, olayout;
> >>         int ndisks, odisks;
> >>         int ndata, odata;
> >>
> >>
> >> After changing this I was able to re-shape the array, it seems it was
> >> overflowing and I had a message saying:
> >> root@billsshack:~/mdadm-3.1.2/ > ./mdadm --grow /dev/md0
> >> --raid-devices=6 --backup-file=/mnt/data1/md-raid6-grow-backup.bak
> >> mdadm: Need to backup 4503599627350016K of critical section..
> >> mdadm: /dev/md0: Something wrong - reshape aborted
> >>
> >> Now it works:
> >> root@billsshack:~/mdadm-3.1.2/ > ./mdadm --grow /dev/md0
> >> --raid-devices=6 --backup-file=/mnt/data1/md-raid6-grow-backup.bak
> >> mdadm: Need to backup 20480K of critical section..
> >> root@billsshack:~/mdadm-3.1.2/ > cat /proc/mdstat
> >> Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> >> [faulty] [linear]
> >> md0 : active raid6 dm-6[0] dm-8[5] dm-5[4] dm-3[3] dm-4[2] dm-7[1]
> >>       7814041600 blocks super 0.91 level 6, 1024k chunk, algorithm 2
> >> [6/5] [UUUUUU]
> >>       [>....................]  reshape =  0.0% (33792/1953510400)
> >> finish=1926.0min speed=16896K/sec
> >>
> >>
> >> Here's a nice output of GDB:
> >>       Breakpoint 1, Grow_reshape (devname=0x7fffffffe390 "/dev/md0",
> >> fd=5, quiet=0,
> >>           backup_file=0x7fffffffe3b8
> >> "/mnt/data1/md-raid6-grow-backup.bak", size=1953510400, level=65534,
> >>           layout_str=0x0, chunksize=0, raid_disks=6) at Grow.c:939
> >>       939                     blocks = ochunk/512 * nchunk/512 * odata
> >> * ndata / a;
> >>       (gdb) p ochunk/512
> >>       $9 = 2048
> >>       (gdb) p ochunk/512 * nchunk/512
> >>       $10 = -4194304
> >>     
> >
> >
> > Thanks.
> > I fixed this bug a slightly different way
> >
> > -               blocks = ochunk/512 * nchunk/512 * odata * ndata / a;
> > +               blocks = (ochunk/512) * (nchunk/512) * odata * ndata / a;
> >
> >
> >   
> I don't have a current C standard handy, but I was on the original X3J11 
> committee, and I'm fairly sure that even with that grouping, which may 
> work fine for gcc, the standard allows the compiler to ignore what you 
> did and do the optimization as long as the result "is the same as" doing 
> what you said, in math terms rather than real world. And since other 
> compilers (such as Intel C) are used to compile the kernel and may be in 
> the future, and since disks are getting larger and faster, I do prefer 
> the future proofing of the original solution.

Seriously?  It sounds pretty broken to allow any transformation on fixed-size
integers that would valid on real numbers... (You did say "real world"...)
Even anything valid on integers is quite possibly not valid on fixed-sized
numbers.

But if I were to change it I don't think I would make the variables bigger.
I would do something like

  ochunk_sectors = ochunk / 512;
  nchunk_sectors = nchunk / 512;

  blocks = ochunk_sectors * nchunk_sectors * odata * ndata / a;

but that just makes it clearer to me that if the guarantees there are
different to the guarantees for what I originally proposed, then there is
something seriously wrong with the language.

Do you have any reference to any standard that I could look at and try to
understand?

Thanks.
NeilBrown


> 
> > See
> > http://neil.brown.name/git?p=mdadm;a=commitdiff;h=200871adf9e15d5ad985f28c349fd89c386ef48a
> >
> > I guess it is time to release a 3.1.3...
> >
> > Thanks,
> > NeilBrown
> >   
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux