Oh duh! I'm sorry, here's the file.

> -----Original Message-----
> From: Jakob Østergaard [mailto:jakob@unthought.net]
> Sent: Friday, February 08, 2002 10:03 AM
> To: Cajoline
> Cc: linux-raid@vger.kernel.org
> Subject: Re: raidreconf
>
> On Fri, Feb 08, 2002 at 01:22:38AM +0200, Cajoline wrote:
> > Hello again Jakob :)
>
> Hello ! :)
>
> ...
> >
> > Actually, in that case the controller was a HighPoint HPT366, and it
> > was built into a custom-built 2.4.9 kernel, not as a module. I really
> > have no idea why it wouldn't work; the raid driver would just say it
> > couldn't find the partition on the drive that was moved from the
> > onboard IDE to the HPT controller.
>
> Ok, now I'm not a driver/md-interaction wizard, but I'd say it's a bug.
>
> Shouting "bug" at other people's code is easy (especially if they don't
> know where you live) ;)
>
> ...
> >
> > OK, I tried the chunk resizing; I have attached all the output from
> > raidreconf.
>
> No you haven't ;)
>
> I'd really love to see that output - can you re-send ?
>
> > I did something a little out of the ordinary: converted from a 16 KB
> > chunk-size to a 128 KB chunk-size. I don't know if this could have
> > affected what happened.
> > Is 128 KB a usable chunk-size? As far as file-size efficiency is
> > concerned, 128 KB is still ok; the target filesystem (the one I plan
> > to update soon) is reiserfs with a default 4k block size, but the
> > files are considerably large.
> > Is the conversion from 16k to 128k somewhat absurd? Crazy? Too much?
> > Impossible? :)
>
> It should be perfectly possible.
>
> >
> > Now, as to what happened. First of all, the memory/request numbers:
> >
> > Detected 224056 KB of physical memory in system
> > A maximum of 1436 outstanding requests is allowed
>
> Ok. So the memory detection is ok.
>
> >
> > I don't know how that translates into memory usage, but it was using
> > about 30 MB of memory most of the time, when I was checking on it.
> > Also, as you can see in the log, it says:
> >
> > I will SHRINK your old device /dev/md0 of 6863248 blocks
> > to a new device /dev/md0 of 6863240 blocks
> > using a block-size of 16 KB
>
> Aha !!
>
> 6863248 is a multiple of 4, but not a multiple of 128 !
>
> So the conversion to 128k chunks is actually going to shrink your array.
>
> >
> > The old and new raidtab were identical, apart from the chunk-size
> > setting. As you can see, the new device it tried to make is 8 blocks
> > smaller than the first one, and that is strange by itself, I think...
> > Could the reason be the chunk-size change?
>
> Yep, exactly.
>
> >
> > Then, at about 80% of the process, it ran out of memory. I started it
> > around 11 PM, and at 8 AM it was at about 60%. All this on the same
> > 110 GB md0 I had expanded with raidreconf the night before.
> > Unfortunately I wasn't monitoring the process at the time this
> > happened, and when I got back everything was normal, CPU load and
> > memory usage, and there were only these messages on the terminal. So
> > I don't really know what happened; however, the box wasn't running
> > anything else CPU- or memory-intensive, only raidreconf.
> >
>
> raidreconf shouldn't run out of memory - it has a fixed-size request
> list and is designed around that for exactly this reason.
>
> Ok - I wonder if it's because it's shrinking the array. That's something
> I haven't tested for a long time, and I don't know about the Quantum
> guys, but I would assume it's the "less interesting" case for most
> people.
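(To make sure I follow the rounding above, here's the effect in
miniature, as a Python sketch. This is just the idea, not raidreconf's
actual code - the real figure of 8 lost blocks presumably also depends
on the per-disk sizes and superblock reservation, which aren't in the
log.)

    # Simplified sketch, not raidreconf's actual rounding code: a device
    # can only use a whole number of chunks, so the usable size is
    # rounded down to a multiple of the chunk size. Sizes are in 1 KB
    # blocks, chunk sizes in KB.
    def usable(blocks, chunk_kb):
        return blocks - blocks % chunk_kb

    for chunk in (4, 16, 128):
        print(chunk, usable(6863248, chunk))
    # 4   -> 6863248 (multiple of 4: nothing lost)
    # 16  -> 6863248 (also a multiple of 16: the old array fits exactly)
    # 128 -> 6863232 (not a multiple of 128: the array must shrink)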
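(And on the fixed-size request list: this is how I picture a fixed
request budget derived from physical memory. The buffer size and the
half-of-RAM policy below are made up, since the log only shows the
resulting cap of 1436, not the real formula.)

    # Made-up illustration of a fixed request budget, not raidreconf's
    # real formula. The point is that the cap is computed once from
    # physical RAM, so memory use stays bounded no matter how large
    # the array is.
    PHYS_KB = 224056            # physical memory detected, from the log
    BUFFER_KB = 128             # hypothetical per-request buffer (one chunk)
    BUDGET_KB = PHYS_KB // 2    # hypothetical policy: use half of RAM
    MAX_REQUESTS = BUDGET_KB // BUFFER_KB
    print(MAX_REQUESTS)         # -> 875 with these made-up numbers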
> Now, if raidreconf will attempt to shrink an array, it should obviously
> be something that *works*. Or I could stop it from attempting to shrink
> arrays at all.
>
> I'd really like to see the output.
>
> >
> > Now, as for what you replied to my other e-mail, about raidreconf's
> > behavior if it comes across bad sectors and can't read from or write
> > to them properly, you are right; the best it could/should do is retry
> > a few times and, if that fails, just mark the block done and move on.
> > The filesystem will have some errors, but hopefully they will only
> > affect a very small number of blocks. If raidreconf aborts, then you
> > probably need to mkraid and start from scratch, all over again, with
> > a blank filesystem. So a filesystem with a few errors is obviously
> > much better than that, of course :)
>
> Yep - fsck cannot fix a half-done raidreconf run ;)
>
> >
> > That's all for now. Thanks again, Jakob and everyone, for your help
> > and for sharing your experience with raidreconf and these issues :)
> > Best regards,
> > Cajoline Leblanc
> > cajoline at chaosengine dot de
>
> Thank you, all,
>
> --
> ................................................................
> :   jakob@unthought.net   : And I see the elder races,         :
> :.........................: putrid forms of man                :
> :   Jakob Østergaard      : See him rise and claim the earth,  :
> :        OZ9ABN           : his downfall is at hand.           :
> :.........................:............{Konkhra}...............:
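P.S. Just so we mean the same thing by "retry a few times and move on",
here is a small Python sketch of that policy. Illustrative only, not
raidreconf's code; the block size and retry count are arbitrary, and a
real reshape would use different source and destination offsets.

    import os

    BLOCK = 16 * 1024   # arbitrary 16 KB blocks for the example

    # Illustrative "retry, then skip" policy: a block that stays
    # unreadable is logged and skipped, so one bad sector costs a few
    # blocks of data instead of aborting the whole run.
    def copy_block(src_fd, dst_fd, blockno, retries=3):
        offset = blockno * BLOCK
        for _ in range(retries):
            try:
                data = os.pread(src_fd, BLOCK, offset)
                os.pwrite(dst_fd, data, offset)
                return True
            except OSError:
                continue
        print("giving up on block", blockno)
        return False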
[Attachment: raidreconf-chunk-resize.log]