Re: raidreconf

Jakob Østergaard <jakob@unthought.net> · Fri, 8 Feb 2002 09:03:20 +0100

On Fri, Feb 08, 2002 at 01:22:38AM +0200, Cajoline wrote:
> Hello again Jakob :)

Hello !   :)

...
> 
> Actually, in that case the controller was a HighPoint HPT366 and it was
> built into a custom built 2.4.9 kernel, not as module. I really have no
> idea why it wouldn't work, the raid driver would just say it can't find
> the partition on that drive that was moved from the onboard ide to the
> hpt controller.

Ok, now I'm not a driver/md-interaction wizard, but I'd say it's a bug.

Shouting "bug" at other people's code is easy (especially if they don't
know where you live)  ;)

...
> 
> OK I tried the chunk resizing, I have attached all the output from
> raidreconf.

No you haven't  ;)

I'd really love to see that output - can you re-send ?

> I did something a little out of the ordinary: converted from
> a 16 kb chunk-size to 128 kb chunk-size. I don't know if this could have
> affected what happened.
> Is 128 kb a usable chunk-size? As far as file size efficiency is
> concerned, 128 kb is still ok, the target filesystem (the one I plan to
> update soon) is reiserfs with a default 4k block size, but the files are
> considerably large.
> Is the conversion from 16k to 128k somewhat absurd? Crazy? Too much?
> Impossible? :)

It should be perfectly possible.

> 
> Now, as to what happened. First of all, the memory/request numbers:
> 
> Detected 224056 KB of physical memory in system
> A maximum of 1436 outstanding requests is allowed

Ok.  So the memory detection is ok.

> 
> I don't know how that translates in memory usage, however it was using
> about 30 mb of memory most of the time, when I was checking on it.
> Also, as you can see in the log, it says:
> 
> I will SHRINK your old device /dev/md0 of 6863248 blocks
> to a new device /dev/md0 of 6863240 blocks
> using a block-size of 16 KB

Aha !!

6863248 is a multiple of 4, but not a multiple of 128 !

So the conversion to 128k chunks is actually going to shrink your array.
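
The arithmetic above can be checked with a few lines of Python. This is
only a sketch using the two sizes from the quoted log; the constant names
are made up for illustration, and it does not try to reproduce raidreconf's
own size calculation, which depends on per-disk sizes not shown in this
thread.

```python
# Sizes as reported in the raidreconf log quoted in this thread.
OLD_BLOCKS = 6863248
NEW_BLOCKS = 6863240

# Remainders of the old size with respect to the two factors mentioned.
rem_4 = OLD_BLOCKS % 4
rem_128 = OLD_BLOCKS % 128

# How much smaller the new device is, in blocks.
shrink = OLD_BLOCKS - NEW_BLOCKS

print(rem_4)    # 0  -> a multiple of 4
print(rem_128)  # 16 -> not a multiple of 128
print(shrink)   # 8
```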

> 
> The old and new raidtab were identical, apart from the chunk-size
> setting. As you can see, the new device it tried to make is 8 blocks
> smaller than the first one, and that is strange by itself, I think...
> Could the reason be the chunk-size change?

Yep, exactly.

> 
> Then at about 80% of the process, after about it ran out of memory. I
> started it around 11 PM, and at 8 AM it was at about 60%. All this on
> the same 110 GB md0 I had expanded with raidreconf the night before.
> Unfortunately I wasn't monitoring the process at the time this happened,
> and when I got back everything was normal, cpu load and mem usage, and
> there were only these messages on the terminal. So I don't really know
> what happened, however the box wasn't running anything else cpu or
> memory intensive, only raidreconf.
> 

raidreconf shouldn't run out of memory - it has a fixed-size request list and
is designed around it for exactly that reason.
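
To illustrate why a fixed-size request list keeps memory bounded, here is
a minimal sketch in Python. It is not raidreconf's code: the class
`BoundedRequestList` and its methods are hypothetical, and the "I/O" is
just a counter. The only number taken from the thread is the limit of 1436
outstanding requests.

```python
from collections import deque


class BoundedRequestList:
    """Hypothetical request list that never holds more than a fixed number
    of outstanding requests (illustration only, not raidreconf's code)."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding
        self.pending = deque()
        self.completed = 0
        self.peak = 0  # largest number of requests ever held at once

    def submit(self, request):
        # If the list is full, drain it before accepting another request.
        if len(self.pending) >= self.max_outstanding:
            self.flush()
        self.pending.append(request)
        self.peak = max(self.peak, len(self.pending))

    def flush(self):
        # A real tool would perform the reads and writes here.
        while self.pending:
            self.pending.popleft()
            self.completed += 1


requests = BoundedRequestList(max_outstanding=1436)
for block_no in range(10000):
    requests.submit(block_no)
requests.flush()
```

However many blocks are submitted, `peak` never exceeds the limit, so
memory use for the list does not grow with the size of the array.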

Ok - I wonder if it's because it's shrinking the array.  That's something
that I haven't tested for a long time, and I don't know about the Quantum
guys, but I would assume that it's the "less interesting" case for most
people.

Now, if raidreconf is going to attempt to shrink an array, obviously that
should be something that *works*.   Otherwise I could stop it from
attempting to shrink arrays.

I'd really like to see the output.

> 
> Now as for what you replied to my other e-mail, about raidreconf's
> behavior if it comes across bad sectors and can't read from or write to
> them properly, you are right; the best it could/should do is retry a few
> times and if it fails, just mark the block done and move on.
> The filesystem will have some errors, but hopefully they will only
> affect a very small number of blocks. If raidreconf aborts, then u
> probably need to mkraid and start from scratch, all over again, with a
> blank filesystem. So an erroneous fs is obviously much better than that
> of course :)

Yep - fsck cannot fix a half-done raidreconf run   ;)
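
The "retry a few times, then mark the block done and move on" behaviour
discussed above could look roughly like the sketch below. This describes
the idea, not what raidreconf currently does: `copy_block`, `copy_all`,
and the `read_block` / `write_block` callbacks are hypothetical names, and
the retry count of 3 is an arbitrary assumption.

```python
def copy_block(read_block, write_block, block_no, retries=3):
    """Try to copy one block; return True on success, False if every
    attempt failed with an I/O error."""
    for _ in range(retries):
        try:
            write_block(block_no, read_block(block_no))
            return True
        except OSError:
            continue  # try the same block again
    return False


def copy_all(read_block, write_block, n_blocks, retries=3):
    """Copy blocks 0..n_blocks-1 and return the list of block numbers
    that could not be copied, instead of aborting the whole run."""
    bad_blocks = []
    for block_no in range(n_blocks):
        if not copy_block(read_block, write_block, block_no, retries):
            bad_blocks.append(block_no)  # give up on this block, move on
    return bad_blocks
```

The returned list tells you which blocks to expect filesystem damage in,
which is still far better than a run that stops halfway.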

> 
> That's all for now. Thanks again Jakob and everyone for your help and
> sharing your experience with raidreconf and these issues :)
> Best regards,
> Cajoline Leblanc
> cajoline at chaosengine dot de

Thank you, all,

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............: