Guy <bugzilla@xxxxxxxxxxxxxxxx> wrote:

I generally agree with you, so I'm just gonna cite / reply to the points
where we don't :-).

> This sounded like Neil's current plan. But if I understand the plan, the
> drive would be kicked out of the array.

Yeah, that sounds bad. It should at least be marked as "degraded" in mdstat,
though, since there is basically no redundancy until the failed blocks have
been reassigned somehow.

> And 1000 bad blocks! I have never had 2 on the same disk at the
> same time. AFAIK. I would agree that 1000 would put a strain on the
> system!

Well, it happened to me on a Windows system, so I don't think it is
far-fetched. It was a desktop system with the case open, so it got bounced
around a lot. Every time the disk reached one of the faulty areas, it
recalibrated the head and then moved it back out to try the read again,
retrying the operation 5 times before giving up. While this was going on,
Windows was frozen. Each time I hit a bad area it took at least 3 seconds,
and I think even more.

If MD could keep reading from a disk while a similar scenario occurred, and
just mark the bad blocks for "rewriting" in some "bad block rewrite bitmap"
or whatever, a system hang could be avoided. Trying to rewrite every failed
sector sequentially in the code path that also reads the data would cause
exactly that kind of hang. That is what I tried to say originally, though I
probably didn't do a good job (I know little of Linux MD, guess it shows =)).
A rough sketch of the bitmap idea is at the end of this mail.

Of course, in the case of IDE, the disks would probably have to _not_ be in
a master/slave configuration, since the disk with failing blocks could
perhaps hog the bus. I know as little about ATA/IDE as I do about Linux MD,
so I'm basically just guessing here ;-).

> Sometime in the past I have said there should be a threshold on the number
> of bad blocks allowed. Once the threshold is reached, the disk should be
> assumed bad, or at least failing, and should be replaced.

Hm. Why? If a rewrite of the block succeeds and a subsequent read returns
the correct data, the block has been fixed. I can see your point on old
disks, where a magnetic problem might be causing the sector to fail, but on
a modern disk the sector has probably been relocated to the spare area. I
think the disk should only be failed when a rewrite-and-verify cycle still
fails. The threshold suggestion adds complexity and user configurability
(which is error-prone) to an area where it isn't really needed, doesn't it?

Another note: I'd like to see MD able to have a user-specifiable "bad block
relocation area", just like modern disks have, which it could use when a
disk's spare area fills up. (A sketch of what such a relocation table might
look like is also at the end of this mail.) I even thought up a use case at
one point that wasn't insane like "my disk is really starting to show a lot
of failures now, but I think I'll keep it running a bit longer", but I can't
quite remember what it was.

> Does anyone know how many spare blocks are on a disk?

It probably varies? I.e. crappy disks probably have a much too small area
;-). In that case it would be very nice if MD had an option to specify its
own relocation area (and perhaps even a recommendation for the user on how
to set it for specific hard disks). But OTOH, it sucks to implement features
in MD that would be much easier to solve in the disks themselves by just
expanding the spare area (when present).

> My worse disk has 28 relocated bad blocks.

That doesn't sound bad. Isn't there a SMART value that shows how big a
percentage of the spare area is used (0-255)?
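To make the bitmap idea a bit more concrete, here is a minimal userland
sketch in plain C. Nothing in it is real MD code, all names and structures
are made up for illustration. It only shows the division of labour I mean:
the read path does nothing more expensive than setting a bit (the data
itself would come from the mirror/parity), while a separate pass later does
the slow rewrite-and-verify and only gives up on the disk if that still
fails:

/*
 * Hypothetical sketch of a "bad block rewrite bitmap" -- NOT actual MD
 * code.  The read path only marks blocks; the rewrite work is deferred
 * to a background pass so reads never stall behind drive retries.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_BLOCKS  (1u << 20)   /* blocks tracked per member disk (made up) */
#define BITS_PER_LONG (8 * sizeof(unsigned long))

struct rewrite_bitmap {
    unsigned long bits[NR_BLOCKS / (8 * sizeof(unsigned long))];
};

static void mark_bad(struct rewrite_bitmap *bm, uint32_t block)
{
    bm->bits[block / BITS_PER_LONG] |= 1UL << (block % BITS_PER_LONG);
}

static int is_bad(const struct rewrite_bitmap *bm, uint32_t block)
{
    return (bm->bits[block / BITS_PER_LONG] >> (block % BITS_PER_LONG)) & 1UL;
}

/* Read path: on a media error, just note the block and serve the data
 * from redundancy -- no synchronous rewrite here. */
static void read_failed(struct rewrite_bitmap *bm, uint32_t block)
{
    mark_bad(bm, block);
}

/* Stand-ins for real I/O; here they just simulate success. */
static int rewrite_from_redundancy(uint32_t block) { (void)block; return 0; }
static int read_back_and_verify(uint32_t block)    { (void)block; return 0; }

/* Background pass: rewrite each marked block and verify it.  Only if the
 * rewrite-and-verify cycle still fails would the member disk be failed. */
static void rewrite_pass(struct rewrite_bitmap *bm)
{
    for (uint32_t b = 0; b < NR_BLOCKS; b++) {
        if (!is_bad(bm, b))
            continue;
        if (rewrite_from_redundancy(b) == 0 && read_back_and_verify(b) == 0)
            bm->bits[b / BITS_PER_LONG] &= ~(1UL << (b % BITS_PER_LONG));
        else
            printf("block %u: rewrite failed, fail the disk\n", b);
    }
}

int main(void)
{
    struct rewrite_bitmap *bm = calloc(1, sizeof(*bm));
    if (!bm)
        return 1;
    read_failed(bm, 12345);      /* pretend a read hit a bad sector */
    rewrite_pass(bm);
    printf("block 12345 still marked bad: %d\n", is_bad(bm, 12345));
    free(bm);
    return 0;
}

In real MD this would obviously have to be per-member, persistent across
reboots and properly locked; the sketch ignores all of that on purpose.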
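And here, in the same spirit, is what I mean by a user-specifiable "bad
block relocation area": a reserved range of spare sectors plus a small
lookup table, with the disk only being failed once that area itself runs
out. The sector numbers, sizes and names are all invented; the linear
lookup is just to show the mapping, not something you'd want for real:

/*
 * Hypothetical sketch of an MD-level relocation table -- again, not real
 * MD code, just the data structure I have in mind.
 */
#include <stdint.h>
#include <stdio.h>

#define SPARE_START   1000000u   /* first sector of the reserved area (made up) */
#define SPARE_COUNT   1024u      /* user-specified size of the area */

struct reloc_entry {
    uint64_t bad;     /* original sector that keeps failing */
    uint64_t spare;   /* sector in the relocation area now holding the data */
};

static struct reloc_entry table[SPARE_COUNT];
static unsigned int used;

/* Allocate the next spare sector for a bad one; returns 0 when the
 * relocation area itself is exhausted (time to fail the disk for real). */
static uint64_t relocate(uint64_t bad_sector)
{
    if (used == SPARE_COUNT)
        return 0;
    table[used].bad = bad_sector;
    table[used].spare = SPARE_START + used;
    return table[used++].spare;
}

/* Every I/O would consult the table first and be redirected if needed. */
static uint64_t map_sector(uint64_t sector)
{
    for (unsigned int i = 0; i < used; i++)
        if (table[i].bad == sector)
            return table[i].spare;
    return sector;
}

int main(void)
{
    relocate(4711);   /* pretend sector 4711 could not be rewritten in place */
    printf("sector 4711 now maps to %llu\n",
           (unsigned long long)map_sector(4711));
    printf("sector 4712 still maps to %llu\n",
           (unsigned long long)map_sector(4712));
    return 0;
}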