Re: possible bus loading problem during resync

Asdo <asdo@xxxxxxxxxxxxx> · Tue, 09 Mar 2010 12:00:17 +0100

Kristleifur Daðason wrote:
On Tue, Mar 9, 2010 at 6:31 AM, Timothy D. Lenz <tlenz@xxxxxxxxxx> wrote:

I'm working on 2 systems that are mainly for running vdr. I've had these
running somewhat for a while with raid. But a couple of nights ago as I was
quitting for the night, I noticed the drive light on one of the computers
staying on. I had just made some changes to xine and didn't know if something
had crashed. Turned on the TV and found the video was freezing for 10-20 secs
every 10-20 secs. Logging in using putty and winscp I found it very sluggish
to respond. Starting top I found it was doing the regular array check/resync...
--

Sorry about the incredibly brief answer: Not to dismiss other issues,
but that behavior seems like exactly what I've seen when a disk has
been failing.

If that is true, how does it happen? Is the driver hung? And in any case,
how can one slow disk make the whole machine sluggish when there is more
than one CPU core?

Try disabling NCQ by running echo 1 > /sys/block/sdX/device/queue_depth
(as root) for every drive, replacing sdX with each drive's name. After
that, each drive has at most one request outstanding at a time: the next
request is not issued until the drive has serviced the previous one.
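
In case it helps, the per-drive echo can be wrapped in a small loop. This
is only a sketch: the function name set_queue_depth and the SYSFS_ROOT
override are made up for illustration (the override just lets you try the
loop against a scratch directory first), and it assumes your drives show up
as sd* under /sys/block.

```shell
#!/bin/sh
# Sketch: write the same queue depth to every sd* drive.
# set_queue_depth and SYSFS_ROOT are hypothetical names, not part of any tool.
set_queue_depth() {
    depth="$1"
    root="${SYSFS_ROOT:-/sys}"
    for f in "$root"/block/sd*/device/queue_depth; do
        # Skip unmatched globs and files we are not allowed to write.
        [ -w "$f" ] || continue
        # Extract the drive name (e.g. sda) from the path.
        dev=${f#"$root"/block/}
        dev=${dev%%/*}
        # Print the old value so it can be written back later.
        old=$(cat "$f")
        echo "$depth" > "$f"
        echo "$dev: queue_depth $old -> $depth"
    done
}
```

Run "set_queue_depth 1" as root to limit every drive to one outstanding
request. The old value is printed for each drive, so you can write it back
once the test is over.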

Once that is done, I'd expect the sluggishness to disappear, at least
for SSH sessions that are not touching the disks. Then you can watch
"iostat -x 1": the bad drive will probably show a service time (svctm)
or await much worse than the others.
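
If eyeballing the iostat output is tedious, something like the following
could pick out the worst drive. Again just a sketch under assumptions:
worst_await is a made-up name, it only looks at devices named sd*, and it
finds the column by its header name, using "await" where present and
falling back to "r_await", which newer sysstat versions print instead.

```shell
#!/bin/sh
# Sketch: read "iostat -x" output on stdin and print the highest await
# seen per sd* device, worst first. worst_await is a hypothetical helper.
worst_await() {
    awk '
        # Header line: locate the await column by name.
        /^Device/ {
            col = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "await") col = i
                else if (!col && $i == "r_await") col = i
            }
            next
        }
        # Data line for an sd* device: remember the largest value seen.
        col && NF >= col && $1 ~ /^sd/ {
            v = $col + 0
            if (!($1 in max) || v > max[$1]) max[$1] = v
        }
        END { for (d in max) printf "%s %.2f\n", d, max[d] }
    ' | sort -k2,2nr
}
```

For example, "LC_ALL=C iostat -x 1 10 | worst_await" samples for ten
seconds and lists the drives by their peak await. Note that the first
iostat report is an average since boot, so a drive that only recently went
bad may not stand out until the later samples.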

Just guesses, correct me if I'm wrong.