Re: MD Raid1 hangs system on read error (3.10)

On Sun, 5 Oct 2014 11:27:01 +0200 Matthijs Kooijman <matthijs@xxxxxxxx> wrote:

> Hey folks,
> 
> a few times now I've found my system being locked up after a read error
> from a hard disk. It looks like the MD code that handles the read error
> messes up and causes a GPF.
> 
> After this happens, the system becomes completely unresponsive - it
> responds to ping and opens TCP connections, but no data comes out. The
> serial console also gives no response.
> 
> After rebooting, the disk in question showed a pending sector. In the
> most recent occurrence, I found that the array was also resyncing. I'm
> not sure if this also happened in the earlier occurrences (but since the
> pending sector didn't disappear in the next day, I'd expect no resync
> happened before). The read errors always happened during a routine check
> of the array.

That last sentence is the clue I needed - thanks.

You are running 3.10.3.  It contains a bug that was fixed in 3.10.32.

http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=9f2d289933e60ec726a7a9522e2dcdfdc82c58de

So if you upgrade your kernel, the problem should be gone.
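
For anyone wanting to check a machine against the versions mentioned above, here is a minimal sketch that compares a kernel release string with 3.10.32. It only makes a claim about the 3.10.y stable series discussed in this thread; for any other series it returns None rather than guessing. The function names (parse_release, needs_upgrade) are made up for illustration and are not part of any existing tool.

```python
import platform
import re

# Stable release named in this thread as containing the fix.
FIXED_IN = (3, 10, 32)


def parse_release(release):
    """Return (major, minor, patch) from a string such as '3.10.3-amd64'."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    # A missing patch level (e.g. '3.10') is treated as 0.
    return (int(m.group(1)), int(m.group(2)), int(m.group(3) or 0))


def needs_upgrade(release):
    """True or False for 3.10.y kernels; None for any other series,
    which this thread says nothing about (assumption: do not guess)."""
    version = parse_release(release)
    if version[:2] != FIXED_IN[:2]:
        return None
    return version < FIXED_IN


if __name__ == "__main__":
    # Check the kernel this script is running on.
    running = platform.release()
    print(running, needs_upgrade(running))
```

On the system described above, needs_upgrade("3.10.3") is True, so the upgrade applies.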

NeilBrown
