Re: [MD] Crash with 4.12+ kernel and high disk load -- bisected to 4ad23a976413: MD: use per-cpu counter for writes_pending

I will apply this to my home server this evening (BST) and set off a check. Will have results tomorrow.

Thanks for the fix!

David


Quoting NeilBrown <neilb@xxxxxxxx>:

On Mon, Aug 07 2017, Dominik Brodowski wrote:

Neil, Shaohua,

following up on David R's bug report: I have observed something similar
on v4.12.[345] and v4.13-rc4, but not on v4.11. This is a RAID1 (on bare
metal partitions, /dev/sdaX and /dev/sdbY linked together). In case it
matters: Further upwards are cryptsetup, a DM volume group, then logical
volumes, and then filesystems (ext4, but also happened with xfs).

In a tedious bisect (the bug wasn't as quickly reproducible as I would like,
but happened when I repeatedly created large lvs and filled them with some
content, while compiling kernels in parallel), I was able to track this
down to:


commit 4ad23a976413aa57fe5ba7a25953dc35ccca5b71
Author: NeilBrown <neilb@xxxxxxxx>
Date:   Wed Mar 15 14:05:14 2017 +1100

    MD: use per-cpu counter for writes_pending

    The 'writes_pending' counter is used to determine when the
    array is stable so that it can be marked in the superblock
    as "Clean".  Consequently it needs to be updated frequently
    but only checked for zero occasionally.  Recent changes to
    raid5 cause the count to be updated even more often - once
    per 4K rather than once per bio.  This provided
    justification for making the updates more efficient.

    ...
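
To illustrate the trade-off the commit message describes (cheap, frequent updates versus an occasional and more expensive zero check), here is a minimal userspace sketch of a sharded counter. It is not the kernel's implementation: the names sharded_counter, counter_add, counter_sum, counter_is_zero and the NSLOTS constant are hypothetical, the slots only stand in for per-CPU storage, and the sketch is single-threaded with no locking or atomics.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 4  /* hypothetical stand-in for the number of CPUs */

/* One slot per "CPU"; an update touches only its own slot. */
struct sharded_counter {
    long slot[NSLOTS];
};

/* Frequent, cheap operation: adjust the slot of the current "CPU". */
static void counter_add(struct sharded_counter *c, unsigned int cpu, long delta)
{
    c->slot[cpu % NSLOTS] += delta;
}

/* Occasional, expensive operation: walk every slot and add them up. */
static long counter_sum(const struct sharded_counter *c)
{
    long total = 0;

    for (size_t i = 0; i < NSLOTS; i++)
        total += c->slot[i];
    return total;
}

/*
 * A single slot may be negative when a write starts on one "CPU" and
 * completes on another, so only the sum says whether writes are pending.
 */
static int counter_is_zero(const struct sharded_counter *c)
{
    return counter_sum(c) == 0;
}
```

In this sketch a write start would call counter_add(&c, cpu, 1) and a write completion counter_add(&c, cpu, -1); the "can the array be marked clean?" check would call counter_is_zero(&c), which is the rarely used slow path.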

Thanks for the report... and for bisecting and for re-sending...

I believe I have found the problem, and have sent a patch separately.

If mddev->safemode == 1 and mddev->in_sync != 0, md_check_recovery()
causes the thread that calls it to spin.
Prior to the patch you found, that couldn't happen.  Now it can,
so it needs to be handled more carefully.

While I was examining the code, I found another bug - so that is a win!

Thanks,
NeilBrown


