Re: RAID6 gets stuck during reshape with 100% CPU

On 29/10/2019 19:05, Anssi Hannula wrote:
> As mentioned in my first message and seen in http://onse.fi/files/reshape-infloop-issue/examine-all.txt , the MD bad block lists contain blocks (suspiciously identical across devices). So maybe the code can't properly handle the case where 10 devices have the same block in their bad block list. Not quite sure what "handle" should mean in this case but certainly something else than a handle_stripe() loop :) There is a "bad" block on 10 devices on sector 198504960, which I guess matches sh->sector 198248960 due to data offset of 256000 sectors (per --examine).
>
> I've wondered if "dd if=/dev/md0 of=/dev/md0" for the affected blocks would clear the bad blocks and avoid this issue, but I haven't tried that yet so that the infinite loop issue can be investigated/fixed first. I already checked that /dev/md0 is fully readable (which also confuses me a bit since md(8) says "Attempting to read from a known bad block will cause a read error"... maybe I'm missing something).
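
For what it's worth, the sector arithmetic in the quoted message checks out. A minimal Python sketch, assuming (as the quoted message guesses) that the bad block list records device-relative sectors while sh->sector is relative to the start of the data area; the function name is made up for illustration:

```python
def device_to_data_sector(device_sector: int, data_offset: int) -> int:
    """Convert a device-relative sector to one relative to the data area.

    Assumption: bad block entries are device-relative, so subtracting the
    per-device data offset (as reported by --examine) gives the stripe sector.
    """
    if device_sector < data_offset:
        raise ValueError("sector lies before the start of the data area")
    return device_sector - data_offset


# Values taken from the quoted message.
BAD_BLOCK_SECTOR = 198504960
DATA_OFFSET = 256000

print(device_to_data_sector(BAD_BLOCK_SECTOR, DATA_OFFSET))  # 198248960
```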

Hmmm ...

Bear in mind that bad-blocks is considered by many to be an anti-feature, and it's strongly suspected that identical bad-block lists across multiple disks are the result of a bug ...

I hesitate to suggest trying to clear the bad blocks, but doing a dd is very unlikely to do what you want: the md bad blocks list is implemented within the md layer, so dd on its own probably won't touch it.
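
If you want to see what the md layer itself has recorded, the kernel exposes each member's list through sysfs. A rough Python sketch, assuming the per-device bad_blocks files under /sys/block/md0/md/dev-*/ with one "start length" pair (in sectors) per line; the helper names and the md0 default are only for illustration:

```python
import glob


def parse_bad_blocks(text):
    """Parse 'start length' lines (in sectors) into a list of (start, length)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        entries.append((int(fields[0]), int(fields[1])))
    return entries


def read_all_bad_blocks(md_name="md0"):
    """Return {sysfs path: [(start, length), ...]} for every member device."""
    result = {}
    pattern = f"/sys/block/{md_name}/md/dev-*/bad_blocks"
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            result[path] = parse_bad_blocks(f.read())
    return result


if __name__ == "__main__":
    for path, entries in read_all_bad_blocks().items():
        print(path, entries)
```

Identical (start, length) pairs on every member would match what examine-all.txt shows. This only reads the lists and changes nothing.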

Plus, since md is a software implementation, you should NEVER under normal circumstances have any bad blocks - it doesn't make sense - so it's pretty certain you've fallen foul of a bug in the bad blocks setup.

Sorry I can't offer any solutions, other than very hesitantly suggesting assembling with --update=force-no-bbl (I think that's the option name, but check mdadm(8) first).

Hopefully this gives you a few ideas ...

Cheers,
Wol


