On 7/7/20 12:08 AM, Song Liu wrote:
So, what kind of next step comes after this?
Sorry for the delay. I read the log again and found that the following
line caused this issue:
[ +16.088243] r5l_write_super_and_discard_space set MD_SB_CHANGE_PENDING
The attached patch should work around this issue. Could you please give it a try?
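That line makes sense to me now - if I read drivers/md/raid5.c right,
handle_stripe() bails out early while that bit is set; roughly this check
(paraphrasing from memory, so don't quote me on the exact lines):

	/* handle_stripe(): with MD_SB_CHANGE_PENDING set the stripe is not
	 * processed at all, only re-queued until the superblock is written.
	 */
	if (s.handle_bad_blocks ||
	    test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags)) {
		set_bit(STRIPE_HANDLE, &sh->state);
		goto finish;
	}

and during journal replay nothing is around to write the superblock and
clear the bit, so I can see how the assembly would get stuck on that.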
Yeah, this solved the issue - the RAID assembled correctly (so the patch
is probably a good candidate for LTS kernels).
Thanks for helping with this bug.
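In case someone else trips over this before a proper fix is merged: as far
as I can tell, the workaround boils down to clearing that bit before the
cached stripes are flushed, something along these lines (my paraphrase of
the idea, not the actual attachment):

	/* Somewhere in the raid5-cache recovery path (sketch only, exact
	 * placement is in the attached patch): the replay code cannot
	 * update the superblock itself, so the MD_SB_CHANGE_PENDING set by
	 * r5l_write_super_and_discard_space() has to be cleared here,
	 * otherwise the stripes queued by the replay are never handled.
	 */
	clear_bit(MD_SB_CHANGE_PENDING, &log->rdev->mddev->sb_flags);

I'll leave the exact placement and locking to the real patch.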
The underlying filesystems are mountable/usable as well, although a
read-only fsck (ext4) and btrfs check do find some minor issues; it is
hard to say at this point what the exact culprit was.
In this particular case - IMHO - one issue remains: the assembly is
slower than a full resync (without a bitmap), which - outside of some
performance gains (writeback journal) and closing the write hole -
largely defeats the point of having such a resync policy in the first
place.
dmesg -H | grep r5c_recovery_flush_log
[ +13.550877] r5c_recovery_flush_log processing ctx->seq 860700000
[Jul 7 15:16] r5c_recovery_flush_log processing ctx->seq 860800000
[Jul 7 15:40] r5c_recovery_flush_log processing ctx->seq 860900000
...
[Jul 8 06:40] r5c_recovery_flush_log processing ctx->seq 866300000
[Jul 8 06:58] r5c_recovery_flush_log processing ctx->seq 866400000
[Jul 8 07:20] r5c_recovery_flush_log processing ctx->seq 866500000
Going by the timestamps above, the visible 100,000-step jumps in ctx->seq
each took roughly 18-25 minutes. During the periods when I was testing
your patches, the machine was basically idle the whole time - no CPU,
I/O, or wait pressure, nor anything else that could hamper it. The reads
from the journal device (SSDs) were averaging 1-4 MB/s.