I'm seeing an occasional XFS I/O hang on raid1 (INFO: task xfsaild/md20:17963
blocked for more than 120 seconds).
It turns out that this is because md_submit_flush_data calls pers->make_request
but doesn't check the return value (unlike md_make_request, which checks the
return value and retries). So if raid1_make_request/md_write_start returns false,
md_submit_flush_data drops the write on the floor and it never completes.
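
To make the difference between the two call patterns concrete, here is a
minimal userspace sketch. It is not the kernel code: fake_dev,
fake_make_request, submit_unchecked and submit_with_retry are hypothetical
names made up for illustration, and the "refuse N times, then accept"
behaviour is only an assumed stand-in for make_request returning false.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an md device; not a kernel structure. */
struct fake_dev {
    int refusals_left;  /* how many more calls will be refused */
    int submitted;      /* how many requests were actually accepted */
};

/*
 * Hypothetical stand-in for pers->make_request: returns false while the
 * device is "not ready" (assumed analogue of md_write_start failing),
 * and true once the request has been accepted.
 */
static bool fake_make_request(struct fake_dev *dev)
{
    if (dev->refusals_left > 0) {
        dev->refusals_left--;
        return false;   /* not accepted; the caller is expected to retry */
    }
    dev->submitted++;
    return true;
}

/* Pattern described above for md_submit_flush_data: result ignored. */
static void submit_unchecked(struct fake_dev *dev)
{
    (void)fake_make_request(dev);   /* a refusal silently loses the write */
}

/* Pattern described above for md_make_request: retry until accepted. */
static void submit_with_retry(struct fake_dev *dev)
{
    /* Simplified: a real caller would presumably wait before retrying. */
    while (!fake_make_request(dev))
        ;
}
```

With a device that refuses once, submit_unchecked leaves submitted at 0 while
submit_with_retry ends with submitted at 1. In the sketch the lost request
just disappears; in the case I'm describing, whoever is waiting on that write
waits forever.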
I'm hitting this on a RHEL kernel, but looking at the upstream code it appears
that the same thing could happen.
Not sure how best to deal with this... thank you for any advice!
Nate