raid1 IO hang - md_submit_flush_data ignores md_write_start return value

I'm seeing an occasional XFS IO hang with raid1 (INFO: task xfsaild/md20:17963 blocked for more than 120 seconds).

It turns out that this is because md_submit_flush_data calls pers->make_request without checking its return value, unlike md_make_request, which checks the return value and retries. So if md_write_start, and therefore raid1_make_request, returns false, md_submit_flush_data drops the write on the floor and the bio never completes.
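
To make the difference between the two call paths concrete, here is a minimal user-space sketch. It is not kernel code: mock_array, mock_bio, mock_make_request, submit_unchecked and submit_checked are all hypothetical names invented for illustration, and the busy-loop retry only stands in for whatever waiting the real md_make_request path does before it retries.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a bio: only records whether it was accepted. */
struct mock_bio {
    bool queued;
};

/* Hypothetical stand-in for the array: refuses this many write attempts. */
struct mock_array {
    int refusals_left;
};

/*
 * Models pers->make_request as described in this post: it returns false
 * when the write could not be started and the caller is expected to retry.
 */
static bool mock_make_request(struct mock_array *a, struct mock_bio *bio)
{
    if (a->refusals_left > 0) {
        a->refusals_left--;
        return false;           /* write not started, bio untouched */
    }
    bio->queued = true;
    return true;
}

/* Models the flush path: the return value is ignored, so a refused
 * write is silently lost and the bio is never queued. */
static void submit_unchecked(struct mock_array *a, struct mock_bio *bio)
{
    (void)mock_make_request(a, bio);
}

/* Models the normal path: keep retrying until the request is accepted.
 * Real code would sleep until the blocking condition clears. */
static void submit_checked(struct mock_array *a, struct mock_bio *bio)
{
    while (!mock_make_request(a, bio))
        ;
}
```

With refusals_left set to 1, submit_unchecked leaves the bio unqueued, which corresponds to the write that never completes and the task that stays blocked, while submit_checked queues it on the second attempt.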

I'm hitting this on a RHEL kernel, but looking at the upstream code it appears that the same thing could happen.

Not sure how best to deal with this... thank you for any advice!

Nate




