Deadlock between retry_aligned_read and barrier io

A chunk-aligned read increments the counter active_aligned_reads and
decrements it after the sub-device handles the read successfully. But
when a read error occurs, the read is redispatched by raid5d, and
active_aligned_reads is not decremented until we can grab a stripe
head in retry_aligned_read. Now suppose a barrier io comes in, sets
conf->quiesce to 2, and waits until both active_stripes and
active_aligned_reads are zero. The retried chunk-aligned read gets
stuck in get_active_stripe, waiting until conf->quiesce becomes 0.
retry_aligned_read and the barrier io are now waiting on each other.

One possible solution is to ignore conf->quiesce and let the retried
aligned read finish. I reproduced this deadlock and tested this patch
on CentOS 6.0.
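
To make the circular wait easier to see, here is a small user-space
model of the two conditions described above. It is only a sketch, not
kernel code: struct model_conf, barrier_can_proceed(),
retry_can_get_stripe() and model_completes() are hypothetical names,
and the model assumes that the last argument of get_active_stripe acts
as a "do not wait for quiesce" flag, which is what the patch relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the fields named in the description (hypothetical). */
struct model_conf {
	int quiesce;              /* 2 while the barrier is draining io */
	int active_stripes;
	int active_aligned_reads;
};

/* The barrier waits until both counters have dropped to zero. */
static bool barrier_can_proceed(const struct model_conf *c)
{
	return c->active_stripes == 0 && c->active_aligned_reads == 0;
}

/*
 * A retried aligned read gets a stripe only when the array is not
 * quiescing, unless it is allowed to ignore conf->quiesce
 * (assumption: this is what the last argument controls).
 */
static bool retry_can_get_stripe(const struct model_conf *c, bool noquiesce)
{
	return c->quiesce == 0 || noquiesce;
}

/*
 * Start from the state in the report: barrier already set quiesce to 2,
 * one aligned read is waiting to be retried. Step both sides a few
 * times and report whether the barrier ever completes.
 */
static bool model_completes(bool noquiesce)
{
	struct model_conf c = {
		.quiesce = 2,
		.active_stripes = 0,
		.active_aligned_reads = 1,
	};

	for (int step = 0; step < 4; step++) {
		assert(c.active_aligned_reads >= 0);

		if (c.active_aligned_reads > 0 &&
		    retry_can_get_stripe(&c, noquiesce)) {
			/* retry handled, counter finally decremented */
			c.active_aligned_reads--;
			continue;
		}
		if (c.quiesce == 2 && barrier_can_proceed(&c)) {
			c.quiesce = 0;	/* barrier finished */
			return true;
		}
		/* neither side could move: both are waiting */
	}
	return false;
}
```

In this model, model_completes(false) returns false because neither
side can ever make progress, while model_completes(true) returns true:
the retry drains active_aligned_reads first and the barrier then
finishes.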

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 9cd137e..8f94929 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4378,7 +4378,7 @@ static int  retry_aligned_read(raid5_conf_t *conf, struct bio *raid_bio)
                        /* already done this stripe */
                        continue;

-               sh = get_active_stripe(conf, sector, 0, 1, 0);
+               sh = get_active_stripe(conf, sector, 0, 1, 1);

                if (!sh) {
                        /* failed to get a stripe - must wait */

Any suggestions?