Re: Deadlock in md barrier code? / RAID1 / LVM CoW snapshot + ext3 / Debian 5.0 - lenny 2.6.26 kernel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/21/10 00:04, Neil Brown wrote:
>
> Maybe you could add a could of global atomic variables, one for reads and one
> for writes.
> Then on each call to generic_make_request in:
>   flush_pending_writes, make_request, raid1d
> increment one or the other depending on whether it is a read or a write.
> Then in raid1_end_read_request and raid1_end_write_request decrement them
> appropriately.
>   


Ended up with runs like this:

[  464.244109] 0 Obtaining resync_lock and disabling interrupts for
allow_barrier
[  464.244109] conf nr_pending is 5
[  464.244109] 0 Released resync_lock and enabling interrupts for
allow_barrier
[  464.244109] 2216: Post wait until no IO waiting - barrier: 0, pend:
5, wait: 0, queued: 0
[  464.244113] 2216: Pre wait pending to complete - barrier: 1, pend: 5,
wait: 0, queued: 0
[  464.244116] In raid1_unplug debug read count: 5 write count: 0
conf->nr_queued: 0
[  464.244469] 5113 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  464.244469] In raid1_unplug debug read count: 5 write count: 0
conf->nr_queued: 0
[  464.244469] 5127 Obtaining resync_lock and disabling interrupts for
allow_barrier
[  464.244469] conf nr_pending is 4
[  464.244469] 5127 Released resync_lock and enabling interrupts for
allow_barrier
[  464.244469] 5127 Obtaining resync_lock and disabling interrupts for
allow_barrier
[  464.246176] In raid1_unplug debug read count: 4 write count: 0
conf->nr_queued: 0
[  464.244469] conf nr_pending is 3
[  464.244469] 5127 Released resync_lock and enabling interrupts for
allow_barrier
[  464.244469] 5127 Obtaining resync_lock and disabling interrupts for
allow_barrier
[  464.244469] conf nr_pending is 2
[  464.244469] 5127 Released resync_lock and enabling interrupts for
allow_barrier
[  464.246176] In raid1_unplug debug read count: 2 write count: 0
conf->nr_queued: 0
[  464.244469] 5127 Obtaining resync_lock and disabling interrupts for
allow_barrier
[  464.244469] conf nr_pending is 1
[  464.244469] 5127 Released resync_lock and enabling interrupts for
allow_barrier
[  464.246176] In raid1_unplug debug read count: 1 write count: 0
conf->nr_queued: 0
[  464.244469] 5127 Obtaining resync_lock and disabling interrupts for
allow_barrier
[  464.244469] conf nr_pending is 0
[  464.244469] 5127 Released resync_lock and enabling interrupts for
allow_barrier
[  464.244469] In raid1_unplug debug read count: 0 write count: 0
conf->nr_queued: 0
[  464.246176] 2216: Post wait pending to complete - barrier: 1, pend:
0, wait: 1, queued: 0
[  464.246176] In raid1_unplug debug read count: 0 write count: 0
conf->nr_queued: 0
[  464.246176] 2216: Pre wait until no IO waiting - barrier: 1, pend: 0,
wait: 1, queued: 0
[  464.246176] In raid1_unplug debug read count: 0 write count: 0
conf->nr_queued: 0
[  464.246176] 5118 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  464.246176] In raid1_unplug debug read count: 0 write count: 0
conf->nr_queued: 0
[  464.246633] In raid1_unplug debug read count: 0 write count: 0
conf->nr_queued: 1
[  464.244469] In raid1_unplug debug read count: 0 write count: 0
conf->nr_queued: 0
[  464.246176] In raid1_unplug debug read count: 0 write count: 0
conf->nr_queued: 0
[  464.246990] 5118 Released resync_lock and enabling interrupts for
wait_barrier
[  464.244469] 5113 Released resync_lock and enabling interrupts for
wait_barrier
[  464.246990] 5118 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  464.246990] 5118 Released resync_lock and enabling interrupts for
wait_barrier
[  464.246990] 5118 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  464.246990] 5118 Released resync_lock and enabling interrupts for
wait_barrier
[  464.244469] In raid1_unplug debug read count: 3 write count: 0
conf->nr_queued: 0
[  464.246990] 5118 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  464.246990] 5118 Released resync_lock and enabling interrupts for
wait_barrier
[  464.246990] 5118 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  464.246990] 5118 Released resync_lock and enabling interrupts for
wait_barrier
[  464.246176] 2216: Post wait until no IO waiting - barrier: 0, pend:
6, wait: 0, queued: 0
[  464.246176] 2216: Pre wait pending to complete - barrier: 1, pend: 6,
wait: 0, queued: 0
[  464.246176] In raid1_unplug debug read count: 5 write count: 0
conf->nr_queued: 0
[  464.246990] 5118 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  464.246990] In raid1_unplug debug read count: 6 write count: 0
conf->nr_queued: 0
[  464.247091] 0 Obtaining resync_lock and disabling interrupts for
allow_barrier
[  464.247091] conf nr_pending is 5
[  464.247091] 0 Released resync_lock and enabling interrupts for
allow_barrier
[  464.247091] In raid1_unplug debug read count: 5 write count: 0
conf->nr_queued: 0
[  464.244469] 5113 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  464.246176] In raid1_unplug debug read count: 5 write count: 0
conf->nr_queued: 0
[  464.244469] In raid1_unplug debug read count: 5 write count: 0
conf->nr_queued: 0
[  464.328828] 3639 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  464.328834] In raid1_unplug debug read count: 5 write count: 0
conf->nr_queued: 0
[  466.065137] 2479 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  466.065137] In raid1_unplug debug read count: 5 write count: 0
conf->nr_queued: 0
[  467.051970] 2478 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  467.051975] In raid1_unplug debug read count: 5 write count: 0
conf->nr_queued: 0
[  471.293667] In raid1_unplug debug read count: 5 write count: 0
conf->nr_queued: 0
[  471.525081] 220 Obtaining resync_lock and disabling interrupts for
wait_barrier
[  471.525081] In raid1_unplug debug read count: 5 write count: 0
conf->nr_queued: 0
[  505.308962] 2216 WARNING, schedule_timeout timed out for raise_barrier
[  505.308076] 5118 WARNING, schedule_timeout timed out for wait_barrier


... so does that mean that there are are read requests going awol in the
block layer?

If so, then the circumstantial evidence from when the lockups occur make
it look to me like this is probably an OpenVZ bug, but I'll try and do
some more digging tomorrow...

Does that make sense?

Cheers,

Tim.

-- 
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.  
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53  http://seoss.co.uk/ +44-(0)1273-808309

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux