Re: [PATCH] md: fix raid5 'repair' operations

Neil Brown wrote:
> On Thursday May 1, dan.j.williams@xxxxxxxxx wrote:
>> commit bd2ab67030e9116f1e4aae1289220255412b37fd "md: close a livelock
>> window in handle_parity_checks5" introduced a bug in handling 'repair'
>> operations.  After a repair operation completes we clear the state bits
>> tracking this operation.  However, they are cleared too early and this
>> results in the code deciding to re-run the parity check operation.  Since
>> we have done the repair in memory the second check does not find a mismatch
>> and thus does not do a writeback.
>
> yes....
> I must admit that I find that code fairly hard to make sense of, but I
> can see how it was failing before and how this fixes it, and testing
> confirms that, so I suspect it is right.
>
> I cannot help feeling that there must be some way to simplify all
> those .pending and .complete bits and make it somewhat clearer, but I
> haven't been able to figure out how :-(
>
> So: Acked-by: NeilBrown <neilb@xxxxxxx>
>
> I'm heading for a weekend, but feel free to send this to akpm.

Hmm.  Should this be sent to -stable as well?  I was just bitten by
this very bug here, and after applying the patch and rebooting the
problem went away.  I am running 2.6.25.0 here.

/mjt
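
For anyone trying to picture the failure described in the commit message
quoted above, here is a toy model in Python.  It is only a sketch and not
the raid5.c code: the Stripe class, the flag names (check_done, mismatch,
repaired) and the clear_early switch are all hypothetical and do not
correspond to the real .pending and .complete bits.  It assumes XOR parity
over small integers and shows how clearing the state right after the
in-memory repair causes a second check that sees no mismatch, so nothing
is written back.

```python
from dataclasses import dataclass
from functools import reduce
from operator import xor


@dataclass
class Stripe:
    data: list                # data blocks, modelled as small integers
    disk_parity: int          # parity block as stored on disk
    mem_parity: int = 0       # in-memory copy of the parity block
    check_done: bool = False  # hypothetical flag: a check result is available
    mismatch: bool = False    # hypothetical flag: the check found bad parity
    repaired: bool = False    # hypothetical flag: parity was fixed in memory


def expected_parity(data):
    """XOR of all data blocks, i.e. what the parity block should hold."""
    return reduce(xor, data, 0)


def repair_stripe(stripe, clear_early):
    """Drive the toy state machine until the stripe is idle.

    Returns (number of parity checks run, whether disk parity is correct).
    clear_early=True models clearing the state bits too soon.
    """
    stripe.mem_parity = stripe.disk_parity  # read the parity block into memory
    checks = 0
    for _ in range(8):  # bounded loop, the model settles within a few passes
        if not stripe.check_done:
            # Check the in-memory parity against the data blocks.
            checks += 1
            stripe.mismatch = stripe.mem_parity != expected_parity(stripe.data)
            stripe.check_done = True
        elif stripe.mismatch and not stripe.repaired:
            # Repair happens in memory only; the disk is still stale here.
            stripe.mem_parity = expected_parity(stripe.data)
            stripe.repaired = True
            if clear_early:
                # State forgotten before the writeback was issued.
                stripe.check_done = stripe.mismatch = stripe.repaired = False
        elif stripe.repaired:
            # Write the repaired parity back, then clear the state.
            stripe.disk_parity = stripe.mem_parity
            stripe.check_done = stripe.mismatch = stripe.repaired = False
            break
        else:
            break  # check passed and nothing is pending
    return checks, stripe.disk_parity == expected_parity(stripe.data)
```

With Stripe(data=[1, 2, 4], disk_parity=0), calling repair_stripe with
clear_early=True runs the check twice and leaves the disk parity wrong,
while clear_early=False runs it once and writes the corrected parity back.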