On Fri, May 2, 2008 at 12:26 AM, Neil Brown <neilb@xxxxxxx> wrote:
> On Thursday May 1, dan.j.williams@xxxxxxxxx wrote:
> > commit bd2ab67030e9116f1e4aae1289220255412b37fd "md: close a livelock
> > window in handle_parity_checks5" introduced a bug in handling 'repair'
> > operations. After a repair operation completes we clear the state bits
> > tracking this operation. However, they are cleared too early and this
> > results in the code deciding to re-run the parity check operation. Since
> > we have done the repair in memory the second check does not find a mismatch
> > and thus does not do a writeback.
>
> yes....
> I must admit that I find that code fairly hard to make sense of, but I
> can see how it was failing before and how this fixes it, and testing
> confirms that, so I suspect it is right.
>
> I cannot help feeling that there must be some way to simplify all
> those .pending and .complete bits and make it somewhat clearer, but I
> haven't been able to figure out how :-(
>

Agreed, the current scheme is not easily readable, and has proven
tricky to manipulate. I will spend some cycles looking at this...

> So: Acked-by: NeilBrown <neilb@xxxxxxx>
>
> I'm heading for a weekend, but feel free to send this to akpm.
>
> Thanks,
> NeilBrown
>

Thanks,
Dan