On Jan 9, 2008 5:09 PM, Neil Brown <neilb@xxxxxxx> wrote:
> On Wednesday January 9, dan.j.williams@xxxxxxxxx wrote:
> > On Sun, 2007-12-30 at 10:58 -0700, dean gaudet wrote:
> > > i have evidence pointing to d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > >
> > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
> > >
> > > which was Neil's change in 2.6.22 for deferring generic_make_request
> > > until there's enough stack space for it.
> > >
> >
> > Commit d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1 reduced stack utilization
> > by preventing recursive calls to generic_make_request. However the
> > following conditions can cause raid5 to hang until 'stripe_cache_size' is
> > increased:
> >
>
> Thanks for pursuing this guys. That explanation certainly sounds very
> credible.
>
> The generic_make_request_immed is a good way to confirm that we have
> found the bug, but I don't like it as a long term solution, as it
> just reintroduced the problem that we were trying to solve with the
> problematic commit.
>
> As you say, we could arrange that all request submission happens in
> raid5d and I think this is the right way to proceed. However we can
> still take some of the work into the thread that is submitting the
> IO by calling "raid5d()" at the end of make_request, like this.
>
> Can you test it please?

This passes my failure case. However, my test is different from Dean's
in that I am using tiobench and the latest rev of my
'get_priority_stripe' patch. I believe the failure mechanism is the
same, but it would be good to get confirmation from Dean.
get_priority_stripe has the effect of increasing the frequency of
make_request->handle_stripe->generic_make_request sequences.

> Does it seem reasonable?

What do you think about limiting the number of stripes the submitting
thread handles to be equal to what it submitted?
If I'm a thread that only submits one stripe's worth of work, should I
get stuck handling the rest of the cache?

Regards,
Dan
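
To make the budget idea above concrete, here is a rough user-space
sketch in C. It is not the raid5 code and not a proposed patch:
struct stripe_queue, handle_pending() and submit_and_handle() are
hypothetical names invented for illustration, and "handling" a stripe
is reduced to decrementing a counter. It only shows a submitter being
capped at the amount of work it queued, with the remainder left for
the daemon thread.

```c
#include <assert.h>

/* Hypothetical stand-in for the stripe cache: it only counts stripes
 * that still need handling. Not the real md/raid5 data structure. */
struct stripe_queue {
    int pending;
};

/* Handle at most `budget` pending stripes and return how many were
 * handled. In this sketch, handling is just decrementing the counter. */
static int handle_pending(struct stripe_queue *q, int budget)
{
    int handled = 0;

    assert(budget >= 0);
    while (handled < budget && q->pending > 0) {
        q->pending--;
        handled++;
    }
    return handled;
}

/* Submitter path: queue `nr` stripes, then do no more handling work
 * than was queued. Anything beyond that stays pending for the daemon. */
static int submit_and_handle(struct stripe_queue *q, int nr)
{
    q->pending += nr;
    return handle_pending(q, nr);
}
```

With a backlog of 10 stripes already pending, a thread that submits a
single stripe handles exactly one stripe in this model and returns,
rather than draining the whole backlog.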