On Thu, 2012-01-05 at 18:36 -0800, Tejun Heo wrote:
> Hello, again.
>
> On Thu, Jan 05, 2012 at 06:17:07PM -0800, Tejun Heo wrote:
> > When two requests are merged, if the absorbed request is older than
> > the absorbing one, cfq_merged_requests() tries to reposition it in the
> > cfqq->fifo list by list_move()'ing the absorbing request to the
> > absorbed one before removing it.
> >
> > This works if both requests are on the same cfqq but nothing
> > guarantees that and the code ends up moving the merged request to a
> > different cfqq's fifo list without adjusting the rest.  This leads to
> > the following failures.
> >
> > * A request may be on the fifo list of a cfqq without holding
> >   reference to it and the cfqq can be freed before request is finished.
> >   Among other things, this triggers list debug warning and slab debug
> >   use-after-free warning.
> >
> > * As a request can be on the wrong fifo queue, it may be issued and
> >   completed before its cfqq is scheduled.  If the cfqq didn't have
> >   other requests on it, it would be empty by the time it's dispatched
> >   triggering BUG_ON() in cfq_dispatch_request().
> >
> > Fix it by making cfq_merged_requests() scan the absorbing request's
> > fifo list for the correct slot and move there instead.
>
> Hmmm... while the patch would fix the problem.  It isn't entirely
> correct.  The root cause is,
>
> 1. q->last_merge and rqhash used to be used only for merging bios into
>    requests and that queries elevator whether the merge should be
>    allowed.  cfq disallows merging if they belong to different cfqqs.
>
> 2. request-request merging didn't use to use q->last_merge or rqhash to
>    find request candidates.  It used elv_former/latter_request() and
>    cfq never returned request from a different cfqq.
>
> 3. Plug merging started using q->last_merge and rqhash and now
>    elevator can't prevent cross cfqq merges.
>
> So, yeah, the right fix would be using elv_former/latter_request()
> instead.
> Maybe we should strip out rqhash altogether and change
> elevator handle everything?  I don't know.  I'll prepare a different
> fix patch soon.

So we would strictly disallow merging requests that belong to two
different cfq queues? That will hurt performance. I don't know how
important strict isolation is; we even allow two cfq queues to merge
to improve performance.
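
For reference, the approach described in the quoted patch (scan the
absorbing request's own fifo list for the correct slot) can be sketched
in userspace C. This is only a toy model under stated assumptions, not
the kernel code: every name here (toy_queue, toy_request,
toy_merged_requests and the list helpers) is hypothetical, timestamps
are compared with a plain "<" instead of the wraparound-safe
time_before(), and no reference counting or locking is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct list_node { struct list_node *prev, *next; };

void list_init(struct list_node *n) { n->prev = n->next = n; }

void list_del_node(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Insert n immediately before pos (before the head == at the tail). */
void list_add_before(struct list_node *n, struct list_node *pos)
{
    n->prev = pos->prev;
    n->next = pos;
    pos->prev->next = n;
    pos->prev = n;
}

/* Hypothetical stand-ins for a cfqq and a request. */
struct toy_queue { struct list_node fifo; };
struct toy_request {
    struct list_node link;      /* position on owner->fifo */
    struct toy_queue *owner;    /* queue this request belongs to */
    unsigned long fifo_time;    /* enqueue timestamp, smaller is older */
};

#define node_to_request(n) \
    ((struct toy_request *)((char *)(n) - offsetof(struct toy_request, link)))

void toy_queue_init(struct toy_queue *q) { list_init(&q->fifo); }

void toy_enqueue(struct toy_queue *q, struct toy_request *rq, unsigned long t)
{
    rq->owner = q;
    rq->fifo_time = t;
    list_add_before(&rq->link, &q->fifo);   /* append at the tail */
}

/*
 * rq absorbs next. If next is older, rq inherits its timestamp and is
 * re-slotted on rq's OWN fifo, never on next's, so a cross-queue merge
 * cannot leave rq sitting on a foreign queue's list.
 */
void toy_merged_requests(struct toy_request *rq, struct toy_request *next)
{
    assert(rq != next);
    if (next->fifo_time < rq->fifo_time) {
        struct list_node *head = &rq->owner->fifo;
        struct list_node *pos;

        rq->fifo_time = next->fifo_time;
        list_del_node(&rq->link);
        /* Stop at the first entry that is newer than rq's new timestamp. */
        for (pos = head->next; pos != head; pos = pos->next)
            if (node_to_request(pos)->fifo_time > rq->fifo_time)
                break;
        list_add_before(&rq->link, pos);
    }
    list_del_node(&next->link);             /* next is gone after the merge */
}

/* Index of rq on q's fifo, or -1 if it is not on that list. */
int fifo_position(struct toy_queue *q, struct toy_request *rq)
{
    int idx = 0;
    for (struct list_node *p = q->fifo.next; p != &q->fifo; p = p->next, idx++)
        if (p == &rq->link)
            return idx;
    return -1;
}
```

In this model, merging an older request from another queue into rq moves
rq toward the front of its own fifo and leaves the other queue's list
without any stale entry. Whether cross-queue merges should be allowed at
all is the separate question raised above; the sketch does not address it.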