I think I've got it. I'll explain, and we'll have to look at the explanation more closely in the morning. Bottom line is that I think the assertion is bogus. Like the other assertion that was not done under protection of a lock, this one is being done in a way that's not safe. First, here's a patch that I think might avoid the problem: ---------------------------------- Index: b/drivers/block/rbd.c =================================================================== --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -2130,9 +2130,8 @@ static void rbd_img_obj_callback(struct rbd_assert(which < img_request->obj_request_count); spin_lock_irq(&img_request->completion_lock); - if (which > img_request->next_completion) + if (which != img_request->next_completion) goto out; - rbd_assert(which == img_request->next_completion); for_each_obj_request_from(img_request, obj_request) { rbd_assert(more); ---------------------------------- Here's what I think is happening. I'll annotate the function, below. First, we enter this function when an object request has been marked "done." In the case we're looking at, we have an image request with two (or more) object requests. static void rbd_img_obj_callback(struct rbd_obj_request *obj_request) { struct rbd_img_request *img_request; u32 which = obj_request->which; bool more = true; rbd_assert(obj_request_img_data_test(obj_request)); img_request = obj_request->img_request; dout("%s: img %p obj %p\n", __func__, img_request, obj_request); rbd_assert(img_request != NULL); rbd_assert(img_request->obj_request_count > 0); rbd_assert(which != BAD_WHICH); rbd_assert(which < img_request->obj_request_count); Up to here, all is fine. Well, *maybe*... Anyway, I'll work through that in the morning. In any case, we are examining fairly static fields in the object request. spin_lock_irq(&img_request->completion_lock); if (which > img_request->next_completion) goto out; At this point we decide whether the object request now completing is the next one. If it's not, we're done; whenever an earlier request (the first one) completes it will ensure this one will get completed as well, below. But that's a problem. In the loop below, the only condition we're testing in order to account the completion of the current *and subsequent* requests is whether each request is marked "done." What that means is that while the second request is waiting to get the spinlock, the first one might be concurrently going through the loop below. It finds the second one "done" and records that fact. When it finishes the loop, it updates the next_completion value and releases the lock. The second request then enters the protected area, and finds that its "which" value is *not* greater than the next completion value. It's in fact *less* than the next_completion value, because the completion of this second request has already been processed. The problem got worse when we moved the spinlock (i.e., added protection around checking the next_completion field) because now the second thread is actually waiting before checking, so it's pretty much guaranteed it will trigger the failure. OK, back to bed. -Alex rbd_assert(which == img_request->next_completion); for_each_obj_request_from(img_request, obj_request) { rbd_assert(more); rbd_assert(which < img_request->obj_request_count); if (!obj_request_done_test(obj_request)) break; more = rbd_img_obj_end_request(obj_request); which++; } rbd_assert(more ^ (which == img_request->obj_request_count)); img_request->next_completion = which; out: spin_unlock_irq(&img_request->completion_lock); if (!more) rbd_img_request_complete(img_request); } -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html