Issues with exclusive-lock code on testing

Hi Doug,

The cause of that memory corruption is a premature (and duplicate)
call to rbd_obj_request_complete() in the !object-map DELETE case.
You've got:

    <dispatch>
    rbd_osd_req_callback
      rbd_osd_delete_callback
        rbd_osd_discard_callback
        rbd_obj_request_complete
          <complete obj_request->completion>
                                                <waiter is woken up>
                                                ...
                                                rbd_obj_request_put
                                                <obj_request is gone>
    <do more things with obj_request> <- !!!
    rbd_obj_request_complete
      <complete obj_request->completion>
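
To make the trace above concrete, here is a minimal userspace sketch of
the same shape.  Everything in it (struct obj_req, obj_req_put(),
obj_req_complete() and the two callbacks) is a hypothetical stand-in,
not the rbd code: it assumes the waiter holds the last reference and
drops it as soon as the completion fires, and it sets a flag instead of
actually freeing so the stale accesses can be counted safely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct rbd_obj_request. */
struct obj_req {
	int refs;		/* reference count */
	int completions;	/* times the completion was signalled */
	bool freed;		/* set instead of freeing, so it can be inspected */
};

static void obj_req_put(struct obj_req *r)
{
	assert(r->refs > 0);
	if (--r->refs == 0)
		r->freed = true;	/* a real put would free the object here */
}

/* Completing wakes the waiter, which drops its reference right away. */
static void obj_req_complete(struct obj_req *r)
{
	r->completions++;
	obj_req_put(r);
}

/* Buggy shape: returns how many times the object was used after the free. */
static int delete_callback_buggy(struct obj_req *r)
{
	int stale = 0;

	obj_req_complete(r);		/* inner callback completes the request */
	if (r->freed)
		stale++;		/* "do more things with obj_request" */
	r->completions++;		/* duplicate completion */
	if (r->freed)
		stale++;
	return stale;
}

/* Fixed shape: complete exactly once and do not touch the object again. */
static int delete_callback_fixed(struct obj_req *r)
{
	obj_req_complete(r);
	return 0;
}
```

In this model the buggy callback touches the request twice after the
waiter has released it, while the fixed one completes once and returns.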

I also spotted two memory leaks on the NOTIFY_COMPLETE path in
__do_event().  The event one is trivial; the page vector one I have
a question about.  The data item is allocated in alloc_msg() and the
actual buffer is then passed into __do_event() and eventually into
rbd_send_async_notify(), but not further up the stack.  Is anything
going to use it?  If not, we should remove it entirely.
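
For reference, releasing a page vector needs a page count derived from
the byte length of the notify data.  The sketch below is a userspace
approximation of that arithmetic, assuming 4 KiB pages; pages_for() and
the SKETCH_* macros are hypothetical names for illustration, not the
kernel helpers.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/* Number of pages spanned by the byte range [off, off + len). */
static int pages_for(uint64_t off, uint64_t len)
{
	return (int)(((off + len + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT) -
		     (off >> SKETCH_PAGE_SHIFT));
}
```

With an offset of 0, a length of 4096 covers one page and 4097 covers
two, so the count passed to the release call has to come from the same
length that was used when the vector was filled.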

Another thing that caught my eye is that your diff adds a bunch of
ceph_get_snap_context() calls on header.snapc with no corresponding
puts.  My understanding is that the ones around rbd_image_request_fill()
are there to work around the fact that rbd_queue_workfn() isn't used,
but the one in rbd_obj_delete_sync() is immediately followed by
ceph_osdc_build_request(), which bumps snapc itself, and so is almost
certainly a leak.
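
The reference imbalance can be modelled in a few lines of userspace C.
This is only a sketch under assumptions: struct snapc, struct req and
the helpers are hypothetical stand-ins, and the model assumes that
building the request takes its own reference and that destroying the
request drops exactly that one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted snapshot context. */
struct snapc {
	int refs;
};

static struct snapc *snapc_get(struct snapc *s)
{
	s->refs++;
	return s;
}

static void snapc_put(struct snapc *s)
{
	assert(s->refs > 0);
	s->refs--;
}

/* Hypothetical request that owns one snapc reference for its lifetime. */
struct req {
	struct snapc *snapc;
};

static void req_build(struct req *rq, struct snapc *s)
{
	rq->snapc = snapc_get(s);	/* the build step takes its own ref */
}

static void req_destroy(struct req *rq)
{
	snapc_put(rq->snapc);		/* and teardown drops only that ref */
	rq->snapc = NULL;
}

/* Returns the number of references still held after the request is gone. */
static int delete_sync(struct snapc *s, bool extra_get)
{
	struct req rq;
	int before = s->refs;

	if (extra_get)
		snapc_get(s);		/* the get with no matching put */
	req_build(&rq, s);
	req_destroy(&rq);
	return s->refs - before;
}
```

Each call with the extra get leaves one reference behind, which is why
a delete loop would pin the snap context indefinitely.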

The attached patch fixes the use-after-free and plugs those leaks.
With it applied, your test loop runs fine for me: no crashes or
out-of-memory problems.

Thanks,

                Ilya
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 92c354256055..c0198b6ca605 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1930,14 +1930,14 @@ static void rbd_osd_delete_callback(struct rbd_obj_request *obj_request)
 	u8 current_state;
 
 	if (!obj_request->img_request) {
-		rbd_osd_complete_delete(obj_request);
+		rbd_osd_discard_callback(obj_request);
 		return;
 	}
 
 	rbd_dev = obj_request->img_request->rbd_dev;
 
 	if (!rbd_use_object_map(rbd_dev)) {
-		rbd_osd_complete_delete(obj_request);
+		rbd_osd_discard_callback(obj_request);
 		return;
 	}
 
@@ -3632,10 +3632,13 @@ static int rbd_send_async_notify(struct rbd_device *rbd_dev,
 	}
 
 	completed = ceph_osdc_wait_event(osdc, notify_event);
-	if (!completed)
+	if (!completed) {
 		ret = -ETIMEDOUT;
-	else
+	} else {
 		ret = notify_event->notify.return_code;
+		ceph_release_page_vector(notify_event->notify.notify_data,
+		    calc_pages_for(0, notify_event->notify.notify_data_len));
+	}
 
 cancel_event:
 	ceph_osdc_cancel_event(notify_event);
@@ -4828,7 +4831,6 @@ static int rbd_obj_delete_sync(struct rbd_device *rbd_dev,
 
 	//obj_request->osd_req->r_priv = obj_request;
 
-	ceph_get_snap_context(rbd_dev->header.snapc);
 	osd_req_op_init(obj_request->osd_req, 0, CEPH_OSD_OP_DELETE, 0);
 	rbd_osd_req_format_snap_write(obj_request, rbd_dev->header.snapc);
 
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 8316a304af63..12841c5a09c7 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -2942,6 +2942,7 @@ static void __do_event(struct ceph_osd_client *osdc, u8 opcode,
 					event->osd_req = NULL;
 				}
 				complete_all(&event->notify.complete);
+				ceph_osdc_put_event(event);
 			}
 			break;
 		default:
