Re: [PATCH] rbd: handle parent_overlap on writes correctly

On 06/11/2014 09:40 AM, Ilya Dryomov wrote:
The following check in rbd_img_obj_request_submit()

     rbd_dev->parent_overlap <= obj_request->img_offset

allows the fall through to the non-layered write case even if both
parent_overlap and obj_request->img_offset belong to the same RADOS
object.  This leads to data corruption, because the area to the left of
parent_overlap ends up unconditionally zero-filled instead of being
populated with parent data.  Suppose we want to write 1M to offset 6M
of image bar, which is a clone of foo@snap; object_size is 4M,
parent_overlap is 5M:

     rbd_data.<id>.0000000000000001
      ---------------------|----------------------|------------
     | should be copyup'ed | should be zeroed out | write ...
      ---------------------|----------------------|------------
    4M                    5M                     6M
                     parent_overlap    obj_request->img_offset

4..5M should be copyup'ed from foo, yet it is zero-filled, just like
5..6M is.

Given that the only striping mode the kernel client currently supports
is chunking (i.e. stripe_unit == object_size, stripe_count == 1), round
parent_overlap up to the next object boundary for the purposes of the
overlap check.
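
For illustration only, the difference between the old and the new check
can be reproduced in user space. This is a rough sketch, not the kernel
code: round_up_pow2(), old_skips_parent(), overlaps_parent() and MiB are
hypothetical names made up for this example, and round_up_pow2() merely
stands in for the kernel's round_up(), assuming the object size is a
power of two (as with rbd_obj_bytes()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (UINT64_C(1) << 20)

/* Stand-in for the kernel's round_up(); y must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t y)
{
	assert(y != 0 && (y & (y - 1)) == 0);
	return (x + y - 1) & ~(y - 1);
}

/* Old check: true means the write is treated as non-layered. */
static bool old_skips_parent(uint64_t parent_overlap, uint64_t img_offset)
{
	return parent_overlap <= img_offset;
}

/*
 * New check, modelled on obj_request_overlaps_parent(): the request
 * overlaps the parent if it starts before the end of the last object
 * that contains any parent data.
 */
static bool overlaps_parent(uint64_t parent_overlap, uint64_t img_offset,
			    uint64_t obj_bytes)
{
	return img_offset < round_up_pow2(parent_overlap, obj_bytes);
}
```

With object_size = 4M, parent_overlap = 5M and img_offset = 6M,
old_skips_parent() returns true (the buggy non-layered path), while
overlaps_parent() returns true because 5M rounds up to 8M, so the write
goes through copyup.  A write at 8M starts in the next object and is
correctly treated as not overlapping the parent.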

Signed-off-by: Ilya Dryomov <ilya.dryomov@xxxxxxxxxxx>
---

Good catch! This should also be included in all stable kernels from
3.10 onward.

Reviewed-by: Josh Durgin <josh.durgin@xxxxxxxxxxx>

  drivers/block/rbd.c |   10 +++++++++-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 8295b3afa8e0..813e673d49df 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1366,6 +1366,14 @@ static bool obj_request_exists_test(struct rbd_obj_request *obj_request)
  	return test_bit(OBJ_REQ_EXISTS, &obj_request->flags) != 0;
  }

+static bool obj_request_overlaps_parent(struct rbd_obj_request *obj_request)
+{
+	struct rbd_device *rbd_dev = obj_request->img_request->rbd_dev;
+
+	return obj_request->img_offset <
+	    round_up(rbd_dev->parent_overlap, rbd_obj_bytes(&rbd_dev->header));
+}
+
  static void rbd_obj_request_get(struct rbd_obj_request *obj_request)
  {
  	dout("%s: obj %p (was %d)\n", __func__, obj_request,
@@ -2683,7 +2691,7 @@ static int rbd_img_obj_request_submit(struct rbd_obj_request *obj_request)
  	 */
  	if (!img_request_write_test(img_request) ||
  		!img_request_layered_test(img_request) ||
-		rbd_dev->parent_overlap <= obj_request->img_offset ||
+		!obj_request_overlaps_parent(obj_request) ||
  		((known = obj_request_known_test(obj_request)) &&
  			obj_request_exists_test(obj_request))) {


