Hi,

Recently, in one of our clusters, we lost objects when a multipart complete-upload request timed out and was retried before the first request had finished. The following sequence of events on the cluster led to the object loss:

2017-07-25 02:01:54.817492 7f4d67e86700 1 civetweb: 0x3bb3b90: X.X.X.X - - [25/Jul/2017:02:01:51 +0530] "POST TESTBUCKET/file1.txt HTTP/1.1" -1 0 - Boto/2.40.0 Python/2.7.3 Linux/3.2.0-4-amd64
2017-07-25 02:02:06.554676 7fdef1469700 1 civetweb: 0x45a1050: X.X.X.X - - [25/Jul/2017:02:01:54 +0530] "PUT TESTBUCKET/file1.txt HTTP/1.1" -1 0 - Boto/2.40.0 Python/2.7.3 Linux/3.2.0-4-amd64
2017-07-25 02:02:06.569207 7fb7eaf15700 1 civetweb: 0x3c66bd0: X.X.X.X - - [25/Jul/2017:02:02:06 +0530] "GET TESTBUCKET/file1.txt HTTP/1.1" 200 0 - Boto/2.40.0 Python/2.7.3 Linux/3.2.0-4-amd64
2017-07-25 02:03:29.650093 7f5f52f1d700 1 civetweb: 0x7f5fc40: X.X.X.X - - [25/Jul/2017:02:02:06 +0530] "POST TESTBUCKET/file1.txt HTTP/1.1" -1 0 - Boto/2.40.0 Python/2.7.3 Linux/3.2.0-4-amd64   <<== first POST request to complete the upload
2017-07-25 02:03:42.069385 7fe522ef5700 1 civetweb: 0x445aa10: X.X.X.X - - [25/Jul/2017:02:03:16 +0530] "POST TESTBUCKET/file1.txt HTTP/1.1" -1 0 - Boto/2.40.0 Python/2.7.3 Linux/3.2.0-4-amd64   <<== the same request retried; both requests succeeded
2017-07-25 04:17:53.543050 7fb19ce31700 1 civetweb: 0x4021c10: X.X.X.X - - [25/Jul/2017:04:17:53 +0530] "GET TESTBUCKET/file1.txt HTTP/1.1" -1 0 - Boto/2.40.0 Python/2.7.3 Linux/3.2.0-4-amd64   <<== this GET resulted in a 404
2017-07-25 04:17:53.548596 7f9062ffd700 1 civetweb: 0x44e0ee0: X.X.X.X - - [25/Jul/2017:04:17:53 +0530] "HEAD TESTBUCKET/file1.txt HTTP/1.1" -1 0 - Boto/2.40.0 Python/2.7.3 Linux/3.2.0-4-amd64   <<== this call succeeded; an s3 ls still returns the size (fetched from omap), but reading the object fails

The timeout happened while the cluster was rebalancing due to an OSD failure. I can reproduce the same issue by introducing a delay at the point shown in the code below: when the first request completes after the second one has finished, the multipart part objects are marked for deletion by gc.

==========================
diff --git a/src/rgw/rgw_op.cc b/src/rgw/rgw_op.cc
index aae6372..0ecf842 100644
--- a/src/rgw/rgw_op.cc
+++ b/src/rgw/rgw_op.cc
@@ -5403,6 +5403,12 @@ void RGWCompleteMultipart::execute()
   if (op_ret < 0)
     return;

+  if (g_conf->rgw_introduce_latency) {
+    dout(0) << __func__ << "sleeping for 5 secs" << dendl;
+    sleep(5);
+  }
+
+
   // remove the upload obj
   int r = store->delete_obj(*static_cast<RGWObjectCtx *>(s->obj_ctx),
                             s->bucket_info, meta_obj, 0);
==========================

So far I have not been able to come up with a fix that records the state of the complete-multipart operation while still deleting the meta_obj. This problem exists in all releases and results in data loss. A rough sketch of the client-side retry pattern that triggers the race follows; after that, please find the logs and the required information from a master-branch build.
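For reference, the client-side pattern that triggers the race looks roughly like the boto 2 sketch below (the requests in the log above came from Boto 2.40 / Python 2.7). The endpoint, credentials, part files and the 3-second timeout are placeholders, and the manual retry simply stands in for a client retry on timeout; it is only meant to show how two CompleteMultipartUpload POSTs for the same upload id can be in flight while RGW is still processing the first one.

==========================
# Hypothetical reproduction sketch (boto 2.x, Python 2). Not the exact client
# from the logs above; names and values are placeholders.
import socket

import boto
import boto.s3.connection

# Give up quickly on the first CompleteMultipartUpload and do the retry by hand.
if not boto.config.has_section('Boto'):
    boto.config.add_section('Boto')
boto.config.set('Boto', 'http_socket_timeout', '3')
boto.config.set('Boto', 'num_retries', '0')

conn = boto.connect_s3(
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
    host='rgw.example.com', port=8080, is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat())

bucket = conn.get_bucket('TESTBUCKET')
mp = bucket.initiate_multipart_upload('file1.txt')
for part_num in range(1, 6):
    # Each part must be at least 5 MB, except the last one.
    with open('/tmp/part.%d' % part_num, 'rb') as fp:
        mp.upload_part_from_file(fp, part_num)

try:
    mp.complete_upload()        # first POST; client times out, RGW keeps going
except (socket.timeout, socket.error):
    mp.complete_upload()        # retried POST; both completes can succeed on RGW
==========================

With the 5-second sleep from the patch above in place, the first complete is still inside RGWCompleteMultipart::execute() when the retried complete finishes, and the part objects end up in the gc list shown below.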
# bin/ceph --admin-daemon /tmp/ceph-asok.kXejsq/client.rgw.22202.asok config get rgw_introduce_latency
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
{
    "rgw_introduce_latency": "true"
}

# bin/radosgw-admin gc list --include-all
[
    {
        "tag": "69510504-522a-4f6c-a169-62cb6c7eac51.4106.11\u0000",
        "time": "2017-07-31 14:07:02.0.532372s",
        "objs": [
            {
                "pool": "default.rgw.buckets.data",
                "oid": "69510504-522a-4f6c-a169-62cb6c7eac51.4106.1__multipart_junkfile.2~4Br_wpZyqMbSpNniRBOxQbNij0efqrq.1",
                "key": "",
                "instance": ""
            },
            {
                "pool": "default.rgw.buckets.data",
                "oid": "69510504-522a-4f6c-a169-62cb6c7eac51.4106.1__multipart_junkfile.2~4Br_wpZyqMbSpNniRBOxQbNij0efqrq.2",
                "key": "",
                "instance": ""
            },
            {
                "pool": "default.rgw.buckets.data",
                "oid": "69510504-522a-4f6c-a169-62cb6c7eac51.4106.1__multipart_junkfile.2~4Br_wpZyqMbSpNniRBOxQbNij0efqrq.3",
                "key": "",
                "instance": ""
            },
            {
                "pool": "default.rgw.buckets.data",
                "oid": "69510504-522a-4f6c-a169-62cb6c7eac51.4106.1__multipart_junkfile.2~4Br_wpZyqMbSpNniRBOxQbNij0efqrq.4",
                "key": "",
                "instance": ""
            },
            {
                "pool": "default.rgw.buckets.data",
                "oid": "69510504-522a-4f6c-a169-62cb6c7eac51.4106.1__multipart_junkfile.2~4Br_wpZyqMbSpNniRBOxQbNij0efqrq.5",
                "key": "",
                "instance": ""
            }
        ]
    }
]

# bin/rados ls -p default.rgw.buckets.data
69510504-522a-4f6c-a169-62cb6c7eac51.4106.1_junkfile
69510504-522a-4f6c-a169-62cb6c7eac51.4106.1__multipart_junkfile.2~4Br_wpZyqMbSpNniRBOxQbNij0efqrq.2
69510504-522a-4f6c-a169-62cb6c7eac51.4106.1__multipart_junkfile.2~4Br_wpZyqMbSpNniRBOxQbNij0efqrq.3
69510504-522a-4f6c-a169-62cb6c7eac51.4106.1__multipart_junkfile.2~4Br_wpZyqMbSpNniRBOxQbNij0efqrq.4
69510504-522a-4f6c-a169-62cb6c7eac51.4106.1__multipart_junkfile.2~4Br_wpZyqMbSpNniRBOxQbNij0efqrq.5
69510504-522a-4f6c-a169-62cb6c7eac51.4106.1__multipart_junkfile.2~4Br_wpZyqMbSpNniRBOxQbNij0efqrq.1

Once gc processes these entries, all of the multipart part objects except the head object are deleted, and the whole object is lost. The omap still contains the object's metadata, so a HEAD/stat succeeds, but reading the object fails with a 404.

I have filed a ticket with the details at http://tracker.ceph.com/issues/20861. Any help on fixing the issue is appreciated.

Varada
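P.S. The failure mode is easy to confirm from the client once gc has run: a HEAD (or an s3 ls) on the key still succeeds because the metadata comes from omap, but any attempt to read the data returns a 404. A quick boto 2 check, again with placeholder endpoint, credentials and bucket name, would look something like this:

==========================
# Hypothetical check (boto 2.x, Python 2): the HEAD works off omap metadata,
# but the GET fails once gc has removed the part objects.
import boto
import boto.s3.connection
from boto.exception import S3ResponseError

conn = boto.connect_s3(
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
    host='rgw.example.com', port=8080, is_secure=False,
    calling_format=boto.s3.connection.OrdinaryCallingFormat())

key = conn.get_bucket('TESTBUCKET').get_key('junkfile')  # HEAD: still succeeds
print key.size                                           # size reported from omap

try:
    key.get_contents_as_string()                         # GET: the data is gone
except S3ResponseError as e:
    print 'read failed with', e.status                   # 404
==========================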