Re: Multisite sync corruption for large multipart obj

Xiaoxi Chen <superdebuger@xxxxxxxxx> · Tue, 28 May 2019 17:43:38 +0800



Hi Casey,
    Thanks for the reply. I couldn't find the logs from that time,
because logrotate removes anything older than 7 days. Sorry about that.

    It looks to me like the issue was triggered by a restart/failure of
the src_zone rgw, which caused a connection reset in our http client;
however, the error did not propagate to the upper layer. We then
finished writing the object and added the metadata as-is. The check we
are missing in this case is the integrity of the src obj. Because we
lose the multipart information, we cannot use the ETag to check the
integrity of the fetched obj. Maybe a simple size check could help to
some extent as a first step (truncation is much more common than
corruption, since a restart of RGW can easily trigger the issue). It is
very weak, but better than nothing.
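
A minimal sketch of the size check idea, written in Python rather than
the actual RGW C++ code. The names copy_with_size_check and
TruncatedTransferError are hypothetical, and expected_size is assumed to
be the object size reported by the source zone:

```python
import io


class TruncatedTransferError(Exception):
    """Raised when the byte count differs from what the source reported."""


def copy_with_size_check(src, dst, expected_size, chunk_size=4 * 1024 * 1024):
    """Copy src to dst and fail if the total size is not expected_size.

    src and dst are file-like objects; expected_size is the size the
    source zone reported for the object (an assumption of this sketch).
    """
    received = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            # End of stream: either a clean finish or a silent reset.
            break
        dst.write(chunk)
        received += len(chunk)
    if received != expected_size:
        raise TruncatedTransferError(
            f"expected {expected_size} bytes, received {received}"
        )
    return received


if __name__ == "__main__":
    # Simulate a transfer that stops early, as in the truncated SLC copy.
    try:
        copy_with_size_check(io.BytesIO(b"x" * 6), io.BytesIO(), 10)
    except TruncatedTransferError as err:
        print(err)
```

This only catches truncation (or extra bytes), not corruption of bytes
that did arrive, so it is a first step and not a replacement for a
checksum.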

    STREAMING-AWS4-HMAC-SHA256-PAYLOAD seems to be aimed more at ensuring
data integrity during a putObj, but we are uploading through rados, so I
am not sure how we could make use of it.
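
For reference, a rough sketch of how the chunk signatures chain in that
scheme, based on the AWS sigv4-streaming document Casey linked. This is
Python for illustration only and is not RGW code; the function names,
the seed signature and the example values are assumptions:

```python
import hashlib
import hmac

# SHA-256 of the empty string, which appears in every chunk string-to-sign.
EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()


def _hmac(key, msg):
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret_key, date, region, service="s3"):
    """Derive the SigV4 signing key (date is YYYYMMDD)."""
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def chunk_signature(key, timestamp, scope, previous_signature, chunk):
    """Sign one chunk; the previous signature is part of the input."""
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256-PAYLOAD",
        timestamp,            # e.g. 20190528T000000Z
        scope,                # e.g. 20190528/<region>/s3/aws4_request
        previous_signature,   # seed signature for the first chunk
        EMPTY_SHA256,
        hashlib.sha256(chunk).hexdigest(),
    ])
    return hmac.new(key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def sign_chunks(key, timestamp, scope, seed_signature, chunks):
    """Return one signature per chunk, each chained on the one before."""
    signatures = []
    previous = seed_signature
    for chunk in chunks:
        previous = chunk_signature(key, timestamp, scope, previous, chunk)
        signatures.append(previous)
    return signatures
```

Because each signature depends on the previous one and the stream ends
with a signed zero-length chunk, a receiver that verifies the chain can
notice both a modified chunk and a stream that stops early.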

    Is there any possibility of exposing the multipart information
(including part size, checksum of each part, as well as the ETag) from
the src zone through an internal API, so that in
RGWRados::fetch_remote_obj we can keep the multipart format and do an
integrity check?
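
To illustrate what such a check could look like if the part sizes were
available, here is a hedged Python sketch. It assumes the usual S3
convention for multipart ETags (MD5 of the concatenated per-part MD5
digests, followed by "-<number of parts>"); multipart_etag and its
parameters are hypothetical names, not an existing RGW API:

```python
import hashlib
import io


def multipart_etag(stream, part_sizes, buf_size=1024 * 1024):
    """Recompute a multipart-style ETag from a stream and its part sizes.

    part_sizes is the list of part lengths that the source zone would
    have to expose (an assumption of this sketch).
    """
    part_digests = []
    for index, size in enumerate(part_sizes, start=1):
        md5 = hashlib.md5()
        remaining = size
        while remaining > 0:
            data = stream.read(min(buf_size, remaining))
            if not data:
                raise ValueError(f"object truncated inside part {index}")
            md5.update(data)
            remaining -= len(data)
        part_digests.append(md5.digest())
    if stream.read(1):
        raise ValueError("object is longer than the sum of the part sizes")
    # ETag = MD5 over the binary part digests, plus the part count.
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"


if __name__ == "__main__":
    # Two small parts standing in for a real multipart upload.
    print(multipart_etag(io.BytesIO(b"a" * 5 + b"b" * 3), [5, 3]))
```

The destination could compare the recomputed value with the ETag stored
on the source object and reject the fetch on a mismatch or on a
truncated part.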

-Xiaoxi

Casey Bodley <cbodley@xxxxxxxxxx> wrote on Fri, May 24, 2019 at 4:15 AM:

>
>
> On 5/21/19 9:16 PM, Xiaoxi Chen wrote:
> > We have a two-zone multisite setup, with zones lvs and slc. It works
> > fine in general, but we got reports from a customer about data
> > corruption/mismatch between the two zones.
> >
> > root@host:~# s3cmd -c .s3cfg_lvs ls
> > s3://ms-nsn-prod-48/01DAT9KVPEDE4QTA6EWFBZJ5KS/index
> > 2019-05-14 04:30 410444223 s3://ms-nsn-prod-48/01DAT9KVPEDE4QTA6EWFBZJ5KS/index
> > root@host-ump:~# s3cmd -c .s3cfg_slc ls
> > s3://ms-nsn-prod-48/01DAT9KVPEDE4QTA6EWFBZJ5KS/index
> > 2019-05-14 04:30 62158776 s3://ms-nsn-prod-48/01DAT9KVPEDE4QTA6EWFBZJ5KS/index
> >
> > Object metadata in SLC/LVS can be found in
> > https://pastebin.com/a5JNb9vb LVS
> > https://pastebin.com/1MuPJ0k1 SLC
> >
> > SLC is a single flat object while LVS is a multipart object, which
> > indicates the object was uploaded by the user in LVS and mirrored to
> > SLC. The SLC object is truncated after 62158776 bytes; the first
> > 62158776 bytes are correct.
> >
> > root@host:~# cmp -l slc_obj lvs_obj
> > cmp: EOF on slc_obj after byte 62158776
> >
> > Both bucket sync status and overall sync status show no problems, and
> > the obj was created 5 days ago. It looks like the transfer was
> > terminated partway through while pulling the object content from the
> > source zone (LVS), leaving an incomplete obj. It also seems we don't
> > have checksum verification in sync_agent, so the corrupted obj stayed
> > in place and was treated as a successful sync.
> It's troubling to see that sync isn't detecting an error from the
> transfer. Do you see any errors from the http client in your logs such
> as 'WARNING: client->receive_data() returned ret='?
>
> I agree that we need integrity checking, but we can't rely on ETags
> because of the way that multipart objects sync as non-multipart. I think
> the right way to address this is to add v4 signature support to the http
> client, and rely on STREAMING-AWS4-HMAC-SHA256-PAYLOAD for integrity of
> the body chunks
> (https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html).
>
> >
> > root@host:~# radosgw-admin --cluster slc_ceph_ump bucket sync status
> > --bucket=ms-nsn-prod-48
> > realm 2305f95c-9ec9-429b-a455-77265585ef68 (metrics)
> > zonegroup 9dad103a-3c3c-4f3b-87a0-a15e17b40dae (ebay)
> > zone 6205e53d-6ce4-4e25-a175-9420d6257345 (slc)
> > bucket ms-nsn-prod-48[017a0848-cf64-4879-b37d-251f72ff9750.432063.48]
> >
> > source zone 017a0848-cf64-4879-b37d-251f72ff9750 (lvs)
> >                  full sync: 0/16 shards
> >                  incremental sync: 16/16 shards
> >                  bucket is caught up with source
> >
> >
> > Re-sync on the bucket will not solve the inconsistency
> Right. The GET requests that fetch objects use the If-Modified-Since
> header to avoid transferring data unless the mtime has changed. In order
> to force re-sync, you would have to do something that updates its mtime
> - for example, setting an acl.
> > radosgw-admin bucket sync init --source-zone lvs --bucket=ms-nsn-prod-48
> >
> > root@host:~# radosgw-admin bucket sync status --bucket=ms-nsn-prod-48
> > realm 2305f95c-9ec9-429b-a455-77265585ef68 (metrics)
> > zonegroup 9dad103a-3c3c-4f3b-87a0-a15e17b40dae (ebay)
> > zone 6205e53d-6ce4-4e25-a175-9420d6257345 (slc)
> > bucket ms-nsn-prod-48[017a0848-cf64-4879-b37d-251f72ff9750.432063.48]
> >
> > source zone 017a0848-cf64-4879-b37d-251f72ff9750 (lvs)
> >                  full sync: 0/16 shards
> >                  incremental sync: 16/16 shards
> >                  bucket is caught up with source
> >
> > root@lvscephmon01-ump:~# s3cmd -c .s3cfg_slc ls
> > s3://ms-nsn-prod-48/01DAT9KVPEDE4QTA6EWFBZJ5KS/index
> > 2019-05-14 04:30 62158776 s3://ms-nsn-prod-48/01DAT9KVPEDE4QTA6EWFBZJ5KS/index
> >
> >
> > A tracker was submitted to
> > https://tracker.ceph.com/issues/39992
>