multisite replication issue with Quincy

We have encountered replication issues in our multisite setup with
Quincy v17.2.3.

Our Ceph clusters are brand new: we tore down the old clusters and re-deployed
fresh Quincy ones before running this test.
In our environment we have 3 RGW nodes per site; each node runs 2 instances
for client traffic and 1 instance dedicated to replication.
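
(For anyone picturing the setup: the usual way to dedicate an instance to
replication is to disable the sync thread on the client-facing ones, roughly as
below. The instance names here are just placeholders; our actual settings are
in the ceph.conf attached to the tracker issue linked at the end.)

  # client-facing instances: sync thread disabled
  ceph config set client.rgw.<node>.client1 rgw_run_sync_thread false
  ceph config set client.rgw.<node>.client2 rgw_run_sync_thread false
  # dedicated replication instance: sync thread left enabled (the default)
  ceph config set client.rgw.<node>.sync rgw_run_sync_thread true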

Our test was done using cosbench with the following settings:
- 10 rgw users
- 3000 buckets per user
- write only
- 6 different object sizes with the following distribution:
1k: 17%
2k: 48%
3k: 14%
4k: 5%
1M: 13%
8M: 3%
- writing up to 10 million objects per object-size bucket per user, to
avoid writing to the same objects
- no multipart uploads involved
The test ran for about 2 hours, roughly from 22:50 on 9/14 to 1:00 on 9/15.
After that, the replication tail continued for roughly another 4 hours,
until about 4:50 on 9/15, with gradually decreasing replication traffic. Then
the replication stopped, and nothing has been happening in the clusters since.

While verifying the replication status, we found several issues.
1. The sync status shows the clusters are not fully synced, even though all
replication traffic has stopped and nothing is going on in the clusters.
Secondary zone:

          realm 8a98f19f-db58-4c09-bde6-ac89560d79b0 (prod-realm)
      zonegroup e041ea69-1e0b-4ad7-92f2-74b20aa3edf3 (prod-zonegroup)
           zone 1dadcf12-f44c-4940-8acc-9623a48b829e (prod-zone-tt)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: b68a526a-ffaa-4058-9903-6e7c6eac35bb (prod-zone-pw)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is behind on 2 shards
                        behind shards: [40,74]


Why did the replication stop even though the clusters are still not in sync?
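
For reference, the status above is from a plain "radosgw-admin sync status"
run against the secondary zone. Is checking the sync error log the right next
step here? Something along these lines (read-only, as far as we know):

  # on the secondary zone (prod-zone-tt)
  radosgw-admin sync status
  radosgw-admin sync error list    # look for entries stuck with permanent errors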

2. We can see that some buckets are not fully synced, and we were able to
identify some missing objects in our secondary zone (how we compared listings
is sketched after the status output below).
Here is an example bucket; this is its sync status in the secondary zone.

          realm 8a98f19f-db58-4c09-bde6-ac89560d79b0 (prod-realm)
      zonegroup e041ea69-1e0b-4ad7-92f2-74b20aa3edf3 (prod-zonegroup)
           zone 1dadcf12-f44c-4940-8acc-9623a48b829e (prod-zone-tt)
         bucket :mixed-5wrks-dev-4k-thisisbcstestload004178[b68a526a-ffaa-4058-9903-6e7c6eac35bb.89152.78])

    source zone b68a526a-ffaa-4058-9903-6e7c6eac35bb (prod-zone-pw)
  source bucket :mixed-5wrks-dev-4k-thisisbcstestload004178[b68a526a-ffaa-4058-9903-6e7c6eac35bb.89152.78])
                full sync: 0/101 shards
                incremental sync: 100/101 shards
                bucket is behind on 1 shards
                behind shards: [78]
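
(How we compared listings, roughly; the full bucket list and stats output is
attached to the tracker issue:)

  # run on the primary and on the secondary, then diff the object names
  radosgw-admin bucket list --bucket=mixed-5wrks-dev-4k-thisisbcstestload004178 > bucket-list.json
  radosgw-admin bucket stats --bucket=mixed-5wrks-dev-4k-thisisbcstestload004178
  # objects present in the primary's listing but absent from the secondary's
  # are the ones we call "missing"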

3. We can see from the above output that the shard this example bucket is
behind on (78) is not in the list of behind shards in the zone-level sync
status ([40,74]). Why is that?

4. Data sync status for these behind shards doesn't list any
"pending_buckets" or "recovering_buckets".
An example:

{
    "shard_id": 74,
    "marker": {
        "status": "incremental-sync",
        "marker": "00000000000000000003:00000000000003381964",
        "next_step_marker": "",
        "total_entries": 0,
        "pos": 0,
        "timestamp": "2022-09-15T00:00:08.718840Z"
    },
    "pending_buckets": [],
    "recovering_buckets": []
}


Shouldn't the not-yet-in-sync buckets be listed here?
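
For reference, that per-shard JSON is from something like the following, run
on the secondary zone:

  radosgw-admin data sync status --source-zone=prod-zone-pw --shard-id=74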

5. The sync status reported by the primary zone differs from the sync status
reported by the secondary zone, with different sets of behind shards. The same
is true for the sync status of the same bucket. Is that legitimate? Please see
item 1 for the sync status of the secondary zone and item 6 for the primary zone.

6. Why does the primary zone have behind shards at all, given that replication
is from the primary to the secondary?
Primary Zone:

          realm 8a98f19f-db58-4c09-bde6-ac89560d79b0 (prod-realm)
      zonegroup e041ea69-1e0b-4ad7-92f2-74b20aa3edf3 (prod-zonegroup)
           zone b68a526a-ffaa-4058-9903-6e7c6eac35bb (prod-zone-pw)
  metadata sync no sync (zone is master)
      data sync source: 1dadcf12-f44c-4940-8acc-9623a48b829e (prod-zone-tt)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is behind on 30 shards
                        behind shards: [6,7,26,28,29,37,47,52,55,56,61,67,68,69,74,79,82,91,95,99,101,104,106,111,112,121,122,123,126,127]

7. We have in-sync buckets whose sync status looks correct in the secondary
zone but still shows behind shards in the primary. Why is that? (The commands
are sketched after the two outputs below.)
Secondary Zone:

          realm 8a98f19f-db58-4c09-bde6-ac89560d79b0 (prod-realm)
      zonegroup e041ea69-1e0b-4ad7-92f2-74b20aa3edf3 (prod-zonegroup)
           zone 1dadcf12-f44c-4940-8acc-9623a48b829e (prod-zone-tt)
         bucket :mixed-5wrks-dev-4k-thisisbcstestload008167[b68a526a-ffaa-4058-9903-6e7c6eac35bb.89754.279])

    source zone b68a526a-ffaa-4058-9903-6e7c6eac35bb (prod-zone-pw)
  source bucket :mixed-5wrks-dev-4k-thisisbcstestload008167[b68a526a-ffaa-4058-9903-6e7c6eac35bb.89754.279])
                full sync: 0/101 shards
                incremental sync: 99/101 shards
                bucket is caught up with source

Primary zone:

          realm 8a98f19f-db58-4c09-bde6-ac89560d79b0 (prod-realm)
      zonegroup e041ea69-1e0b-4ad7-92f2-74b20aa3edf3 (prod-zonegroup)
           zone b68a526a-ffaa-4058-9903-6e7c6eac35bb (prod-zone-pw)
         bucket :mixed-5wrks-dev-4k-thisisbcstestload008167[b68a526a-ffaa-4058-9903-6e7c6eac35bb.89754.279])

    source zone 1dadcf12-f44c-4940-8acc-9623a48b829e (prod-zone-tt)
  source bucket :mixed-5wrks-dev-4k-thisisbcstestload008167[b68a526a-ffaa-4058-9903-6e7c6eac35bb.89754.279])
                full sync: 0/101 shards
                incremental sync: 97/101 shards
                bucket is behind on 11 shards
                behind shards: [9,11,14,16,22,31,44,45,67,85,90]
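
(Both status blocks above are from roughly the following, run once against the
secondary and once against the primary. To sanity-check, object counts from
bucket stats on the two zones can be compared as well:)

  radosgw-admin bucket sync status --bucket=mixed-5wrks-dev-4k-thisisbcstestload008167
  radosgw-admin bucket stats --bucket=mixed-5wrks-dev-4k-thisisbcstestload008167   # compare "num_objects" on both zones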

Our primary goals here are:
1. to find out why the replication stopped while the clusters are not in sync;
2. to understand what we need to do to resume the replication, and to make sure
it runs to completion without lagging too far behind (see the command sketch
below);
3. to understand whether all of the sync status info is correct. It seems to us
there are many conflicts, and some of it doesn't reflect the real state of the
clusters at all.
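
On goal 2: if the answer is that we need to kick the sync manually, is
something like the following the recommended approach on Quincy, or is a full
data sync re-init overkill here? (These are our best guesses, not commands we
have run yet.)

  # restart sync for one stuck bucket
  radosgw-admin bucket sync run --bucket=<stuck-bucket>
  # or re-initialize data sync from the primary and let it catch up
  radosgw-admin data sync init --source-zone=prod-zone-pw
  # ...followed by restarting the rgw instances that run the sync threads?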

I have opened an issue in the Issue Tracker:
https://tracker.ceph.com/issues/57562.
More info about our clusters is attached to the issue, including:
- ceph.conf of rgws
- ceph config dump
- ceph versions output
- sync status of cluster, an in-sync bucket, a not-in-sync bucket, and some
behind shards
- bucket list and bucket stats of a not-in-sync bucket and stat of a
not-in-sync object

Thanks,
Jane


