rgw: bucket granularity sync

Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx> · Tue, 3 Dec 2019 16:13:04 -0800

The following PR implements bucket granularity sync. We are aiming
to land this feature in time for Octopus.

https://github.com/ceph/ceph/pull/31686

Bucket granularity sync provides fine-grained control of data movement
between buckets in different zones. It extends the existing zone sync
mechanism. At its core, the feature modifies the way the rgw sync
process treats buckets. Previously buckets were treated
symmetrically, that is, each (data) zone held a mirror of a bucket
that was expected to be the same as in all the other zones. Now it
is possible for buckets to diverge, and a bucket can pull data from
other buckets (ones that don't share its name or its ID) in different
zones.
The sync process therefore used to assume that the bucket sync source
and the bucket sync destination always referred to the same bucket;
that is no longer the case.

A new sync policy that can supersede the old zonegroup coarse
configuration (sync_from*) was implemented. The sync policy can be
configured at the zonegroup level (and if it is configured it replaces
the old style config), but it can also be configured at the bucket
level.

In the new sync policy we can define multiple groups, each of which
can contain a list of data-flow configurations and a list of pipe
configurations.
A data flow defines the flow of data between the different zones. It
can be symmetrical, in which multiple zones sync data from each
other, or directional, in which the data moves one way from one zone
to another.
A pipe defines the actual buckets that can use these data flows, and
the properties that are associated with them (for example: source
object prefix).
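
To make the two flow types concrete, here is a minimal Python sketch
of how a flow could be expanded into (source zone, destination zone)
pairs. It is only an illustration of the description above, not rgw
code: the function name flow_pairs and the dict layout are made up
for this example, and the zone names are borrowed from the usage
examples later in this mail.

```python
from itertools import permutations


def flow_pairs(flow):
    """Expand one flow description into a set of (source, dest) zone pairs."""
    if flow["type"] == "symmetrical":
        # Every listed zone syncs from every other listed zone.
        return set(permutations(flow["zones"], 2))
    if flow["type"] == "directional":
        # Data moves one way only.
        return {(flow["source_zone"], flow["dest_zone"])}
    raise ValueError(f"unknown flow type: {flow['type']!r}")


# Hypothetical in-memory form of two flows (layout is an assumption).
flows = [
    {"type": "symmetrical", "zones": ["us-east", "us-west"]},
    {"type": "directional", "source_zone": "us-west", "dest_zone": "us-west-2"},
]

allowed = set().union(*(flow_pairs(f) for f in flows))
for source, dest in sorted(allowed):
    print(f"{source} -> {dest}")
```

With these two flows, us-west-2 appears only as a destination, which
is what a one-way backup zone looks like.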

A sync policy group can be in one of 3 states:

enabled: sync is allowed and enabled
allowed: sync is allowed
forbidden: sync (as defined by this group) is not allowed; this state
can override other groups.
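
One way to read the three states is as a small precedence rule. The
Python sketch below is only an illustration of that reading, not what
rgw actually does internally: the function name effective_status, and
the assumption that a single forbidden group wins over everything
else while enabled wins over allowed, are mine.

```python
def effective_status(statuses):
    """Combine the statuses of all groups that cover the same pipe.

    Assumed precedence (simplified): forbidden > enabled > allowed.
    Returns None when no group covers the pipe at all.
    """
    if "forbidden" in statuses:
        return "forbidden"
    if "enabled" in statuses:
        return "enabled"
    if "allowed" in statuses:
        return "allowed"
    return None


def sync_runs(statuses):
    """In this model data only moves when the combined status is 'enabled'."""
    return effective_status(statuses) == "enabled"


# Zonegroup group only allows, bucket group enables: sync runs.
print(sync_runs(["allowed", "enabled"]))
# Nothing enables the pipe: sync is permitted but does not run.
print(sync_runs(["allowed"]))
# Any forbidden group blocks it.
print(sync_runs(["enabled", "forbidden"]))
```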

A policy can be defined at the bucket level. A bucket level sync
policy inherits the data flow of the zonegroup policy, and can only
define a subset of what the zonegroup allows.

A wildcard zone or a wildcard bucket parameter in the policy matches
all relevant zones or all relevant buckets. In the context of a
bucket policy, a wildcard bucket means the current bucket instance.
A disaster recovery configuration where entire zones are mirrored
doesn't require configuring anything on the buckets. However, for
fine-grained bucket sync it is better to allow the pipes
(status=allowed) at the zonegroup level (e.g., using wildcards), and
to enable the specific sync only at the bucket level
(status=enabled). If needed, the policy at the bucket level can limit
the data movement to specific relevant zones.

Any changes to the zonegroup policy need to be applied on the
zonegroup master zone, and require a period update and commit.
Changes to the bucket policy also need to be applied on the zonegroup
master zone, but do not require a period update. The changes are
handled dynamically by rgw.

New radosgw-admin commands to control this feature were added:

sync policy get
sync group <create | modify | get | remove>
sync group flow <create | remove>
sync group pipe <create | remove>
sync info

Most are self-explanatory. The notable one is sync info, which
provides info about the expected sources and targets of the sync
process at the current zone (or at another, effective zone), either
at the zone level or at the bucket level.

Since a bucket can now define a policy that moves data from it
towards a different bucket in a different zone, when the policy is
created we also generate a list of bucket dependencies that are used
as hints when a sync of any particular bucket happens. The fact that a
bucket references another bucket doesn't mean it actually syncs
to/from it, as the data flow might not permit it.

Bucket sync can also be limited to specific source object prefixes.

The S3 bucket replication API has also been implemented, and allows
users to create replication rules between different buckets. Note
though that while the AWS replication feature allows bucket
replication within the same zone, rgw does not allow it at the moment.
However, the rgw API also adds a new 'Zone' array that allows users
to select which zones the specific bucket will be synced to.

Following are some usage examples:

The system in these examples includes 3 zones: us-east (the master
zone), us-west, us-west-2.

*  Example 1: Two zones, complete mirror:

This is similar to current sync capabilities, but is done via the
new sync policy engine. Note that changes to the zonegroup sync policy
require a period update and commit.

[us-east] $ radosgw-admin sync group create --group-id=group1 --status=allowed
[us-east] $ radosgw-admin sync group flow create --group-id=group1 \
                          --flow-id=flow-mirror --flow-type=symmetrical \
                          --zones=us-east,us-west
[us-east] $ radosgw-admin sync group pipe create --group-id=group1 \
                          --pipe-id=pipe1 --source-zones='*' \
                          --source-bucket='*' --dest-zones='*' \
                          --dest-bucket='*'
[us-east] $ radosgw-admin sync group modify --group-id=group1 --status=enabled
[us-east] $ radosgw-admin period update --commit

[us-east] $ radosgw-admin sync info --bucket=buck
{
    "sources": [
        {
            "id": "pipe1",
            "source": {
                "zone": "us-west",
                "bucket": "buck:115b12b3-....4409.1"
            },
            "dest": {
                "zone": "us-east",
                "bucket": "buck:115b12b3-....4409.1"
            },
            "params": {
...
            }
        }
    ],
    "dests": [
        {
            "id": "pipe1",
            "source": {
                "zone": "us-east",
                "bucket": "buck:115b12b3-....4409.1"
            },
            "dest": {
                "zone": "us-west",
                "bucket": "buck:115b12b3-....4409.1"
            },
           ...
        }
    ],
    ...
}

Note that the "id" field in the output above reflects the pipe rule
that generated that entry. A single rule can generate multiple sync
entries, as can be seen in the example.
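
To illustrate how one pipe rule turns into several entries, here is a
rough Python model (again, not the rgw implementation). It takes the
zone pairs permitted by the symmetrical flow of this example and the
wildcard pipe, and derives the sources and dests as seen from one
zone. The function expected_sync and the dict layout are invented for
the illustration, and buckets are left out to keep it short.

```python
def zone_matches(selector, zone):
    """A selector is either the wildcard '*' or a list of zone names."""
    return selector == "*" or zone in selector


def expected_sync(zone, flow_pairs, pipe):
    """Return the sync entries a single pipe generates for `zone`."""
    entries = [
        {"id": pipe["id"], "source": src, "dest": dst}
        for src, dst in sorted(flow_pairs)
        if zone_matches(pipe["source_zones"], src)
        and zone_matches(pipe["dest_zones"], dst)
    ]
    return {
        # Entries that bring data into this zone.
        "sources": [e for e in entries if e["dest"] == zone],
        # Entries that take data out of this zone.
        "dests": [e for e in entries if e["source"] == zone],
    }


# Symmetrical flow between us-east and us-west, as in Example 1.
pairs = {("us-east", "us-west"), ("us-west", "us-east")}
pipe1 = {"id": "pipe1", "source_zones": "*", "dest_zones": "*"}

info = expected_sync("us-east", pairs, pipe1)
print(info["sources"])
print(info["dests"])
```

Both the single source and the single dest carry the id pipe1, which
mirrors the shape of the sync info output above.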

[us-west] $ radosgw-admin sync info --bucket=buck
{
    "sources": [
        {
            "id": "pipe1",
            "source": {
                "zone": "us-east",
                "bucket": "buck:115b12b3-....4409.1"
            },
            "dest": {
                "zone": "us-west",
                "bucket": "buck:115b12b3-....4409.1"
            },
            ...
        }
    ],
    "dests": [
        {
            "id": "pipe1",
            "source": {
                "zone": "us-west",
                "bucket": "buck:115b12b3-....4409.1"
            },
            "dest": {
                "zone": "us-east",
                "bucket": "buck:115b12b3-....4409.1"
            },
           ...
        }
    ],
    ...
}

*  Example 2: Directional entire zone backup

Also similar to current sync capabilities. Here we add a third zone,
us-west-2, that will be a replica of us-west, but data will not be
replicated back from it.

[us-east] $ radosgw-admin sync group flow create --group-id=group1 \
                          --flow-id=us-west-backup --flow-type=directional \
                          --source-zone=us-west --dest-zone=us-west-2
[us-east] $ radosgw-admin period update --commit

Note that us-west has two dests:

[us-west] $ radosgw-admin sync info --bucket=buck
{
    "sources": [
        {
            "id": "pipe1",
            "source": {
                "zone": "us-east",
                "bucket": "buck:115b12b3-....4409.1"
            },
            "dest": {
                "zone": "us-west",
                "bucket": "buck:115b12b3-....4409.1"
            },
           ...
        }
    ],
    "dests": [
        {
            "id": "pipe1",
            "source": {
                "zone": "us-west",
                "bucket": "buck:115b12b3-....4409.1"
            },
            "dest": {
                "zone": "us-east",
                "bucket": "buck:115b12b3-....4409.1"
            },
           ...
        },
        {
            "id": "pipe1",
            "source": {
                "zone": "us-west",
                "bucket": "buck:115b12b3-....4409.1"
            },
            "dest": {
                "zone": "us-west-2",
                "bucket": "buck:115b12b3-....4409.1"
            },
           ...
        }
    ],
    ...
}

Whereas us-west-2 has only a source and no destinations:

[us-west-2] $ radosgw-admin sync info --bucket=buck
{
    "sources": [
        {
            "id": "pipe1",
            "source": {
                "zone": "us-west",
                "bucket": "buck:115b12b3-....4409.1"
            },
            "dest": {
                "zone": "us-west-2",
                "bucket": "buck:115b12b3-....4409.1"
            },
           ...
        }
    ],
    "dests": [],
    ...
}

*  Example 3: Mirror a specific bucket

We use the same group configuration, but this time switch it to the
'allowed' state, which means that sync is allowed but not enabled.

[us-east] $ radosgw-admin sync group modify --group-id=group1 --status=allowed
[us-east] $ radosgw-admin period update --commit

Next we create a bucket-level policy rule for the existing bucket
buck2. Note that the bucket needs to exist before this policy can be
set, and admin commands that modify bucket policies need to run on
the master zone; however, they do not require a period update.
There is no need to change the data flow, as it is inherited from the
zonegroup policy. A bucket policy flow will only be a subset of the
flow defined in the zonegroup policy. The same goes for pipes,
although a bucket policy can enable pipes that are not enabled
(albeit not forbidden) in the zonegroup policy.

[us-east] $ radosgw-admin sync group create --bucket=buck2 \
                          --group-id=buck2-default --status=enabled

[us-east] $ radosgw-admin sync group pipe create --bucket=buck2 \
                          --group-id=buck2-default --pipe-id=pipe1 \
                          --source-zones='*' --dest-zones='*'

*  Example 4: Limit bucket sync to specific zones:

This will only sync buck3 to us-east (from any zone that the flow
allows to sync into us-east).

[us-east] $ radosgw-admin sync group create --bucket=buck3 \
                          --group-id=buck3-default --status=enabled

[us-east] $ radosgw-admin sync group pipe create --bucket=buck3 \
                          --group-id=buck3-default --pipe-id=pipe1 \
                          --source-zones='*' --dest-zones=us-east

*  Example 5: sync from a different bucket

Note that bucket sync only works (currently) across zones and not
within the same zone.

Set buck4 to pull data from buck5:

[us-east] $ radosgw-admin sync group create --bucket=buck4 \
                          --group-id=buck4-default --status=enabled

[us-east] $ radosgw-admin sync group pipe create --bucket=buck4 \
                          --group-id=buck4-default --pipe-id=pipe1 \
                          --source-zones='*' --source-bucket=buck5 \
                          --dest-zones='*'

We can also limit it to specific zones. For example, the following
will only sync data that originated in us-west:

[us-east] $ radosgw-admin sync group pipe modify --bucket=buck4 \
                          --group-id=buck4-default --pipe-id=pipe1 \
                          --source-zones=us-west --source-bucket=buck5 \
                          --dest-zones='*'

Checking the sync info for buck5 on us-west is interesting:

[us-west] $ radosgw-admin sync info --bucket=buck5
{
    "sources": [],
    "dests": [],
    "hints": {
        "sources": [],
        "dests": [
            "buck4:115b12b3-....14433.2"
        ]
    },
    "resolved-hints-1": {
        "sources": [],
        "dests": [
            {
                "id": "pipe1",
                "source": {
                    "zone": "us-west",
                    "bucket": "buck5"
                },
                "dest": {
                    "zone": "us-east",
                    "bucket": "buck4:115b12b3-....14433.2"
                },
                ...
            },
            {
                "id": "pipe1",
                "source": {
                    "zone": "us-west",
                    "bucket": "buck5"
                },
                "dest": {
                    "zone": "us-west-2",
                    "bucket": "buck4:115b12b3-....14433.2"
                },
                ...
            }
        ]
    },
    "resolved-hints": {
        "sources": [],
        "dests": []
    }
}

Note that there are resolved hints, which means that the bucket buck5
found out about buck4 syncing from it indirectly, and not from its own
policy (the policy for buck5 itself is empty).
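
When scripting around this output it can help to flatten it into one
line per entry. The Python sketch below is only a suggestion: it
reads the keys shown in the output above ("sources", "dests",
"resolved-hints-1"), the helper names describe and summarize are
hypothetical, and the sample input is a trimmed copy of the output
with the elided "..." members dropped.

```python
import json

# Trimmed copy of the `sync info --bucket=buck5` output shown above.
raw = """
{
  "sources": [],
  "dests": [],
  "hints": {"sources": [], "dests": ["buck4:115b12b3-....14433.2"]},
  "resolved-hints-1": {
    "sources": [],
    "dests": [
      {"id": "pipe1",
       "source": {"zone": "us-west", "bucket": "buck5"},
       "dest": {"zone": "us-east", "bucket": "buck4:115b12b3-....14433.2"}},
      {"id": "pipe1",
       "source": {"zone": "us-west", "bucket": "buck5"},
       "dest": {"zone": "us-west-2", "bucket": "buck4:115b12b3-....14433.2"}}
    ]
  },
  "resolved-hints": {"sources": [], "dests": []}
}
"""


def describe(entry):
    """Render one sync entry as 'pipe: zone/bucket -> zone/bucket'."""
    src, dst = entry["source"], entry["dest"]
    return f'{entry["id"]}: {src["zone"]}/{src["bucket"]} -> {dst["zone"]}/{dst["bucket"]}'


def summarize(info):
    """Split entries into those from the bucket's own policy and those found via hints."""
    direct = info.get("sources", []) + info.get("dests", [])
    hinted = info.get("resolved-hints-1", {})
    indirect = hinted.get("sources", []) + hinted.get("dests", [])
    return {
        "direct": [describe(e) for e in direct],
        "indirect": [describe(e) for e in indirect],
    }


summary = summarize(json.loads(raw))
for kind, lines in summary.items():
    print(kind, lines)
```

For buck5 the "direct" list is empty and both entries land under
"indirect", matching the note above.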

*  Example 6: Sync to different bucket

The same mechanism can work for configuring data to be synced to (vs.
synced from as in the previous example). Note that internally data is
still pulled from the source at the destination zone:

Set buck6 to "push" data to buck5:

[us-east] $ radosgw-admin sync group create --bucket=buck6 \
                          --group-id=buck6-default --status=enabled

[us-east] $ radosgw-admin sync group pipe create --bucket=buck6 \
                          --group-id=buck6-default --pipe-id=pipe1 \
                          --source-zones='*' --source-bucket='*' \
                          --dest-zones='*' --dest-bucket=buck5

A wildcard bucket name means the current bucket in the context of
bucket sync policy.

Combined with the configuration in Example 5, we can now write data to
buck6 on us-east; the data will sync to buck5 on us-west, and from
there it will be distributed to buck4 on us-east and on us-west-2.
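
The chain described above can be followed with a small graph walk.
This Python sketch is just a way to visualize it: the edges are copied
by hand from the paragraph above rather than derived from any policy,
and the function name reachable is made up.

```python
from collections import deque

# (zone, bucket) -> list of (zone, bucket) it syncs to, per the text above.
edges = {
    ("us-east", "buck6"): [("us-west", "buck5")],
    ("us-west", "buck5"): [("us-east", "buck4"), ("us-west-2", "buck4")],
}


def reachable(start, edges):
    """Breadth-first walk listing every (zone, bucket) that data can reach."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


for zone, bucket in reachable(("us-east", "buck6"), edges):
    print(f"{bucket} on {zone}")
```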

*  Example 7: source filters

Sync from buck8 to buck9, but only objects that start with 'foo/':

[us-east] $ radosgw-admin sync group create --bucket=buck8 \
                          --group-id=buck8-default --status=enabled

[us-east] $ radosgw-admin sync group pipe create --bucket=buck8 \
                          --group-id=buck8-default --pipe-id=pipe-prefix \
                          --prefix=foo/ --source-zones='*' --dest-zones='*' \
                          --dest-bucket=buck9

Also sync from buck8 to buck9 any object that has the tags color=blue
or color=red:

[us-east] $ radosgw-admin sync group pipe create --bucket=buck8 \
                          --group-id=buck8-default --pipe-id=pipe-tags \
                          --tags-add=color=blue,color=red --source-zones='*' \
                          --dest-zones='*' --dest-bucket=buck9

And we can check the expected sync in us-east (for example):

[us-east] $ radosgw-admin sync info --bucket=buck8
{
    "sources": [],
    "dests": [
        {
            "id": "pipe-prefix",
            "source": {
                "zone": "us-east",
                "bucket": "buck8:115b12b3-....14433.5"
            },
            "dest": {
                "zone": "us-west",
                "bucket": "buck9"
            },
            "params": {
                "source": {
                    "filter": {
                        "prefix": "foo/",
                        "tags": []
                    }
                },
                ...
            }
        },
        {
            "id": "pipe-tags",
            "source": {
                "zone": "us-east",
                "bucket": "buck8:115b12b3-....14433.5"
            },
            "dest": {
                "zone": "us-west",
                "bucket": "buck9"
            },
            "params": {
                "source": {
                    "filter": {
                        "tags": [
                            {
                                "key": "color",
                                "value": "blue"
                            },
                            {
                                "key": "color",
                                "value": "red"
                            }
                        ]
                    }
                },
                ...
            }
        }
    ],
    ...
}

Note that there aren't any sources, only two different destinations
(one for each configuration). When the sync process happens it will
select the relevant rule for each object it syncs.

Prefixes and tags can be combined, in which case an object needs to
have both in order to be synced. The priority param can also be
passed; when multiple different rules match (and have the same source
and destination), it is used to determine which of the rules is used.
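
As an illustration of the matching described in this example, here is
a small Python sketch. It is a simplified model and not the rgw
filter code: the names rule_matches and matching_rules are invented,
a rule's tags are treated as "any one of these" (following the "blue
or red" wording above), and priority is deliberately left out since
its ordering is not spelled out here.

```python
def rule_matches(key, tags, rule):
    """Check one object against a rule's prefix and tag filters.

    `tags` is the object's tag dict; `rule["tags"]` is a list of
    (key, value) pairs, any one of which is enough (an assumption).
    """
    prefix_ok = key.startswith(rule.get("prefix", ""))
    wanted = rule.get("tags", [])
    tags_ok = not wanted or any(tags.get(k) == v for k, v in wanted)
    # When both filters are set, the object needs to pass both.
    return prefix_ok and tags_ok


def matching_rules(key, tags, rules):
    """Return the ids of all rules that would select this object."""
    return [r["id"] for r in rules if rule_matches(key, tags, r)]


rules = [
    {"id": "pipe-prefix", "prefix": "foo/"},
    {"id": "pipe-tags", "tags": [("color", "blue"), ("color", "red")]},
]

print(matching_rules("foo/a.txt", {}, rules))
print(matching_rules("bar/b.txt", {"color": "red"}, rules))
print(matching_rules("bar/c.txt", {"color": "green"}, rules))
```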

*  Example 8: destination params: storage class

Storage class of the destination objects can be configured:

[us-east] $ radosgw-admin sync group create --bucket=buck10 \
                          --group-id=buck10-default --status=enabled

[us-east] $ radosgw-admin sync group pipe create --bucket=buck10 \
                          --group-id=buck10-default \
                          --pipe-id=pipe-storage-class \
                          --source-zones='*' --dest-zones=us-west-2 \
                          --storage-class=CHEAP_AND_SLOW

*  Example 9: destination params: destination owner translation

Set the owner of the destination objects to the destination bucket
owner. This requires specifying the uid of the destination bucket
owner:

[us-east] $ radosgw-admin sync group create --bucket=buck11 \
                          --group-id=buck11-default --status=enabled

[us-east] $ radosgw-admin sync group pipe create --bucket=buck11 \
                          --group-id=buck11-default --pipe-id=pipe-dest-owner \
                          --source-zones='*' --dest-zones='*' \
                          --dest-bucket=buck12 --dest-owner=joe

*  Example 10: destination params: user mode

User mode makes sure that the user has permissions both to read the
objects and to write to the destination bucket. This requires that
the uid of the user (in whose context the operation executes) is
specified.

[us-east] $ radosgw-admin sync group pipe modify --bucket=buck11 \
                          --group-id=buck11-default --pipe-id=pipe-dest-owner \
                          --mode=user --uid=jenny

Please let me know if you have any questions. This might be tweaked a
little bit, and there are a couple of additions that I would like to
make, but at the moment that's where things stand.

Yehuda