Re: [EXTERNAL] Re: Bucket Notifications v2 & Multisite Redundancy

Thanks 🙂

I've raised:
Bug #68102: rgw: "radosgw-admin topic list" may contain duplicated data and redundant nesting - rgw - Ceph <https://tracker.ceph.com/issues/68102?next_issue_id=984>
Enhancement #68104: rgw: Add a "disable replication" flag to bucket notification configuration - rgw - Ceph <https://tracker.ceph.com/issues/68104>

Also re-including the mailing list as it was dropped.

________________________________
From: Yuval Lifshitz <ylifshit@xxxxxxxxxx>
Sent: Tuesday, September 17, 2024 10:36 AM
To: Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx>
Subject: Re: [EXTERNAL] Re:  Bucket Notifications v2 & Multisite Redundancy

ok, got it.
not even sure we support attribute filtering for these types of events... i think you have to go with v1 for now.

would be great if you also submit a tracker for the v1 topic list

On Tue, Sep 17, 2024 at 12:24 PM Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx> wrote:
Indeed, I am using the s3:Replication:Create event (which I think is equivalent to s3:ObjectSynced:Create). But this does not solve the problem of the event being added to both topics on the site that received the replication.

________________________________
From: Yuval Lifshitz <ylifshit@xxxxxxxxxx>
Sent: Tuesday, September 17, 2024 10:21 AM
To: Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx>
Subject: Re: [EXTERNAL] Re:  Bucket Notifications v2 & Multisite Redundancy

if you want to get a notification when an object is synced, you should use a different type of notification. the fact that an object is uploaded to siteB does not mean it is immediately synced to siteA.
would recommend using s3:ObjectSynced:Create event type in this case. you will get this event only when an object is synced.
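
On the consuming side, an application that only cares about synced objects could also check the event name of each record it receives. The following is a minimal sketch, assuming the payload follows the S3-style layout (Records, eventName, s3.object.key) and that the event name may arrive with or without the "s3:" prefix; the function names are hypothetical and not part of any Ceph API.

```python
import json

# Assumption: both spellings mentioned in the thread are treated as the
# same kind of event, with or without a leading "s3:".
SYNC_EVENTS = {"ObjectSynced:Create", "Replication:Create"}


def normalise(event_name):
    """Drop an optional leading "s3:" from an event name."""
    return event_name[3:] if event_name.startswith("s3:") else event_name


def synced_keys(payload):
    """Return the object keys of sync-create records in a JSON payload."""
    body = json.loads(payload)
    keys = []
    for record in body.get("Records", []):
        if normalise(record.get("eventName", "")) in SYNC_EVENTS:
            keys.append(record["s3"]["object"]["key"])
    return keys


# Illustrative payload, not captured from a real cluster.
example = json.dumps({
    "Records": [
        {"eventName": "ObjectCreated:Put", "s3": {"object": {"key": "a.txt"}}},
        {"eventName": "ObjectSynced:Create", "s3": {"object": {"key": "b.txt"}}},
        {"eventName": "s3:Replication:Create", "s3": {"object": {"key": "c.txt"}}},
    ]
})
print(synced_keys(example))  # ['b.txt', 'c.txt']
```

This only narrows what the application acts on; it does not change which topics RGW tries to deliver to.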

On Tue, Sep 17, 2024 at 12:05 PM Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx> wrote:
Using the sites sounds sensible, and probably better than my suggestion of using the application deployment name, as they are not subject to change (and we genuinely do use siteA and siteB exclusively as the zone names).

I want the siteA application to be notified of replication changes synced across from PUTs made on siteB, so I need the flip of what you are suggesting, i.e. create notification on siteA with filter "x-amz-metadata-site" == "siteB".

That's fine I think (and doesn't really complicate the scenario), but it requires the application to be aware of the zone it is in, which isn't information it has without some plumbing (and doesn't seem like something an application would typically be aware of?). That's on top of the plumbing to add the metadata header into every place where I do an S3 PUT.

I still think I prefer using v1 and attempting to contribute an enhancement here to set a flag to disable multisite on a per notification basis, given the above; assuming you agree that this is a sensible enhancement.

________________________________
From: Yuval Lifshitz <ylifshit@xxxxxxxxxx>
Sent: Monday, September 16, 2024 6:15 PM
To: Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx>
Subject: Re: [EXTERNAL] Re:  Bucket Notifications v2 & Multisite Redundancy

regarding the filter. i don't really follow.
on siteA create a notification (with id "notifA") with filter "x-amz-metadata-site" == "siteA" that points to topicA (reachable in siteA)
and on siteB create a notification (with id "notifB") with filter "x-amz-metadata-site" == "siteB" that points to topicB (reachable in siteB)

assuming we are in v2 and both notifications exist on the bucket.
when app is uploading an object to a bucket in siteA, it should set "siteA" in the "x-amz-metadata-site" attribute.
notifA will match the filter and send the notification to topicA, while it won't match the filter of notifB.
when uploading an object to that bucket in siteB the opposite will happen.
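
The matching described above can be modelled in a few lines. This is a minimal sketch, assuming simple equality rules on object metadata; the dict layout is illustrative rather than the RGW wire format, and the attribute name is kept exactly as written in this thread (real S3 user-metadata headers normally use the "x-amz-meta-" prefix).

```python
# Hypothetical model of the two notifications; both exist on the bucket
# in both sites because v2 replicates them.
NOTIFICATIONS = [
    {"id": "notifA", "topic": "topicA",
     "metadata_filter": {"x-amz-metadata-site": "siteA"}},
    {"id": "notifB", "topic": "topicB",
     "metadata_filter": {"x-amz-metadata-site": "siteB"}},
]


def matching_topics(object_metadata, notifications=NOTIFICATIONS):
    """Return the topics whose filter rules all match the object's metadata."""
    topics = []
    for notif in notifications:
        rules = notif["metadata_filter"]
        if all(object_metadata.get(name) == value for name, value in rules.items()):
            topics.append(notif["topic"])
    return topics


print(matching_topics({"x-amz-metadata-site": "siteA"}))  # ['topicA']
print(matching_topics({"x-amz-metadata-site": "siteB"}))  # ['topicB']
print(matching_topics({}))                                # []
```

An object uploaded without the attribute matches neither notification in this model, so every PUT path in the application would need to set it.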


On Mon, Sep 16, 2024 at 7:14 PM Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx> wrote:
This is really an application level problem, but it's not trivial for me to determine the name of the remote site. So while I could add a metadata header and include the site name, then use "x-amz-metadata-site" == remote_site_name as my filter, it would be much more practical to say "x-amz-metadata-site" != local_site_name, if that were possible. I suspect that makes it firmly not your problem 🙂 but it does push me towards v1.
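
If the full set of site names were available to the application (an assumption; the point above is that obtaining it needs extra plumbing), the negative match could be rewritten as positive equality rules. A minimal sketch with hypothetical helper names, assuming each resulting rule is placed in its own notification so that any one of them can match:

```python
def positive_site_values(all_sites, local_site):
    """Rewrite "site != local_site" as the sorted list of sites that should match."""
    if local_site not in all_sites:
        raise ValueError(f"unknown local site: {local_site}")
    return sorted(site for site in all_sites if site != local_site)


def rules_for(all_sites, local_site, attribute="x-amz-metadata-site"):
    """Build one equality rule per remote site (illustrative Name/Value layout)."""
    return [
        {"Name": attribute, "Value": site}
        for site in positive_site_values(all_sites, local_site)
    ]


print(rules_for({"siteA", "siteB"}, "siteA"))
```

With only two sites this collapses to a single rule naming the other site, which is the case discussed in this thread.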

That's what I assumed re the notifications. However, I did delete all the notifications and topics before disabling the feature and still ended up with the nesting shown below in the "radosgw-admin topic list" command output.

I also reproduced this by installing a fresh cluster at v19.1.1, immediately disabling notifications_v2 and then creating topics, although without the duplication this time.

$ radosgw-admin topic list
{
    "topics": [
        {
            "topics": [
                {
                    "owner": "zone.user",
                    "name": "ahk2",
                    "dest": {
                        "push_endpoint": "https://1.2.3.4:1234/",
                        "push_endpoint_args": "verify-ssl=false",
                        "push_endpoint_topic": "ahk2",
                        "stored_secret": false,
                        "persistent": true,
                        "persistent_queue": ":ahk2",
                        "time_to_live": "None",
                        "max_retries": "None",
                        "retry_sleep_duration": "10"
                    },
                    "arn": "arn:aws:sns:geored_zg::ahk2",
                    "opaqueData": "",
                    "policy": ""
                }
            ]
        }
    ]
}

So maybe a bug in Squid?
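
As a stopgap for tooling that parses this output, the extra wrapper levels and repeated entries could be collapsed on the client side. A minimal sketch, assuming wrappers are dicts that contain only a "topics" list and that (owner, name) identifies a topic; the function name is hypothetical.

```python
import json


def flatten_topics(listing):
    """Collect topic entries from nested {"topics": [...]} wrappers,
    dropping duplicates by (owner, name) and keeping first-seen order."""
    seen = set()
    result = []

    def walk(node):
        for item in node.get("topics", []):
            if not isinstance(item, dict):
                continue
            if "topics" in item and "name" not in item:
                walk(item)  # another wrapper level, descend into it
            else:
                key = (item.get("owner"), item.get("name"))
                if key not in seen:
                    seen.add(key)
                    result.append(item)

    walk(listing)
    return result


# Illustrative input shaped like the nested output above, with one duplicate.
raw = json.loads(
    '{"topics": ['
    '{"topics": [{"owner": "zone.user", "name": "ahk2"}]},'
    '{"topics": [{"owner": "zone.user", "name": "ahk2"}]}'
    ']}'
)
print([topic["name"] for topic in flatten_topics(raw)])  # ['ahk2']
```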

________________________________
From: Yuval Lifshitz <ylifshit@xxxxxxxxxx>
Sent: Monday, September 16, 2024 4:42 PM
To: Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx>
Cc: ceph-users <ceph-users@xxxxxxx>
Subject: Re: [EXTERNAL] Re:  Bucket Notifications v2 & Multisite Redundancy

inline

On Mon, Sep 16, 2024 at 5:13 PM Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx> wrote:
Yes - that's correct. Thanks for the suggestions.

I think the metadata suggestion probably does work, but it doesn't come easily for me, as it does not seem possible to do a negative match filter; what I would really like to do is set up a filter for objects where "x-amz-metadata-site" != "local-site-name". That seems to be an AWS S3 limitation.


why do you need a negative match? the filter determines what is being notified not what to filter out.


Likely leaning to disabling notifications_v2 for now and submitting a tracker for adding the flag as you suggested. I tried that but got some internal error from the tracker system, so will try again later. I'm keen to have a go at implementing as it will give us a path forward when v1 is removed.

I'm curious if it's supported to go from notifications_v2 to v1?

while we automatically migrate topics and notifications from v1 to v2, we do not do the opposite.
if you had something on v2 and want to disable, best option is to delete all topics and notifications, disable v2 and define them again.

Suspect it might not be and the bottom is an artifact of that. I tested disabling notifications_v2 and seem to have confused the "radosgw-admin topic list" command, which now returns some heavily nested JSON which looks a bit odd. Functionally might still be OK. I'm on the 19.1.1 RC, so not a stable release. I've replaced the actual topic data with "..." below, but it appears I am listing 4 copies of the same 2 topics. e.g.:

$ radosgw-admin topic list
{
  "topics": [
    {
      "topics": [ ..., ... ],
    },
  "topics": [
    {
      "topics": [ ..., ... ],
    },
  "topics": [
    {
      "topics": [ ..., ... ],
    },
  "topics": [
    {
      "topics": [ ..., ... ],
    }
  ]
}

Thanks,
Alex

________________________________
From: Yuval Lifshitz <ylifshit@xxxxxxxxxx>
Sent: Monday, September 16, 2024 1:25 PM
To: Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx>
Cc: ceph-users <ceph-users@xxxxxxx>
Subject: Re: [EXTERNAL] Re:  Bucket Notifications v2 & Multisite Redundancy

just to see that i got it right: you are asking to disable the topic and notification replication, as you want to send to different topics based on the zone that got the update to the bucket?

* one option is to disable "notifications_v2" on the zonegroup. but, this is probably not a good idea - there are other benefits to v2, and the v1 format will be deprecated eventually.
* if you can control the client apps on each site, maybe you can add a metadata attribute (e.g. x-amz-...) to the requests (or maybe you already have one?) and filter the notifications based on that attribute?

* please consider submitting a tracker for adding a "replication" flag to the notification configuration. so we can disable the replication (without disabling v2 notifications) on a granular level




On Mon, Sep 16, 2024 at 1:18 PM Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx> wrote:
Following up on this, I've run into another issue during my prototyping. I have two Ceph clusters, with a zone each, sharing a zonegroup and realm. Each has a local application that needs to be informed of replication changes.

So I've created a topic and notification per site. Given the notification_v2 feature, these are replicated and both sites are aware of both topics and both notifications. I'm testing by writing to the bucket on the "siteA" Cluster, and I can see Replication:Create events pushed to both topics on the "siteB" Cluster.

However, therein lies my problem: I don't want the Ceph cluster on siteB to attempt to notify siteA of the replication (in fact it can't route to it in the network at all, so it fails, and a queue builds up, which will eventually block subsequent writes if not dealt with). I think this is new in the notification_v2 feature, as a side effect of the notifications being multisite.

I had a look into filtering rules to see if there was a way to say filter by events in a specific zone, but didn't find anything. Is there a solution to this? Any suggestions very welcome 🙂

Kind regards,
Alex

________________________________
From: Yuval Lifshitz <ylifshit@xxxxxxxxxx>
Sent: Tuesday, September 3, 2024 2:56 PM
To: Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx>
Cc: ceph-users <ceph-users@xxxxxxx>
Subject: Re: [EXTERNAL] Re:  Bucket Notifications v2 & Multisite Redundancy

responded inline

On Tue, Sep 3, 2024 at 3:05 PM Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx> wrote:
Hi Yuval,

Thanks for the response. I did manage to disable the feature; however, I hope you can understand my hesitancy to design our move away from pubsub onto a deprecated replacement (i.e. "notifications v1").


> agree. notification v2 provides better observability, e.g. list of buckets that use topic etc.

The difference between this and the other operations that require forwarding to the master site, for me, is that I don't rely on any of those operations for day-to-day function. For example, S3 user creation I do at deployment time and never again; so if I lose the ability to do that in the event of a site failure that's OK. Perhaps I just need to implement a process to ensure the remaining zone is quickly promoted on a site failure.


> this is probably the correct approach in multisite

A further concern of the new feature... My intention is to use the bucket notification feature to send events generated by the multisite sync to my application server. If the topic is now multisite, will all kicks be duplicated? For example consider:

SiteA receives an S3 write for object
SiteA writes to the event queue and asynchronously delivers the event
SiteB syncs object from siteA

we do not send regular notifications upon sync.
note that we have other types of notifications: s3:ObjectSynced:*/s3:Replication:* that are sent only when an object is synced

SiteB writes to the event queue and asynchronously delivers the event

Really I only want the second event in this example, which is what I get with the pubsub module, and I think is what I will get with "notifications_v1".

Best Wishes,
Alex
________________________________
From: Yuval Lifshitz <ylifshit@xxxxxxxxxx>
Sent: Tuesday, September 3, 2024 12:15 PM
To: Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx>
Cc: ceph-users <ceph-users@xxxxxxx>
Subject: [EXTERNAL] Re:  Bucket Notifications v2 & Multisite Redundancy

Hi Alex,
It should be possible to disable the v2 feature through an admin command.
radosgw-admin zonegroup modify --disable-feature=notification_v2

Also note that when creating a new squid cluster, v2 is enabled by default. But, when upgrading an existing cluster, you need to call:
radosgw-admin zonegroup modify --enable-feature=notification_v2
to enable v2.
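
For deployments that script this step, the command line can be assembled programmatically. A minimal sketch that only builds and prints the argument list; the helper name is hypothetical, and the optional --rgw-zonegroup selector is an assumption that is not mentioned in this thread. In a multisite setup the change may also need to be committed with "radosgw-admin period update --commit", which is likewise an assumption here.

```python
import shlex

FEATURE = "notification_v2"


def zonegroup_feature_cmd(enable, zonegroup=None):
    """Build the radosgw-admin argv that toggles the notification_v2 feature."""
    flag = "--enable-feature" if enable else "--disable-feature"
    argv = ["radosgw-admin", "zonegroup", "modify"]
    if zonegroup is not None:
        # Assumption: selects a specific zonegroup instead of the default one.
        argv.append(f"--rgw-zonegroup={zonegroup}")
    argv.append(f"{flag}={FEATURE}")
    return argv


print(shlex.join(zonegroup_feature_cmd(enable=False)))
print(shlex.join(zonegroup_feature_cmd(enable=True)))
```

The list form can be passed directly to subprocess.run on a host where radosgw-admin is installed.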

However, we do encourage moving to v2, as there are many benefits to the new data model (besides site syncing).

Regarding the problem you encountered when the "master" site is down, how is this different from other operations that are forwarded to "master"?

Yuval

On Tue, Sep 3, 2024 at 11:31 AM Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx> wrote:
Hi Folks,

I see in the pending release notes for Squid a description of the "notification_v2" feature.
ceph/PendingReleaseNotes at main · ceph/ceph (github.com) <https://github.com/ceph/ceph/blob/main/PendingReleaseNotes#L43>

I have some concerns about the multisite nature of this feature. I am using Ceph multisite to provide geographic redundancy for my application data, which seems to be a mainline use case. My application is designed to cope if an entire zone fails.

My application currently relies on pubsub (soon to be replaced by bucket notifications) to provide service. More specifically, I rely on the ability to semi-regularly and automatically reconfigure topics to adjust the HTTP endpoint.

I believe that with the "notification_v2" feature all adjustments of the topic are routed through the metadata master site (much like creating an S3 user). Fundamentally I think this breaks the combination of Ceph multisite, notifications_v2 and reconfiguring a topic: you cannot reconfigure a topic unless both sites are up.

I tested this, by setting up a deployment, shutting down a site and trying to reconfigure a topic. I get a 500 HTTP response to my createTopic, and can see this in the RGW logs:
debug 2024-09-03T08:24:39.997+0000 7f53a3372640 20 ERROR: curl error: Couldn't connect to server req_data->error_buf=Failed to connect to 10.225.20.19 port 7480: No route to host
debug 2024-09-03T08:24:39.997+0000 7f5329a7f640  4 req 2968544598970267460 0.686004043s sns:pubsub_topic_create CreateTopic forward_request_to_master returned ret = -2200

Have the implications of this been considered? For now I may be able to disable the notifications_v2 feature, however in the release note the predecessor is deprecated in Squid. Hopefully I'm missing something obvious.

Kind regards,
Alex
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx




