Re: Multisite Pubsub - Duplicates Growing Uncontrollably


 



Hi Alex,

I also seemed to miss your email :-)

On Mon, Oct 18, 2021 at 11:32 AM Alex Kershaw <alex.kershaw4@xxxxxxxxx>
wrote:

> Hi Yuval,
>
> Apologies - I'm having some trouble with my Microsoft spam filter and I'm
> not sure this email reached you. If it did please excuse the duplicate.
> This is in response to:
> "Multisite Pubsub - Duplicates Growing Uncontrollably":
> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/DPPEPYPAWLQIRPRZAEJAWJ72S2W6INNN/
> .
>
> --------------------------------------------------------------------
>
> Hi Yuval,
>
>
>
> Thanks for the reply. Oddly it had not come through to my inbox and I’ve
> only just spotted it.
>
>
>
> We have 4 total zones, siteA, siteB, siteApubsub and siteBpubsub.
> Interesting that there is an issue, is there a ceph tracker ticket for this
> so I can keep an eye on it?
>

just opened a tracker: https://tracker.ceph.com/issues/52963
feel free to comment there.


> As you mentioned sounds like this isn’t the cause though.
>
>
>

i never tried pubsub with more than one pubsub zone. will investigate if
this is the root cause.
BTW, assuming all zones are in the same zonegroup, why do you have 2 pubsub
zones?

> I have verified these are the same events yes – for the most duplicated
> event, every single mtime attribute is the same. I don’t see an etag field,
> but everything is the same between separate events referencing the same
> object except the timestamp + id field.  The data I see looks like this:
>
>
>
>    {
>       "id": "1633954796.196156.775b109c",
>       "event": "OBJECT_CREATE",
>       "timestamp": "2021-10-11T12:19:56.196156Z",
>       "info": {
>         "attrs": {
>           "mtime": "2020-08-10T16:10:48.749795Z"
>         },
>         "bucket": {
>           "bucket_id": "b72446af-3ff1-4164-b91e-5bf72d72c2a9.8443461.1",
>           "name": "albansstack-scsdata",
>           "tenant": ""
>         },
>         "key": {
>           "instance": "4redKHrSif4Bs6nxWVRMHrWC5G1Quxt",
>           "name": "61/00/2020020801511142F85432289692-Subscriber"
>         }
>       }
>    },
>

agree. if the mtime is the same then it is probably the same object, since
the "timestamp" and "id" fields are generated when the event is sent.
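A minimal sketch of that check, assuming the pulled events have already been parsed into a list of Python dicts shaped like the sample above. The function name and the choice of identity fields are illustrative only, not part of RGW or its API:

```python
from collections import defaultdict


def group_duplicate_events(events):
    """Group events that reference the same object version.

    The "id" and "timestamp" fields are deliberately left out of the
    grouping key, because they are generated when the event is sent and
    therefore differ between otherwise identical events.
    """
    groups = defaultdict(list)
    for ev in events:
        info = ev["info"]
        identity = (
            ev["event"],
            info["bucket"]["bucket_id"],
            info["key"]["name"],
            info["key"]["instance"],
            info["attrs"]["mtime"],
        )
        groups[identity].append(ev["id"])
    # Keep only the identities that were reported more than once.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Each key of the returned dict identifies one object version, and the value lists the ids of all events that reported it, so the length of that list is the duplicate count.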


>
>
>
> Your comment on the RGW restarts is interesting, but we’re not restarting
> these anymore – however I’m still seeing objects that I’m not expecting. I
> had a look at the RGW logs and don’t see anything implying RGW sync isn’t
> functioning as normal.
>
>
>
> The biggest surprise to me is that the mtimes of the objects are all old.
> My cluster’s “radosgw-admin sync status” was reporting that the data sync
> was completed this morning, and I manually acked everything in the pubsub
> queue. Now I am seeing more pubsub events with mtimes such as:
> "2020-08-10T16:10:48.749795Z" as above – I’m curious as to why this can
> appear in pubsub – I think the mtime is saying this object hasn’t been
> updated since 2020-08-10, so why is it on the pubsub queue at all if I had
> a complete sync this morning and an empty queue? Perhaps I’m
> misunderstanding something here, any insight you can provide is greatly
> appreciated 😊
>
>
>
> We’re using pubsub based notifications as our design makes use of both
> getting kicks to an endpoint and using the API to retrieve a queue of all
> unacknowledged events (it’s important for us that we don’t miss any events
> – even if our product goes down temporarily).  I think this reasoning is
> in line with the doc you linked.
>

agree that this is an important feature, as it allows you to overcome
outages not only in ceph, but also in your system.
hence the idea is not to deprecate the "pull" functionality - but to
replace it with a mechanism unrelated to zone syncing.


> I actually spotted your email regarding pubsub deprecation (which I
> presume is the reason for asking) just this morning – I think someone from
> my team was intending to get in touch with you regarding this.
>
>
>

I replied to Dave Piper on an email thread titled "RGW pubsub deprecation".
that reply also covers the drawbacks of the mechanism being based on zone
syncing.


> Thanks,
>
> Alex
>
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



