Re: Questions / doubts about rgw users and zones

Arno Lehmann <al@xxxxxxxxxxxxxx> · Wed, 30 Mar 2022 23:19:26 +0200

Hi Ulrich, all,

took me a while to get back to this, which was because I got as slow 
with $JOB as my Ceph clusters are in general :-)

On 19.03.2022 at 20:26, Ulrich Klein wrote:
Hi,

I'm not the expert, either :) So if someone with more experience wants to correct me, that’s fine.

At least that allowed me to notice that nobody found anything to correct.

But I think I have a similar setup with a similar goal.

I have two clusters, purely for RGW/S3.
I have a realm R in which I created a zonegroup ZG (not the low tax Kanton:) )

Actually, the taxes the company pays there are noticeable, as far as I 
understand.

On the primary cluster I have a zone ZA as master and on the second cluster a zone ZB.
With everything set up, including the access keys for the zones, metadata and data are synced between the two.

Users access only the primary cluster, the secondary is basically a very safe backup.

Indeed, that is what I needed as a development environment.

But I want - for some users - that their data is NOT replicated to that secondary cluster, cheaper plan or short lived data.

And that came later, when I decided to actually learn what I can do 
with Ceph.

I found two ways to achieve that.
One is similar to what I understand is your setup:

It is, and it turns out that the explicit disabling of syncing and then 
selectively enabling seems to play an important role.

That is, after I fixed some missing credentials and also tweaked some 
endpoint settings and stuff.

After all my playing around, I'm not sure I could pinpoint where the 
manual was misleading, and where I was just implementing my own issues.

...
My alternative solution was to turn on/off synchronization on buckets:
For any existing (!) bucket one can simply turn off/on synchronization via
# radosgw-admin bucket sync [enable/disable] --bucket=<bucket>

Problem is that it only works on existing buckets. I've found no way to turn synchronization off by default, and even less what I actually need, which is turn synchronization/replication on/off per RGW user.
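
One possible workaround for the per-user case is to loop over a user's existing buckets and apply the command above to each. The following is only a minimal sketch, not something from the thread: the helper names are hypothetical, and it assumes that `radosgw-admin bucket list --uid=<user>` prints a JSON array of bucket names. Buckets created after the run are not affected, so it would have to be repeated (for example from cron).

```python
import json
import subprocess
from typing import Callable, List


def sync_command(bucket: str, enable: bool) -> List[str]:
    """Build the 'radosgw-admin bucket sync enable|disable' command line."""
    action = "enable" if enable else "disable"
    return ["radosgw-admin", "bucket", "sync", action, f"--bucket={bucket}"]


def list_user_buckets_command(uid: str) -> List[str]:
    """Build the command that lists the buckets owned by one RGW user."""
    return ["radosgw-admin", "bucket", "list", f"--uid={uid}"]


def run(cmd: List[str]) -> str:
    """Run a command and return its stdout; raises if the command fails."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def set_sync_for_user(
    uid: str, enable: bool, runner: Callable[[List[str]], str] = run
) -> List[str]:
    """Enable or disable sync on every existing bucket of one user.

    Assumption: the list command prints a JSON array of bucket names.
    Returns the bucket names that were processed.
    """
    buckets = json.loads(runner(list_user_buckets_command(uid)))
    for bucket in buckets:
        runner(sync_command(bucket, enable))
    return buckets
```

Calling `set_sync_for_user("someuser", enable=False)` on a node with admin credentials would then disable sync for that user's current buckets. The `runner` parameter exists only so the command construction can be checked without a cluster.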

I discarded sync policies as they left the sync status in a suspicious state, were complicated in a strange way and the documentation "wasn't too clear to me"

Dunno if this helps, and I'm pretty sure there may be better ways. But this worked for me.

In the end, this approach did work for me, and the manual and selective 
en- and disabling was somewhat important, it appears.

Ciao, Uli

PS:
I use s3cmd, rclone and cyberduck for my simple testing. I found the aws cli more AWS-centric, and it also doesn't work well with Ceph/RGW tenants.

The applications using this storage system are a boto3-based python 
application and its testing framework. Up to now, $CUSTOMER has found no 
problem to complain about :-)

I've also started using rbd for virtual disk storage with Proxmox, and 
found no issues so far, so I will not venture into new lands for now, 
but thanks for the advice.

And, I'm not sure why you have so many endpoints in the zonegroup, but no load balancer a la RGW ingress, i.e. keepalived+haproxy. But that may be my lack of expertise.

Rather lack of my expertise... I did not want to deploy more systems 
than necessary, and while having three rgw heads is kind of overkill for 
my purposes, I'm very satisfied it works at all and I do not have to 
understand a load balancer and TLS endpoint :-)

Again, thanks for your advice; while it was not directly telling me what 
to do, it gave me the hints I needed!

Cheers,

Arno
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx