Re: Running different rgw daemon with same cephxuser

Matt Benjamin <mbenjami@xxxxxxxxxx> · Mon, 8 Feb 2021 09:52:55 -0500

Hi Sebastien,

That seems like a concise and reasonable solution to me.  It seems
like the metrics from a single instance should in fact be transient
(leaving the problem of maintaining aggregate values to Prometheus, or
even downstream of that)?
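
To make the registration difference in the quoted proposal concrete,
here is a toy sketch. It is not Ceph code: ToyServiceMap,
count_registered_by_name and count_registered_by_instance_id are
hypothetical names, and the only behaviour modelled is the assumption
that registering twice under the same key yields a single entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the service map: one entry per "type.key".
// Assumption: a second registration under an existing key adds nothing.
struct ToyServiceMap {
  std::set<std::string> entries;

  void register_daemon(const std::string& type, const std::string& key) {
    assert(!type.empty() && !key.empty());
    entries.insert(type + "." + key);
  }

  std::size_t count() const { return entries.size(); }
};

// Every pod registers under the shared daemon name (today's behaviour as
// described in the thread), so all pods collapse into one entry.
std::size_t count_registered_by_name(
    const std::string& daemon_name,
    const std::vector<std::uint64_t>& instance_ids) {
  ToyServiceMap map;
  for (std::size_t i = 0; i < instance_ids.size(); ++i) {
    map.register_daemon("rgw", daemon_name);
  }
  return map.count();
}

// Every pod registers under its own client instance ID (the proposal),
// so each distinct ID gets its own entry.
std::size_t count_registered_by_instance_id(
    const std::vector<std::uint64_t>& instance_ids) {
  ToyServiceMap map;
  for (std::uint64_t id : instance_ids) {
    map.register_daemon("rgw", std::to_string(id));
  }
  return map.count();
}
```

With three pods sharing one cephx user, the first function reports one
entry and the second reports three. A restarted pod would get a new ID
and therefore a new entry, which is why any long-lived aggregate would
have to be kept outside the daemon.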

Matt

On Mon, Feb 8, 2021 at 9:47 AM Sebastien Han <shan@xxxxxxxxxx> wrote:
>
> Hi Jiffin,
>
> From my perspective, one simple way to fix this (although we must be
> careful with backward compatibility) would be for rgw to register
> with the service map differently.
> Today it uses the daemon name: given rgw.foo, it registers as foo.
> As a result, if you run that pod twice you still see a single
> instance in the service map as well as in the Prometheus metrics.
>
> It would be nice to register with the RADOS client session ID instead,
> just like rbd-mirror does by using instance_id. Something like:
>
> // Use the per-session client instance ID as the registered name.
> std::string instance_id = stringify(rados.get_instance_id());
> int ret = rados.service_daemon_register(daemon_type, instance_id, metadata);
>
> Here: https://github.com/ceph/ceph/blob/master/src/rgw/rgw_rados.cc#L1139
> With that we can re-use the same cephx user and scale to any number of
> instances: they all authenticate to the cluster with the same cephx
> user, but they show up as N entries in the service map.
>
> I guess one downside is that as soon as the daemon restarts, we get a
> new RADOS client session ID, and thus our name changes, which means we
> lose all the metrics...
> Thoughts?
>
> Thanks!
> –––––––––
> Sébastien Han
> Senior Principal Software Engineer, Storage Architect
>
> "Always give 100%. Unless you're giving blood."
>
> On Thu, Feb 4, 2021 at 3:39 PM Jiffin Thottan <jthottan@xxxxxxxxxx> wrote:
> >
> > Hi all,
> >
> > In an OCS (Rook) environment, the workflow for RGW daemons is as follows.
> >
> > Normally, to create a Ceph object store, Rook first creates the pools for the rgw daemon with the specified configuration.
> >
> > Then, depending on the number of instances, Rook creates a cephx user and spawns the rgw daemon in a container (pod) using its ID,
> > with the following arguments for the radosgw binary:
> >     Args:
> >       --fsid=91501490-4b55-47db-b226-f9d9968774c1
> >       --keyring=/etc/ceph/keyring-store/keyring
> >       --log-to-stderr=true
> >       --err-to-stderr=true
> >       --mon-cluster-log-to-stderr=true
> >       --log-stderr-prefix=debug
> >       --default-log-to-file=false
> >       --default-mon-cluster-log-to-file=false
> >       --mon-host=$(ROOK_CEPH_MON_HOST)
> >       --mon-initial-members=$(ROOK_CEPH_MON_INITIAL_MEMBERS)
> >       --id=rgw.my.store.a
> >       --setuser=ceph
> >       --setgroup=ceph
> >       --foreground
> >       --rgw-frontends=beast port=8080
> >       --host=$(POD_NAME)
> >       --rgw-mime-types-file=/etc/ceph/rgw/mime.types
> >       --rgw-realm=my-store
> >       --rgw-zonegroup=my-store
> >       --rgw-zone=my-store
> >
> > Here the cephx user will be "client.rgw.my.store.a", and all the pools for rgw will be created as my-store*. Normally, if
> > the Rook ceph-object-store config file [1] requests another instance, another user, "client.rgw.my.store.b",
> > will be created by Rook and will consume the same pools.
> >
> > Kubernetes has a feature known as autoscaling, in which pods are scaled automatically based on specified metrics. If we apply that
> > feature to rgw pods, Kubernetes will automatically scale them (each new pod is a clone of the existing pod) with the same argument for "--id",
> > but Ceph cannot distinguish the clones as different rgw daemons even though multiple rgw pods are running simultaneously.
> > "ceph status" shows only one rgw daemon as well.
> >
> > In vstart or ceph-ansible (Ali helped me figure this out), I can see that a separate cephx user is created for each rgw daemon.
> >
> > Is this behaviour intended, or am I hitting a corner case that was never tested before?
> >
> > There is no point in autoscaling the rgw pod if the copies are considered to be the same daemon: the S3 client will talk to only one of the pods, and the
> > metrics provided by ceph-mgr can be incorrect as well, which can affect the autoscale feature.
> >
> > I have also opened an issue in Rook for the time being [2].
> >
> > [1] https://github.com/rook/rook/blob/master/cluster/examples/kubernetes/ceph/object-test.yaml
> > [2] https://github.com/rook/rook/issues/6943
> >
> > Regards,
> > Jiffin
> >
>

-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309