Re: [EXTERNAL] Re: Cephadm and the "--data-dir" Argument

I think if it was locked in from bootstrap time it might not be that
complicated. We'd just have to store the directory paths in some
persistent location the module can access and make the cephadm mgr module
use them when calling out to the binary for any further actions. This does
have the slight issue that technically all the places cephadm uses for
storing persistent settings can be modified by the user (config options and
config-key store entries) although cephadm already has a number of other
config-key store entries it doesn't expect users to modify so that might be
fine. It gets more complicated if you allow users to change it
post-bootstrap, as we'd have to implement some kind of migration of the existing
dir on the hosts to the new location and make sure we don't make any calls
using the new location until the migration is completed. That would take
much more investigation.
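
To make the "locked in from bootstrap time" idea concrete, here is a rough
Python sketch. It is not code from the Ceph tree and only shows the shape of
the approach. The DirSettings class, the DIRS_KEY name and the plain dict
standing in for the persistent store are all made up for illustration; the
only things taken from this thread are the five `--<thing>-dir` parameters.
It does not cover the post-bootstrap migration case.

```python
import json
from typing import Dict, List

# Hypothetical key name; NOT an existing cephadm config-key entry.
DIRS_KEY = "mgr/cephadm/bootstrap_dirs"

# The five --<thing>-dir parameters mentioned in this thread.
DIR_KINDS = ("data", "log", "logrotate", "sysctl", "unit")


class DirSettings:
    """Illustrative holder for directory overrides chosen at bootstrap."""

    def __init__(self, store: Dict[str, str]) -> None:
        # `store` stands in for whatever persistent key/value location
        # the module would really use (an assumption of this sketch).
        self.store = store

    def save(self, dirs: Dict[str, str]) -> None:
        """Record the overrides once, at bootstrap time."""
        unknown = set(dirs) - set(DIR_KINDS)
        if unknown:
            raise ValueError(f"unknown directory kinds: {sorted(unknown)}")
        self.store[DIRS_KEY] = json.dumps(dirs, sort_keys=True)

    def load(self) -> Dict[str, str]:
        """Return the stored overrides, or an empty dict if none were set."""
        raw = self.store.get(DIRS_KEY)
        return json.loads(raw) if raw else {}

    def binary_args(self) -> List[str]:
        """Arguments to pass on every later call out to the binary."""
        dirs = self.load()
        args: List[str] = []
        for kind in DIR_KINDS:
            if kind in dirs:
                args += [f"--{kind}-dir", dirs[kind]]
        return args


store: Dict[str, str] = {}
settings = DirSettings(store)
settings.save({"data": "/cephconfig/var/lib/ceph"})
print(settings.binary_args())
```

The point is only that the overrides are written once and then read back on
every later call, so the binary never falls back to the default paths. Where
the values would actually live, and how to stop users editing them, is the
open question described above.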

On Mon, Aug 12, 2024 at 11:13 AM Alex Hussein-Kershaw (HE/HIM) <
alexhus@xxxxxxxxxxxxx> wrote:

> Thanks Adam - noted, I expect we can make something else work to meet our
> needs here.
>
> I don't know just how many monsters may be under the bed here - but if
> it's a fix that's appropriate for someone who doesn't know the Ceph
> codebase (me) I'd be happy to have a look at implementing a fix.
>
> Best Wishes,
> Alex
>
> ------------------------------
> *From:* Adam King <adking@xxxxxxxxxx>
> *Sent:* Monday, August 12, 2024 4:05 PM
> *To:* Alex Hussein-Kershaw (HE/HIM) <alexhus@xxxxxxxxxxxxx>
> *Cc:* ceph-users <ceph-users@xxxxxxx>; Joseph Silva <
> t-josilva@xxxxxxxxxxxxx>
> *Subject:* [EXTERNAL] Re:  Cephadm and the "--data-dir"
> Argument
>
> Looking through the code it doesn't seem like this will work currently. I
> found that the --data-dir arg to the cephadm binary was from the initial
> implementation of the cephadm binary (so early that it was actually called
> "ceph-daemon" at the time rather than "cephadm") but it doesn't look like
> that work included anything to connect it to the cephadm mgr module. So,
> after bootstrapping the cluster, whenever the cephadm mgr module calls out
> to the binary to deploy any daemon, it sets the data dir back to the
> default, hence why you're seeing the unit files being overwritten. This
> seems to be the case for all of the `--<thing>-dir` parameters (unit, log,
> sysctl, logrotate, and data). We are doing the planning session for the
> next release tomorrow. I might add this as a topic to look into. But for
> now, unfortunately, it simply won't work without a large amount of manual
> effort.
>
> On Mon, Aug 12, 2024 at 10:06 AM Alex Hussein-Kershaw (HE/HIM) <
> alexhus@xxxxxxxxxxxxx> wrote:
>
> Hi Folks,
>
> I'm trying to use the --data-dir argument of cephadm when bootstrapping a
> Storage Cluster. It looks like exactly what I need, where my use case is
> that I want to put data files onto a persistent disk, such that I can blow
> away my VMs while retaining the files.
>
> Everything looks good and the bootstrap command completes. For reference I
> am running this command:
>
> "sudo cephadm --image "ceph/squid:v19.1.0" --docker --data-dir
> /cephconfig/var/lib/ceph bootstrap --mon-ip 10.235.22.23 --ssh-user
> qs-admin --ssh-private-key /home/qs-admin/.ssh/id_rsa --ssh-public-key
> /home/qs-admin/.ssh/id_rsa.pub --output-dir /cephconfig/etc/ceph
> --skip-dashboard --skip-monitoring-stack  --skip-pull --config my.conf"
>
> However, when I then try to continue with the deployment of my Storage
> Cluster, I find that I can't authenticate with the monitors. I run the
> suggested command to drop into a cephadm shell which then can't speak to
> the Storage Cluster. For example:
>
> $ ceph -s
> 2024-08-12T10:47:07.862+0000 7f998e59c640 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [2] but i only support [2,1]
> [errno 13] RADOS permission denied (error connecting to the cluster)
>
> In the MON logs at the same time I can see:
> "cephx server client.admin: unexpected key: req.key=2c62e1471f111d12
> expected_key=d18ce06d18f116b4"
>
> In the systemd unit files created I see:
>
> ...
> ExecStart=/bin/bash
> /var/lib/ceph/64415fba-58b0-11ef-9d27-005056014e4f/%i/unit.run
> ExecStop=-/bin/bash -c 'bash
> /var/lib/ceph/64415fba-58b0-11ef-9d27-005056014e4f/%i/unit.stop'
> ExecStopPost=-/bin/bash
> /var/lib/ceph/64415fba-58b0-11ef-9d27-005056014e4f/%i/unit.poststop
> ...
>
> Which does not contain my data directory. Looking at the source template
> it appears that it should:
> ceph/src/cephadm/cephadmlib/templates/ceph.service.j2 at commit
> 616fbc1b181ce15e49281553b35ca215d2aa1053 (ceph/ceph on github.com):
> https://github.com/ceph/ceph/blob/616fbc1b181ce15e49281553b35ca215d2aa1053/src/cephadm/cephadmlib/templates/ceph.service.j2#L22
>
> Manually modifying the unit file, reloading systemd and restarting the mon
> makes the authentication issue go away, although cephadm seems to be
> periodically rewriting my file and undoing the changes. Is there a
> templating bug in here? I note that there are no other variables being
> templated from the ctx in this jinja2 template so it seems likely it is
> broken.
>
> Many thanks,
> Alex
>
>
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>