Re: [ext] Re: Rename / change host names set with `ceph orch host add`

To clarify a bit, "ceph orch host rm <hostname> --force" won't actually
touch any of the daemons on the host. It just stops cephadm from managing
the host. I.e. it won't add/remove daemons on the host. If you remove the
host then re-add it with the new host name nothing should actually happen
to the daemons there. The only possible exception is if you have services
whose placement uses count and one of the daemons from that service is on
the host being temporarily removed. It's possible it could try to deploy
that daemon on another host in the interim. However, OSDs are never placed
by count, so there should be no need for flags like noout or nobackfill.
The worst case would be it moving a mon or mgr around. If you make sure all
the important services are deployed by label, explicit hosts etc. (just not
count) then there should be no risk of any daemons moving at all and this
is a pretty safe operation.
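
A rough sketch of the two steps above, in Python: first list the services
that are not pinned by label, explicit hosts or host pattern (the ones that
could move in the interim), then build the remove and re-add commands for one
host. The function names are made up for illustration, and the JSON keys
(service_type, service_name, placement, hosts, label, host_pattern) are an
assumption about what "ceph orch ls --format json" returns, so check them
against your version first. Nothing here talks to the cluster; it only
builds strings.

```python
import shlex

# Placement keys that pin a service to specific hosts. These key names are an
# assumption based on the cephadm service spec format; verify on your cluster.
PINNING_KEYS = ("hosts", "label", "host_pattern")


def unpinned_services(specs):
    """Return names of non-OSD services not pinned by label/hosts/pattern.

    `specs` is a list of dicts, assumed to be shaped like the parsed output
    of `ceph orch ls --format json`.
    """
    risky = []
    for spec in specs:
        if spec.get("service_type") == "osd":
            # OSDs are not moved between hosts by the scheduler.
            continue
        placement = spec.get("placement") or {}
        if not any(placement.get(key) for key in PINNING_KEYS):
            risky.append(spec.get("service_name", "<unknown>"))
    return risky


def readd_commands(fqdn, addr, labels):
    """Build (but do not run) the remove and re-add commands for one host."""
    short = fqdn.split(".", 1)[0]  # bare host name, as printed by `hostname`
    remove = ["ceph", "orch", "host", "rm", fqdn, "--force"]
    add = ["ceph", "orch", "host", "add", short, addr]
    for label in labels:
        add += ["--labels", label]
    return shlex.join(remove), shlex.join(add)
```

If unpinned_services() comes back empty, no mon or mgr should be able to move
while the host is out of the inventory. Review the generated commands by hand
and run them one host at a time.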

On Fri, May 20, 2022 at 3:36 AM Kuhring, Mathias <
mathias.kuhring@xxxxxxxxxxxxxx> wrote:

> Hey Adam,
>
> thanks for your fast reply.
>
> That's a bit more invasive and risky than I was hoping for.
> But if this is the only way, I guess we need to do this.
>
> Would it be advisable to put some maintenance flags like noout,
> nobackfill, norebalance?
> And maybe stop the ceph target on the host I'm re-adding to pause all
> daemons?
>
> Best, Mathias
> On 5/19/2022 8:14 PM, Adam King wrote:
>
> cephadm just takes the hostname given in the "ceph orch host add" commands
> and assumes it won't change. The FQDN names (or whatever "ceph orch host
> ls" shows in any scenario) are from whatever input was given in those
> commands. Cephadm will even try to verify the hostname matches what is
> given when adding the host. As for where it is stored, we keep that info in
> the mon key store and it isn't meant to be manually updated (ceph
> config-key get mgr/cephadm/inventory). Although, there have occasionally
> been people running into issues related to a mismatch between an FQDN and a
> shortname. There's no built-in command for changing a hostname because of
> the expectation that it won't change. However, you should be able to fix
> this by removing and re-adding the host. E.g. "ceph orch host rm
> osd-mirror-1.our.domain.org" followed by "ceph orch host add osd-mirror-1
> 172.16.62.22 --labels rgw --labels osd". If you're on a late enough version
> that it requests you drain the host before we'll remove it (it was some
> pacific dot release, don't remember which one) you can pass --force to the
> host rm command. Generally it's not a good idea to remove hosts from
> cephadm's control while there are still cephadm deployed daemons on it like
> that but this is a special case. Anyway, removing and re-adding the host is
> the only (reasonable) way to change what it has stored for the hostname
> that I can remember.
>
> Let me know if that doesn't work,
>  - Adam King
>
> On Thu, May 19, 2022 at 1:41 PM Kuhring, Mathias <
> mathias.kuhring@xxxxxxxxxxxxxx> wrote:
>
>> Dear ceph users,
>>
>> one of our clusters is complaining about plenty of stray hosts and
>> daemons. Pretty much all of them.
>>
>> [WRN] CEPHADM_STRAY_HOST: 6 stray host(s) with 280 daemon(s) not managed
>> by cephadm
>>      stray host osd-mirror-1 has 47 stray daemons:
>> ['mgr.osd-mirror-1.ltmyyh', 'mon.osd-mirror-1', 'osd.1', ...]
>>      stray host osd-mirror-2 has 46 stray daemons: ['mon.osd-mirror-2',
>> 'osd.0', ...]
>>      stray host osd-mirror-3 has 48 stray daemons:
>> ['cephfs-mirror.osd-mirror-3.qzcuvv', 'mgr.osd-mirror-3',
>> 'mon.osd-mirror-3', 'osd.101', ...]
>>      stray host osd-mirror-4 has 47 stray daemons:
>> ['mds.cephfs.osd-mirror-4.omjlxu', 'mgr.osd-mirror-4', 'osd.103', ...]
>>      stray host osd-mirror-5 has 46 stray daemons: ['mgr.osd-mirror-5',
>> 'osd.139', ...]
>>      stray host osd-mirror-6 has 46 stray daemons:
>> ['mds.cephfs.osd-mirror-6.hobjsy', 'osd.141', ...]
>>
>> It all seems to boil down to host names from `ceph orch host ls` not
>> matching with other configurations.
>>
>> ceph orch host ls
>> HOST                         ADDR          LABELS       STATUS
>> osd-mirror-1.our.domain.org  172.16.62.22  rgw osd
>> osd-mirror-2.our.domain.org  172.16.62.23  rgw osd
>> osd-mirror-3.our.domain.org  172.16.62.24  rgw osd
>> osd-mirror-4.our.domain.org  172.16.62.25  rgw mds osd
>> osd-mirror-5.our.domain.org  172.16.62.32  rgw osd
>> osd-mirror-6.our.domain.org  172.16.62.33  rgw mds osd
>>
>> hostname
>> osd-mirror-6
>>
>> hostname -f
>> osd-mirror-6.our.domain.org
>>
>> 0|0[root@osd-mirror-6 ~]# ceph mon metadata | grep "\"hostname\""
>>          "hostname": "osd-mirror-1",
>>          "hostname": "osd-mirror-3",
>>          "hostname": "osd-mirror-2",
>>
>> 0|1[root@osd-mirror-6 ~]# ceph mgr metadata | grep "\"hostname\""
>>          "hostname": "osd-mirror-1",
>>          "hostname": "osd-mirror-3",
>>          "hostname": "osd-mirror-4",
>>          "hostname": "osd-mirror-5",
>>
>>
>> The documentation states that "cephadm demands that the name of host
>> given via `ceph orch host add` equals the output of `hostname` on remote
>> hosts.".
>>
>>
>> https://docs.ceph.com/en/latest/cephadm/host-management/#fully-qualified-domain-names-vs-bare-host-names
>>
>>
>> https://docs.ceph.com/en/octopus/cephadm/concepts/?#fully-qualified-domain-names-vs-bare-host-names
>>
>> But it seems our cluster wasn't set up like this.
>>
>> How can I now change the host names which were assigned when adding the
>> hosts with `ceph orch host add HOSTNAME`?
>>
>> I can't seem to find any documentation on changing the host names which
>> are listed by `ceph orch host ls`.
>> All I can find is related to changing the actual name of the host in the
>> system.
>> The crush map also just contains the bare host names.
>> So, where are these FQDN names actually registered?
>>
>> Thank you for your help.
>>
>> Best regards,
>> Mathias
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>
> --
> Mathias Kuhring
>
> Dr. rer. nat.
> Bioinformatician
> HPC & Core Unit Bioinformatics
> Berlin Institute of Health at Charité (BIH)
>
> E-Mail:  mathias.kuhring@xxxxxxxxxxxxxx
> Mobile: +49 172 3475576
>
>