Re: Pacific Bug?

This does look like a bug, and it affects more than just this command. `ceph
orch host ls` with the --label and/or --host-pattern flags just piggybacks
on the existing filtering done for placements in service specs. I've
just taken a look, and you can reproduce the same behavior with the
placement of an actual service. For example, with

[ceph: root@vm-00 /]# ceph orch host ls
HOST   ADDR             LABELS  STATUS
vm-00  192.168.122.7    _admin
vm-01  192.168.122.171  foo
vm-02  192.168.122.147  foo
3 hosts in cluster

and spec

[ceph: root@vm-00 /]# cat ne.yaml
service_type: node-exporter
service_name: node-exporter
placement:
  host_pattern: 'vm-0[0-1]'

you get the expected placement on vm-00 and vm-01

[ceph: root@vm-00 /]# ceph orch ps --daemon-type node-exporter
NAME                 HOST   PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
node-exporter.vm-00  vm-00  *:9100  running (23s)    17s ago  23s    3636k        -  1.5.0    0da6a335fe13  f83e88caa7e0
node-exporter.vm-01  vm-01  *:9100  running (21h)     2m ago  21h    16.1M        -  1.5.0    0da6a335fe13  a5153c378449

but if I add a label to the placement, while still leaving in the host pattern

[ceph: root@vm-00 /]# cat ne.yaml
service_type: node-exporter
service_name: node-exporter
placement:
  label: foo
  host_pattern: 'vm-0[0-1]'

you would expect to get only vm-01 at this point, as it's the only host
that matches both the label and the host pattern, but instead you get both
vm-01 and vm-02

[ceph: root@vm-00 /]# ceph orch ps --daemon-type node-exporter
NAME                 HOST   PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
node-exporter.vm-01  vm-01  *:9100  running (21h)     4m ago  21h    16.1M        -  1.5.0    0da6a335fe13  a5153c378449
node-exporter.vm-02  vm-02  *:9100  running (23s)    18s ago  23s    5410k        -  1.5.0    0da6a335fe13  ddd1e643e341

Looking at the scheduling implementation, it currently selects candidates
based on placement attributes in this order: explicit host list, label,
host pattern (with some additional handling for count that happens in all
cases). As soon as it finds the first attribute in that list that is present
in the placement (in this case the label), it uses that attribute alone to
select the candidates and then bails out without any additional filtering on
the host pattern. Since placement spec validation doesn't allow applying
specs that combine host_pattern or label with an explicit host list, label
plus host pattern is the only combination where you can hit this issue, and
I guess it was just overlooked. I'll take a look at making a patch to fix
this.
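
To make the ordering concrete, here is a rough sketch in Python of the
behavior described above next to the behavior you'd expect. This is not the
actual cephadm code: Host, Placement, select_first_match and
select_intersection are made-up names for illustration, count handling is
left out, and I'm assuming host patterns are matched fnmatch-style.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch
from typing import List, Optional


@dataclass
class Host:
    # Hypothetical stand-in for an inventory entry from `ceph orch host ls`.
    name: str
    labels: List[str] = field(default_factory=list)


@dataclass
class Placement:
    # Hypothetical stand-in for the `placement:` section of a service spec.
    hosts: List[str] = field(default_factory=list)
    label: Optional[str] = None
    host_pattern: Optional[str] = None


def select_first_match(placement: Placement, hosts: List[Host]) -> List[str]:
    """Observed behavior: the first attribute present decides, the rest are ignored."""
    if placement.hosts:
        return [h.name for h in hosts if h.name in placement.hosts]
    if placement.label:
        # Returns here, so host_pattern is never consulted.
        return [h.name for h in hosts if placement.label in h.labels]
    if placement.host_pattern:
        return [h.name for h in hosts if fnmatch(h.name, placement.host_pattern)]
    return []


def select_intersection(placement: Placement, hosts: List[Host]) -> List[str]:
    """Expected behavior: a host must satisfy label AND host_pattern when both are set."""
    if placement.hosts:
        return [h.name for h in hosts if h.name in placement.hosts]
    if not placement.label and not placement.host_pattern:
        return []
    selected = []
    for h in hosts:
        if placement.label and placement.label not in h.labels:
            continue
        if placement.host_pattern and not fnmatch(h.name, placement.host_pattern):
            continue
        selected.append(h.name)
    return selected


# The three hosts and the label-plus-pattern spec from the example above.
CLUSTER = [
    Host("vm-00", ["_admin"]),
    Host("vm-01", ["foo"]),
    Host("vm-02", ["foo"]),
]
SPEC = Placement(label="foo", host_pattern="vm-0[0-1]")

print(select_first_match(SPEC, CLUSTER))   # ['vm-01', 'vm-02']
print(select_intersection(SPEC, CLUSTER))  # ['vm-01']
```

With the hosts and spec from my example, the first function gives vm-01 and
vm-02 (what I saw), and the second gives only vm-01 (what you'd expect).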

On Tue, Feb 13, 2024 at 7:09 PM Alex <mr.alexey@xxxxxxxxx> wrote:

> Hello Ceph Gurus!
>
> I'm running Ceph Pacific version.
> if I run
> ceph orch host ls --label osds
> shows all hosts label osds
> or
> ceph orch host ls --host-pattern host1
> shows just host1
> it works as expected
>
> But combining the two the label tag seems to "take over"
>
> ceph orch host ls --label osds --host-pattern host1
> 6 hosts in cluster who had label osds whose hostname matched host1
> shows all host with the label osds instead of only host1.
> So at first the flags seem to act like an OR instead of an AND.
>
> ceph orch host ls --label osds --host-pattern foo
> 6 hosts in cluster who had label osds whose hostname matched foo
> even though "foo" doesn't even exist
>
> ceph orch host ls --label bar --host-pattern host1
> 0 hosts in cluster who had label bar whose hostname matched host1
> if the label and host combo was an OR this should have worked
> there is no label bar but host1 exists so it just disregards the
> host-pattern.
>
> This started because the osd deployment task had both label and
> host_pattern.
> The cluster was attempting to deploy OSDS on all the servers with the
> given tag instead of the one host we needed,
> which caused it to go into warning state.
> If I ran
> ceph orch ls --export --service_name host1
> it also showed both tags and host_pattern.
> unmanaged: false
> placement:
>   host_pattern:
>   label:
> The issue persisted until I removed the label tag.
>
> Thanks.
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>