Re: Cephadm not properly adding / removing iscsi services anymore

Eugen Block <eblock@xxxxxx> · Wed, 08 Sep 2021 08:12:00 +0000

If you configured only one iSCSI gateway but see three running, have
you tried removing the extra daemons with 'cephadm rm-daemon --name
...'? On the active MGR host, run 'journalctl -f'; it produces plenty
of output, which should include details about the iSCSI deployment.
Alternatively, run 'cephadm logs --name <iscsi-gw>'.
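
A minimal sketch of those steps as shell helpers. The function names are
made up for illustration, and the daemon name is whatever 'ceph orch ps'
reports on your cluster. Note that 'cephadm rm-daemon' also needs the
cluster fsid and has to be run on the host where the stray daemon lives.

```shell
#!/bin/sh
# Hypothetical helper names; only the ceph/cephadm commands are real.

# Show the iscsi daemons the orchestrator currently knows about.
list_iscsi_daemons() {
    ceph orch ps | grep -i iscsi
}

# Remove one stray daemon from the local host.
# $1 = daemon name exactly as listed by 'ceph orch ps'
remove_stray_daemon() {
    [ -n "$1" ] || { echo "usage: remove_stray_daemon <daemon-name>" >&2; return 2; }
    fsid=$(ceph fsid) || return 1
    cephadm rm-daemon --name "$1" --fsid "$fsid"
}
```

If the orchestrator still believes the daemon belongs on that host, it
may simply redeploy it, so check 'ceph orch ls' again afterwards.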

Quoting "Paul Giralt (pgiralt)" <pgiralt@xxxxxxxxx>:

Adding and removing iSCSI services through cephadm was working until
recently and now seems to have stopped. I am running Pacific 16.2.5.
When I modify the deployment YAML file for my iSCSI gateways, the
services are not added or removed as requested. It’s as if the state
is “stuck”.

At one point I had 4 iSCSI gateways: 02, 03, 04 and 05. Through some  
back and forth of deploying and undeploying, I ended up in a state  
where the services are running on servers 02, 03, and 05 no matter  
what I tell cephadm to do. For example, right now I have the  
following configuration:

service_type: iscsi
service_id: iscsi
placement:
  hosts:
    - cxcto-c240-j27-03.cisco.com
spec:
  pool: iscsi-config
… removed the rest of this file ….

However ceph orch ls shows this:

[root@cxcto-c240-j27-01 ~]# ceph orch ls
NAME                               PORTS        RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager                       ?:9093,9094      1/1  9m ago     3M   count:1
crash                                             15/15  10m ago    3M   *
grafana                            ?:3000           1/1  9m ago     3M   count:1
iscsi.iscsi                                         3/1  10m ago    11m  cxcto-c240-j27-03.cisco.com
mgr                                                 2/2  9m ago     3M   count:2
mon                                                 5/5  9m ago     12d  cxcto-c240-j27-01.cisco.com;cxcto-c240-j27-06.cisco.com;cxcto-c240-j27-08.cisco.com;cxcto-c240-j27-10.cisco.com;cxcto-c240-j27-12.cisco.com
node-exporter                      ?:9100         15/15  10m ago    3M   *
osd.dashboard-admin-1622750977792                  0/15  -          3M   *
osd.dashboard-admin-1622751032319               326/341  10m ago    3M   *
prometheus                         ?:9095           1/1  9m ago     3M   count:1

Notice it shows 3/1 because the service is still running on 3  
servers even though I’ve told it to only run on one. If I configure  
all 4 servers and apply (ceph orch apply) then I end up with 3/4  
because server 04 never deploys. It’s as if something is “stuck”.
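
To spot this kind of mismatch quickly, the RUNNING column can be filtered
with a small awk wrapper. This is only a sketch: the function name
orch_mismatches is made up, and it assumes the plain-text layout of
'ceph orch ls' shown above, where RUNNING is the only "N/M" field.

```shell
#!/bin/sh
# Print "<service> <running>/<expected>" for every service whose running
# count differs from the expected count. Reads 'ceph orch ls' text output
# on stdin. The function name is hypothetical.
orch_mismatches() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[0-9]+\/[0-9]+$/) {
                split($i, c, "/")
                if (c[1] + 0 != c[2] + 0) print $1, $i
            }
        }
    }'
}
```

Usage: 'ceph orch ls | orch_mismatches'. It reports both over-deployed
services (such as 3/1) and under-deployed ones (such as 3/4).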

Any ideas where to look / log files that might help figure out  
what’s happening?

-Paul

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
