Hello,

While testing our Ceph cluster setup, I noticed a possible issue: the cluster/public network configuration seems to be ignored when TCP sessions are initiated. The daemons (mon/mgr/mds/osd) are all listening on the right IP addresses, but they initiate TCP sessions from the wrong interface. Would it be possible to force the Ceph daemons to use the cluster/public IP addresses when initiating new TCP connections, instead of letting the kernel choose?

Some details below. We set everything up to use our "10.2.1.0/24" network: 10.2.1.x (x = node number 1, 2, 3), but we can see TCP sessions being initiated from the "10.2.0.0/24" network.

The daemons are listening on the right IP addresses:

root@nbs-vp-01:~# lsof -nPK i | grep ceph | grep LISTEN
ceph-mds 1541648 ceph 16u IPv4 8169344 0t0 TCP 10.2.1.1:6800 (LISTEN)
ceph-mds 1541648 ceph 17u IPv4 8169346 0t0 TCP 10.2.1.1:6801 (LISTEN)
ceph-mgr 1541654 ceph 25u IPv4 8163039 0t0 TCP 10.2.1.1:6810 (LISTEN)
ceph-mgr 1541654 ceph 27u IPv4 8163051 0t0 TCP 10.2.1.1:6811 (LISTEN)
ceph-mon 1541703 ceph 27u IPv4 8170914 0t0 TCP 10.2.1.1:3300 (LISTEN)
ceph-mon 1541703 ceph 28u IPv4 8170915 0t0 TCP 10.2.1.1:6789 (LISTEN)
ceph-osd 1541711 ceph 16u IPv4 8169353 0t0 TCP 10.2.1.1:6802 (LISTEN)
ceph-osd 1541711 ceph 17u IPv4 8169357 0t0 TCP 10.2.1.1:6803 (LISTEN)
ceph-osd 1541711 ceph 18u IPv4 8169362 0t0 TCP 10.2.1.1:6804 (LISTEN)
ceph-osd 1541711 ceph 19u IPv4 8169368 0t0 TCP 10.2.1.1:6805 (LISTEN)
ceph-osd 1541711 ceph 20u IPv4 8169375 0t0 TCP 10.2.1.1:6806 (LISTEN)
ceph-osd 1541711 ceph 21u IPv4 8169383 0t0 TCP 10.2.1.1:6807 (LISTEN)
ceph-osd 1541711 ceph 22u IPv4 8169392 0t0 TCP 10.2.1.1:6808 (LISTEN)
ceph-osd 1541711 ceph 23u IPv4 8169402 0t0 TCP 10.2.1.1:6809 (LISTEN)

Sessions to the other nodes use the wrong IP address:

root@nbs-vp-01:~# lsof -nPK i | grep ceph | grep 10.2.1.2
ceph-mds 1541648 ceph 28u IPv4 8279520 0t0 TCP 10.2.0.2:44180->10.2.1.2:6800 (ESTABLISHED)
ceph-mgr 1541654 ceph 41u IPv4 8289842 0t0 TCP 10.2.0.2:44146->10.2.1.2:6800 (ESTABLISHED)
ceph-mon 1541703 ceph 40u IPv4 8174827 0t0 TCP 10.2.0.2:40864->10.2.1.2:3300 (ESTABLISHED)
ceph-osd 1541711 ceph 65u IPv4 8171035 0t0 TCP 10.2.0.2:58716->10.2.1.2:6804 (ESTABLISHED)
ceph-osd 1541711 ceph 66u IPv4 8172960 0t0 TCP 10.2.0.2:54586->10.2.1.2:6806 (ESTABLISHED)

root@nbs-vp-01:~# lsof -nPK i | grep ceph | grep 10.2.1.3
ceph-mds 1541648 ceph 30u IPv4 8292421 0t0 TCP 10.2.0.2:45710->10.2.1.3:6802 (ESTABLISHED)
ceph-mon 1541703 ceph 46u IPv4 8173025 0t0 TCP 10.2.0.2:40164->10.2.1.3:3300 (ESTABLISHED)
ceph-osd 1541711 ceph 67u IPv4 8173043 0t0 TCP 10.2.0.2:56920->10.2.1.3:6804 (ESTABLISHED)
ceph-osd 1541711 ceph 68u IPv4 8171063 0t0 TCP 10.2.0.2:41952->10.2.1.3:6806 (ESTABLISHED)
ceph-osd 1541711 ceph 69u IPv4 8178891 0t0 TCP 10.2.0.2:57890->10.2.1.3:6808 (ESTABLISHED)
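For reference, this is how I have been checking which source address the kernel picks for these connections. The interface name "ens19" below is only a placeholder for whatever actually carries 10.2.1.0/24 on our nodes:

root@nbs-vp-01:~# ip route get 10.2.1.2
root@nbs-vp-01:~# ip route change 10.2.1.0/24 dev ens19 src 10.2.1.1   # "ens19" is a placeholder interface name

The "src" field reported by "ip route get" is the address the kernel uses when a daemon does not bind() to a local address before connect(). Adding an explicit "src" hint to the connected route (the second command) would steer the kernel's choice at the OS level, but we would rather have the daemons bind to their configured public/cluster addresses themselves, hence the question above.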
See our cluster config below:

[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster_network = 10.2.1.0/24
fsid = 0f19b6ff-0432-4c3f-b0cb-730e8302dc2c
mon_allow_pool_delete = true
mon_host = 10.2.1.1 10.2.1.2 10.2.1.3
osd_pool_default_min_size = 2
osd_pool_default_size = 3
public_network = 10.2.1.0/24

[client]
keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.nbs-vp-01]
host = nbs-vp-01
mds_standby_for_name = pve

[mds.nbs-vp-03]
host = nbs-vp-03
mds standby for name = pve

[osd.0]
public addr = 10.2.1.1
cluster addr = 10.2.1.1

[osd.1]
public addr = 10.2.1.2
cluster addr = 10.2.1.2

[osd.2]
public addr = 10.2.1.3
cluster addr = 10.2.1.3

[mgr.nbs-vp-01]
public addr = 10.2.1.1

[mgr.nbs-vp-02]
public addr = 10.2.1.2

[mgr.nbs-vp-03]
public addr = 10.2.1.3

[mon.nbs-vp-01]
public addr = 10.2.1.1

[mon.nbs-vp-02]
public addr = 10.2.1.2

[mon.nbs-vp-03]
public addr = 10.2.1.3

Cheers,
Liviu

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx