Hello Marco,

On Mon, Oct 23, 2017 at 05:48:10PM +0200, Marco Baldini - H.S. Amiata wrote:
> Hello
>
> ceph-mon services do not restart on any node. Yesterday I manually restarted
> ceph-mon and ceph-mgr on every node, and since then they did not restart.
>
> *pve-hs-2$ systemctl status ceph-mon@pve-hs-2.service*
> ceph-mon@pve-hs-2.service - Ceph cluster monitor daemon
>    Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
>   Drop-In: /lib/systemd/system/ceph-mon@.service.d
>            └─ceph-after-pve-cluster.conf
>    Active: *active (running) since Sun 2017-10-22 12:04:22 CEST; 1 day 5h ago*
>  Main PID: 24825 (ceph-mon)
>     Tasks: 23
>    CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@pve-hs-2.service
>            └─24825 /usr/bin/ceph-mon -f --cluster ceph --id pve-hs-2 --setuser ceph --setgroup ceph
>
> Oct 22 12:04:22 pve-hs-2 systemd[1]: Stopped Ceph cluster monitor daemon.
> Oct 22 12:04:22 pve-hs-2 systemd[1]: Started Ceph cluster monitor daemon.
>
> *pve-hs-main$ systemctl status ceph-mon@pve-hs-main.service*
> ceph-mon@pve-hs-main.service - Ceph cluster monitor daemon
>    Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
>   Drop-In: /lib/systemd/system/ceph-mon@.service.d
>            └─ceph-after-pve-cluster.conf
>    Active: *active (running) since Sun 2017-10-22 12:08:59 CEST; 1 day 5h ago*
>  Main PID: 24857 (ceph-mon)
>    CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@pve-hs-main.service
>            └─24857 /usr/bin/ceph-mon -f --cluster ceph --id pve-hs-main --setuser ceph --setgroup ceph
>
> Oct 22 12:08:59 pve-hs-main systemd[1]: Started Ceph cluster monitor daemon.
>
> *pve-hs-3$ systemctl status ceph-mon@pve-hs-3.service*
> ceph-mon@pve-hs-3.service - Ceph cluster monitor daemon
>    Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
>   Drop-In: /lib/systemd/system/ceph-mon@.service.d
>            └─ceph-after-pve-cluster.conf
>    Active: *active (running) since Sun 2017-10-22 12:07:43 CEST; 1 day 5h ago*
>  Main PID: 13077 (ceph-mon)
>     Tasks: 23
>    CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@pve-hs-3.service
>            └─13077 /usr/bin/ceph-mon -f --cluster ceph --id pve-hs-3 --setuser ceph --setgroup ceph
>
> At 17:28 I have this in syslog / journal of pve-hs-2
>
> Oct 23 17:38:47 pve-hs-2 kernel: [255282.309979] libceph: mon1 10.10.10.252:6789 session lost, hunting for new mon
>
> On the same node, my ceph-mon.pve-hs-2.log at 17:38 is
> https://pastebin.com/8BCUm5Mr
>
> Thanks
>
> On 23/10/2017 16:26, Alwin Antreich wrote:
> > Do the ceph-mon services restart when the session is lost?
> > What do you see in the ceph-mon.log on the failing mon node?
> >
> > --
> > Cheers,
> > Alwin
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

What do the other Ceph and syslog log files show? Please also check your
dmesg output; there may be a problem with your bond/LACP.

--
Cheers,
Alwin
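
A rough sketch of the dmesg and bond check suggested above. The interface name bond0 and the helper name filter_net_events are assumptions for illustration; neither appears in this thread. The helper only narrows kernel log output down to lines that mention bonding, LACP, link state changes, or a lost monitor session.

```shell
#!/bin/sh
# Keep only kernel log lines that mention bonding/LACP, link state changes,
# or a lost Ceph monitor session. Reads log lines on stdin.
# (filter_net_events is a hypothetical helper name.)
filter_net_events() {
    grep -Ei 'bond|lacp|link is (up|down)|session lost'
}

# On the affected node (bond0 is an assumed interface name):
#   dmesg -T | filter_net_events
#   cat /proc/net/bonding/bond0    # per-slave link state and LACP details
```

If link flaps show up around the same timestamps as the "session lost, hunting for new mon" messages, that would point at the network rather than at ceph-mon itself.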