Re: Continuous error: "libceph: monX session lost, hunting for new mon" on one host

On 30/10/2017 10:31, Alwin Antreich wrote:
Hello Marco,

On Mon, Oct 23, 2017 at 05:48:10PM +0200, Marco Baldini - H.S. Amiata wrote:
Hello

The ceph-mon services do not restart on any node. Yesterday I manually restarted
ceph-mon and ceph-mgr on every node, and they have not restarted since.

*pve-hs-2$ systemctl status ceph-mon@pve-hs-2.service*
  ceph-mon@pve-hs-2.service - Ceph cluster monitor daemon
    Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
   Drop-In: /lib/systemd/system/ceph-mon@.service.d
            └─ceph-after-pve-cluster.conf
    Active: *active (running) since Sun 2017-10-22 12:04:22 CEST; 1 day 5h ago*
  Main PID: 24825 (ceph-mon)
     Tasks: 23
    CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@pve-hs-2.service
            └─24825 /usr/bin/ceph-mon -f --cluster ceph --id pve-hs-2 --setuser ceph --setgroup ceph

Oct 22 12:04:22 pve-hs-2 systemd[1]: Stopped Ceph cluster monitor daemon.
Oct 22 12:04:22 pve-hs-2 systemd[1]: Started Ceph cluster monitor daemon.

*pve-hs-main$ systemctl status ceph-mon@pve-hs-main.service*
  ceph-mon@pve-hs-main.service - Ceph cluster monitor daemon
    Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
   Drop-In: /lib/systemd/system/ceph-mon@.service.d
            └─ceph-after-pve-cluster.conf
    Active: *active (running) since Sun 2017-10-22 12:08:59 CEST; 1 day 5h ago*
  Main PID: 24857 (ceph-mon)
    CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@pve-hs-main.service
            └─24857 /usr/bin/ceph-mon -f --cluster ceph --id pve-hs-main --setuser ceph --setgroup ceph

Oct 22 12:08:59 pve-hs-main systemd[1]: Started Ceph cluster monitor daemon.

*pve-hs-3$ systemctl status ceph-mon@pve-hs-3.service*
  ceph-mon@pve-hs-3.service - Ceph cluster monitor daemon
    Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor preset: enabled)
   Drop-In: /lib/systemd/system/ceph-mon@.service.d
            └─ceph-after-pve-cluster.conf
    Active: *active (running) since Sun 2017-10-22 12:07:43 CEST; 1 day 5h ago*
  Main PID: 13077 (ceph-mon)
     Tasks: 23
    CGroup: /system.slice/system-ceph\x2dmon.slice/ceph-mon@pve-hs-3.service
            └─13077 /usr/bin/ceph-mon -f --cluster ceph --id pve-hs-3 --setuser ceph --setgroup ceph


At 17:28 I have this in the syslog / journal of pve-hs-2:

Oct 23 17:38:47 pve-hs-2 kernel: [255282.309979] libceph: mon1 10.10.10.252:6789 session lost, hunting for new mon

On the same node, my ceph-mon.pve-hs-2.log at 17:38 is here:
https://pastebin.com/8BCUm5Mr

Thanks




On 23/10/2017 16:26, Alwin Antreich wrote:
Do the ceph-mon services restart when the session is lost?
What do you see in the ceph-mon.log on the failing mon node?

--
Cheers,
Alwin

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

What is in the other ceph/syslog log files? Please also check your
dmesg; maybe there is a problem with your bond/LACP.


Actually, after some server reboots the problem seems to have solved itself. That is strange, because there has been no change in the server or network configuration.

Only yesterday I saw this in dmesg -xe:

kern  :warn  : [Oct29 06:39] libceph: mon2 10.10.10.253:6789 socket closed (con state OPEN)
kern  :info  : [  +0.000029] libceph: mon2 10.10.10.253:6789 session lost, hunting for new mon
kern  :info  : [  +0.031530] libceph: mon0 10.10.10.251:6789 session established


On the other nodes there were no warnings or errors at that time.
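
To compare nodes over a longer period, the "session lost" messages can be counted per monitor. The following is only a minimal sketch, assuming the kernel lines look like the ones quoted in this thread; the function name count_session_lost and the sample list are made up for illustration and are not part of any Ceph tool.

```python
import re
from collections import Counter

# Matches kernel messages of the form quoted in this thread, e.g.
#   libceph: mon1 10.10.10.252:6789 session lost, hunting for new mon
# (assumption: IPv4 monitor addresses, as in the logs above)
SESSION_LOST = re.compile(
    r"libceph: (mon\d+) (\d+\.\d+\.\d+\.\d+:\d+) session lost"
)


def count_session_lost(lines):
    """Count 'session lost' events per (monitor name, address)."""
    counts = Counter()
    for line in lines:
        match = SESSION_LOST.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Sample input: the kernel lines quoted in this thread.
sample = [
    "Oct 23 17:38:47 pve-hs-2 kernel: [255282.309979] libceph: mon1 10.10.10.252:6789 session lost, hunting for new mon",
    "kern  :warn  : [Oct29 06:39] libceph: mon2 10.10.10.253:6789 socket closed (con state OPEN)",
    "kern  :info  : [  +0.000029] libceph: mon2 10.10.10.253:6789 session lost, hunting for new mon",
    "kern  :info  : [  +0.031530] libceph: mon0 10.10.10.251:6789 session established",
]

for (mon, addr), n in sorted(count_session_lost(sample).items()):
    print(mon, addr, n)
# prints:
# mon1 10.10.10.252:6789 1
# mon2 10.10.10.253:6789 1
```

In practice the lines would come from the saved output of dmesg or the journal on each node rather than from a hard-coded list. A monitor that shows up far more often than the others on a single host would point towards that host's link to it.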

I think the problem is solved. I don't know how, but ceph is running fine now.

Thanks










