indirectly related - pacemaker service

lejeczek <peljasz@xxxxxxxxxxx> · Wed, 29 May 2019 17:23:27 +0100

hi guys,

something I was hoping an expert could shed a bit more light on - I
have a pacemaker cluster composed of three nodes. One node always has a
problem with pacemaker - its tools would say things like:

$ crm_mon --one-shot
Connection to cluster failed: Transport endpoint is not connected
$ pcs status --all
Error: cluster is not currently running on this node

but systemd reports the relevant daemons as up and running, with one tiny
exception! On "working" nodes it's:

$ systemctl status -l pacemaker
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; disabled; vendor preset: disabled)
   Active: active (running) since Fri 2019-05-10 15:39:40 BST; 2 weeks 5 days ago
     Docs: man:pacemakerd
           https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html
 Main PID: 28664 (pacemakerd)
   CGroup: /system.slice/pacemaker.service
           ├─  28664 /usr/sbin/pacemakerd -f
           ├─  28670 /usr/libexec/pacemaker/cib
           ├─  28671 /usr/libexec/pacemaker/stonithd
           ├─  28672 /usr/libexec/pacemaker/lrmd
           ├─  28673 /usr/libexec/pacemaker/attrd
           ├─  28674 /usr/libexec/pacemaker/pengine
           ├─  28676 /usr/libexec/pacemaker/crmd
           ├─1503698 /bin/sh /usr/lib/ocf/resource.d/heartbeat/LVM monitor
           ├─1503717 /bin/sh /usr/lib/ocf/resource.d/heartbeat/LVM monitor
           ├─1503718 vgs -o tags --noheadings equalLogic-2.2
           └─1503719 tr -d  

but on that one single failing node:

$ systemctl status -l pacemaker.service 
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-05-29 17:08:40 BST; 2min 19s ago
     Docs: man:pacemakerd
           https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html
 Main PID: 48729 (pacemakerd)
    Tasks: 1
   Memory: 3.3M
   CGroup: /system.slice/pacemaker.service
           └─48729 /usr/sbin/pacemakerd -f

May 29 17:08:41 rider.private pacemakerd[48729]:   notice: Tracking existing cib process (pid=39234)
May 29 17:08:41 rider.private pacemakerd[48729]:   notice: Tracking existing stonithd process (pid=39235)
May 29 17:08:41 rider.private pacemakerd[48729]:   notice: Tracking existing lrmd process (pid=39236)
May 29 17:08:41 rider.private pacemakerd[48729]:   notice: Tracking existing attrd process (pid=39238)
May 29 17:08:41 rider.private pacemakerd[48729]:   notice: Tracking existing pengine process (pid=39240)
May 29 17:08:41 rider.private pacemakerd[48729]:   notice: Tracking existing crmd process (pid=39241)
May 29 17:08:41 rider.private pacemakerd[48729]:   notice: Quorum acquired

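You can clearly see the difference, right? On the failing node only
pacemakerd itself shows up in the service's cgroup, and the other daemons
are merely "tracked" as existing processes. The systems are virtually
identical: same Dell server model, same CentOS 7.6, and packages from the
same default repos.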

Does that difference between systemd's status output for pacemaker signify anything?
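
One way to narrow this down might be to check which cgroup the "existing"
daemons from the log above actually belong to. A minimal sketch, assuming
the log message format shown above; the function name tracked_pids is
made up for illustration and is not part of pacemaker or systemd:

```shell
# tracked_pids: hypothetical helper (not a pacemaker or systemd tool).
# Reads pacemakerd log lines on stdin and prints the PID from every
# "Tracking existing <name> process (pid=N)" message, one per line.
tracked_pids() {
    sed -n 's/.*Tracking existing .* process (pid=\([0-9][0-9]*\)).*/\1/p'
}

# Possible usage on the failing node (assumes the daemons are still alive):
#   journalctl -b -u pacemaker | tracked_pids | sort -u |
#       while read -r pid; do ps -o pid=,cgroup=,args= -p "$pid"; done
```

If those PIDs turn out to sit in a cgroup other than
/system.slice/pacemaker.service, that would at least be consistent with
systemd listing only pacemakerd for the unit.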

many thanks, L.

_______________________________________________
systemd-devel mailing list
systemd-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/systemd-devel