On Tue, Sep 16, 2008 at 01:29, Jeff Stoner <jstoner@xxxxxxxxxxxx> wrote:
>> -----Original Message-----
>> It is also the detail of status/monitor which implementers get most
>> frequently wrong. "But it's either running or not!" ... Which
>> is clearly not true, or at least such a case couldn't protect
>> against certain failure modes. (Such as multiple-active on several
>> nodes, which is likely to be _also_ failed.)
>
> Ok. I think I understand where the confusion lies.
>
> LSB is strictly for init scripts.
> OCF is strictly for a cluster-managed resource.

This is an unnecessary distinction.

A true LSB script, by which I mean one that follows _all_ the LSB
guidelines for status^, is perfectly adequate for clustering.
And an OCF script that implements sane parameter defaults can, for the
most part, be easily used as an init script for stopping and starting
services.

^ http://refspecs.linux-foundation.org/LSB_3.2.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html

[quote]
If the status action is requested, the init script will return the
following exit status codes.

0        program is running or service is OK
1        program is dead and /var/run pid file exists
2        program is dead and /var/lock lock file exists
3        program is not running
4        program or service status is unknown
5-99     reserved for future LSB use
100-149  reserved for distribution use
150-199  reserved for application use
200-254  reserved
[end quote]

>
> They are similar but have significant differences. For example, LSB
> scripts are required to implement a 'status' action while OCF scripts
> are required to implement a 'monitor' action.

Btw. Lars was one of the primary authors of the OCF spec... he's
pretty familiar with it and how it differs from LSB ;-)

> This difference alone
> means, technically, you can't interchange LSB and OCF scripts unless
> they implement both (in some fashion.)
>
> I think this is the missing link in our conversation: the script
> resource type in Cluster Services is an attempt to make a LSB-compliant
> script into a OCF-compliant script. So, the /usr/share/cluster/script.sh
> expects the script you specify to behave like an LSB script, not an OCF
> script. As such, the script resource type falls back to LSB conventions
> and uses a binary approach to a resource's start/stop/status actions:
> zero for success and non-zero for any failure. Other resource types
> (file system, nfs, ip, mysql, samba, etc.) may implement full OCF RA API
> exit codes.
>
> Does this help? I'm guessing this was what Lars was asking about.

In case of interest, our equivalent is the LRMd which understands both
standards (as well as the old Heartbeat style ones) and hides any
differences from OCF. So any cluster manager using it can treat
everything as an OCF resource.

Part of our resource definition is a "class" field which the admin
uses to tell the LRMd what standard to use for a given resource and it
will automagically map everything from and to OCF as required (eg. by
calling status instead of monitor for LSB and remapping the return
codes).
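
To make the "call status instead of monitor and remap the return
codes" idea concrete, here is a rough Python sketch of that kind of
translation layer. It is not the LRMd's code: the function names
(to_native_action, to_ocf_rc) and the remapping table are assumptions
made up for illustration. The OCF numbers are the usual ones
(0 success, 1 generic error, 7 not running). Treating LSB status 1
and 2 (dead, but a pid or lock file was left behind) as a failure
rather than a clean stop is one plausible choice, not necessarily
what any real implementation does.

```python
# Hypothetical sketch of a class-aware shim; NOT the actual LRMd logic.

# OCF return codes used in this sketch.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7

# LSB 'status' exit codes, from the LSB 3.2 text quoted earlier.
LSB_STATUS_RUNNING = 0
LSB_STATUS_NOT_RUNNING = 3


def to_native_action(resource_class, action):
    """Translate an OCF action name into the one the script understands."""
    if resource_class == "lsb" and action == "monitor":
        return "status"  # LSB scripts implement 'status', not 'monitor'
    return action


def to_ocf_rc(resource_class, action, rc):
    """Remap a script's exit code into OCF terms (assumed mapping)."""
    if resource_class != "lsb" or action != "monitor":
        # Assumption: everything except an LSB status check passes through.
        return rc
    if rc == LSB_STATUS_RUNNING:
        return OCF_SUCCESS
    if rc == LSB_STATUS_NOT_RUNNING:
        return OCF_NOT_RUNNING
    # 1 and 2 (dead with stale pid/lock file), 4 (unknown) and anything
    # else are reported as a failure, not as a clean stop.
    return OCF_ERR_GENERIC
```

With this in place a caller only ever asks for "monitor" and only
ever sees OCF codes: to_native_action("lsb", "monitor") gives
"status", and an LSB exit code of 3 comes back from to_ocf_rc as 7.
A script that only ever returns 0 or 1 for status would still work,
but a clean stop and a crash would then look the same to the cluster.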