Hi, as reported in the previous mail I tried to implement a new VG-based rgmanager agent for non-clustered lvm2 VGs. It's a sort of experiment: I tried to implement some ideas in it and I'd like to get some suggestions on them before continuing to work on it (it's quite probable that I made wrong assumptions or missed some problems). I did some tests with various scenarios, for example: all machines losing the same PVs; only one machine losing PVs; removing an LV; removing the whole VG; etc. But to be sure that there aren't problems I have to run many more tests and also try to document them. Below are the most important points. Thanks! Bye!

===================================================================================

What you can/cannot do with it:

*) Only one service can own a VG (this mimics other clusters).

*) In the parameter lv_name you can define a space separated list of volumes to start/monitor (is this OCF/xml compliant?). If it's empty, all the LVs are started/monitored. On stop all the volumes are stopped anyway. (A rough configuration sketch is included after the bug list below.)

=====

How it's implemented:

*) Take control of the whole VG instead of a single LV, by tagging the VG and not the LVs.

*) Optional (can be removed if not wanted): I tried to allow adding additional tags to the VG/LVs. The tags used to "lock" the VG to a node should be of the form CMAN_NODE_${nodename}, so the "volume_list" in /etc/lvm/lvm.conf should be changed in the same way. (See the tag handling sketch after the bug list below.)

*) Check if, for various reasons (manual intervention, race condition), the VG has multiple node tags on it.

*) Check that the LVs aren't "node" tagged, to avoid that they can be activated on other nodes too.

*) The service is defined as unique="1", as only one service should use one VG.

*) Shortened the status intervals (also for debugging purposes; they can be increased to better values).

=====

Bugs found in the current lvm.sh that I tried to fix:

Note: with mirror devices, dmeventd runs "vgreduce --removemissing" only when a write fails. During reads nothing is done by dmeventd on the VG, so all programs like vgs/lvs will fail.

*) If one or more PVs are missing when the service is started, then vgs/lvs will return an error and not provide the tags. The script wrongly assumes that the VG (previously LV) isn't owned, tries to steal it, but the tagging fails and so it runs "vgreduce --removemissing". But the VG can be active on another machine.
Solution: first check for missing PVs, get the tags in partial mode and, if the VG is not owned, try to fix it by issuing a "vgreduce --removemissing" (see the sketch after the bug list).

*) If one or more PVs are missing when the service is stopped, then vgs will return an error and not provide the tags. The script wrongly assumes that the VG (previously LV) isn't owned, tries to steal it, but the tagging fails and so the stop fails too, putting the service in a failed status that needs manual intervention.
Solution: first check for missing PVs, get the tags in partial mode and, if the VG is not owned, try to fix it by issuing a "vgreduce --removemissing".

*) If one or more PVs are missing after the start, the status check will assume that the node isn't the owner, as it cannot get the tags, and will try to change the VG; it then returns an error that leads to a service restart.
Solution: first check for missing PVs and, if so, return an error. Introduce a recover action that does the same as start. This will fix the VG.

*) If one machine loses some disks, dmeventd or the start/stop scripts will call "vgreduce --removemissing". Then, if the service is switched to another machine that sees all the disks, or the disks come back, every lvm command launched issues the warning "Inconsistent metadata found for VG %s - updating to use version %s".
Solution: avoid this by calling lvm_exec_resilient on all commands, except on some vgs calls made with --partial or that need to get the real state (vgs doesn't report "Inconsistent metadata found for VG %s - updating to use version %s" anyway, so no problem there). As I also needed to get some output echoed (returned) by lvm_exec_resilient, I redirected all the ocf_log output of lvm_exec_resilient to stderr (see the sketch below).
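To make the lv_name list behaviour easier to picture, here is a rough cluster.conf fragment. It's only an illustration: the <lvm_vg> element name, the service name and the volume names are placeholders I invented here; the real ones depend on how the agent ends up being registered in rgmanager.

  <rm>
    <service autostart="1" name="data_service">
      <!-- lv_name is a space separated list; leave it empty to have
           the agent start/monitor every LV in the VG -->
      <lvm_vg name="data_vg" vg_name="data_vg" lv_name="lv_home lv_mail">
        <fs name="home_fs" device="/dev/data_vg/lv_home"
            mountpoint="/export/home" fstype="ext3"/>
      </lvm_vg>
    </service>
  </rm>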
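As an illustration of the node-tag locking and of the volume_list filter mentioned above, here is a minimal sketch. The CMAN_NODE_${nodename} naming comes from this mail; local_node_name and OCF_RESKEY_vg_name are just the helper/parameter names I'm assuming here, and the attached script does more checking than this.

  # /etc/lvm/lvm.conf must then restrict activation to local volumes plus
  # VGs carrying the local node tag ("@" selects a tag in volume_list), e.g.:
  #
  #   activation {
  #       volume_list = [ "rootvg", "@CMAN_NODE_node1" ]
  #   }

  vg_owner_tags()
  {
      # print the CMAN_NODE_* tags currently set on the VG
      vgs --noheadings -o vg_tags "$OCF_RESKEY_vg_name" 2>/dev/null | \
          tr ',' '\n' | grep "CMAN_NODE_"
  }

  vg_take_ownership()
  {
      local my_tag="CMAN_NODE_$(local_node_name)"
      local tag

      # remove stale node tags (manual intervention, race condition, ...)
      # before adding our own, so the VG never carries two node tags
      for tag in $(vg_owner_tags); do
          [ "$tag" = "$my_tag" ] && continue
          vgchange --deltag "$tag" "$OCF_RESKEY_vg_name" || return 1
      done
      vgchange --addtag "$my_tag" "$OCF_RESKEY_vg_name" || return 1

      # activation now succeeds only on this node thanks to volume_list
      vgchange -a y "$OCF_RESKEY_vg_name"
  }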
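For the missing-PV cases above, the start path I have in mind is roughly the following; it's a simplified sketch (stop and status follow the same pattern and are handled in the attached script), and vg_is_partial is just an illustrative helper name.

  vg_is_partial()
  {
      # the 4th character of vg_attr is "p" when one or more PVs are
      # missing; --partial lets vgs answer even with PVs gone
      local attr
      attr=$(vgs --partial --noheadings -o vg_attr "$OCF_RESKEY_vg_name" \
             2>/dev/null | tr -d ' ')
      [ "${attr:3:1}" = "p" ]
  }

  vg_start()
  {
      local my_tag="CMAN_NODE_$(local_node_name)"
      local tags

      if vg_is_partial; then
          # read the tags in partial mode instead of wrongly assuming
          # "not owned" just because a plain vgs call failed
          tags=$(vgs --partial --noheadings -o vg_tags \
                 "$OCF_RESKEY_vg_name" 2>/dev/null)

          if echo "$tags" | grep -q "CMAN_NODE_" && \
             ! echo "$tags" | grep -q "$my_tag"; then
              # owned by another node, which may still have it active
              return 1
          fi

          # unowned, or owned by us: drop the missing PVs so that the
          # tagging and activation below can work again
          vgreduce --removemissing "$OCF_RESKEY_vg_name" || return 1
      fi

      vg_take_ownership
  }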
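About the lvm_exec_resilient change, the point is only that the wrapper must keep stdout clean, because the caller sometimes needs the command output (e.g. the tags); something like the sketch below, where the retry count and sleep are arbitrary and the real wrapper in lvm.sh differs in the details.

  lvm_exec_resilient()
  {
      local try=0 rv=1

      while [ $try -lt 3 ]; do
          # send the logging to stderr so that the caller can still do
          # output=$(lvm_exec_resilient ...) and get only the command output
          ocf_log debug "executing: $*" >&2

          "$@" && return 0
          rv=$?

          ocf_log err "command failed (attempt $try): $*" >&2
          sleep 2
          try=$((try + 1))
      done

      return $rv
  }

  # example: read the tags through the resilient wrapper
  # tags=$(lvm_exec_resilient vgs --noheadings -o vg_tags "$OCF_RESKEY_vg_name")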
=====

Things to do:

*) Maybe increase the start/stop/status timeouts (5 seconds looks too short for a "vgreduce --removemissing" on big VGs).

*) Implement better validation of the parameters.

On Wed, 2007-10-03 at 09:57 -0500, Jonathan Brassow wrote:
> Great stuff! Much of what you are describing I've thought about in
> the past, but just haven't had the cycles to work on. You can see in
> the script itself, the comments at the top mention the desire to
> operate on the VG level. You can also see a couple vg_* functions
> that simply return error right now, but were intended to be filled in.
>
> brassow

--
Simone Gotti
Attachment:
lvm_vg.sh.20061003-2330
Description: application/shellscript