Re: Some ideas on changes to the lvm.sh agent (or new agents).

Great stuff! Much of what you are describing I've thought about in the past, but just haven't had the cycles to work on. You can see in the script itself that the comments at the top mention the desire to operate at the VG level. You can also see a couple of vg_* functions that simply return an error right now, but were intended to be filled in.
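Roughly speaking, stubs of this sort (illustrative only; the real function names and return codes in the script may differ):

# Illustrative sketch of the kind of placeholders I mean, not the exact code.
vg_start()
{
	# Intended to activate the whole VG; not implemented yet.
	return $OCF_ERR_GENERIC
}

vg_status()
{
	# Intended to check the whole VG; not implemented yet.
	return $OCF_ERR_GENERIC
}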

Comments in-line.

On Sep 28, 2007, at 11:14 AM, Simone Gotti wrote:

Hi,

Trying to use a non-clustered VG in Red Hat Cluster, I noticed that lvm.sh, to avoid metadata corruption, forces the use of only one LV per VG.

I was thinking that other cluster stacks don't have this limitation: they
simply let you use a VG on only one node at a time (and in only one
service group at a time).

To test whether this was possible with lvm2, I made small changes to lvm.sh
(just variable renames and the use of vgchange instead of lvchange for tag
adding), and with the corresponding changes to /etc/lvm/lvm.conf
(volume_list = [ "rootvgname", "@my_hostname" ]) the idea appears to work.

I can activate the VG and all of its volumes only on the node whose hostname the VG
is tagged with; starting it on the other nodes is refused.
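Concretely, the mechanism boils down to roughly this (a sketch with placeholder names; the exact commands in my changes may differ):

# On every node, /etc/lvm/lvm.conf restricts activation:
#   volume_list = [ "rootvgname", "@my_hostname" ]

# On the node that should own the VG (done by the agent on start):
vgchange --addtag $(hostname) vg01   # tag the whole VG instead of a single LV
vgchange -a y vg01                   # activates every LV in vg01

# On any other node, "vgchange -a y vg01" is refused because the VG's tag
# does not match that node's volume_list.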

Now, would this idea be accepted? If so, here is a list of possible
changes and other ideas:

*) Make <parameter name="vg_name" required="1"> also unique="1", or
better primary="1", and remove the "name" parameter, since only one service
can use a VG.

Sounds reasonable. Be careful when using those parameters, though; they often result in cryptic error messages that are tough to follow. I do checks in lvm.sh where possible so I can give the user more information on what went wrong.
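For example, something along these lines (a hypothetical sketch, not the actual code in lvm.sh; vg_verify and the placeholder names are made up for illustration):

vg_verify()
{
	if [ -z "$OCF_RESKEY_vg_name" ]; then
		ocf_log err "lvm_vg: no volume group name specified"
		return $OCF_ERR_ARGS
	fi

	if ! vgs "$OCF_RESKEY_vg_name" >/dev/null 2>&1; then
		ocf_log err "lvm_vg: volume group $OCF_RESKEY_vg_name not found"
		return $OCF_ERR_ARGS
	fi

	return 0
}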


*) What should vg_status do?
a) Monitor all the LVs,
	or
b) Check only the VG and use ANOTHER resource agent for every LV used by
the cluster? That way I could create/remove/modify LVs in that VG that aren't
under rgmanager control without any error being reported by the status
functions of the lvm.sh agent.
Other cluster stacks also distinguish between VG and LV and have two
different agents for them.

This is where things get difficult. It would be okay to modify LVs in that VG as long as it's done on the machine that has ownership. Tags should prevent anything else, so that should be fine.
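Just to make the two options concrete, something like this (a hypothetical sketch, not a patch; the helper and variable names are made up):

# (a) monitor every LV in the VG
vg_status_all_lvs()
{
	local lv
	for lv in $(lvs --noheadings -o lv_name "$OCF_RESKEY_vg_name"); do
		lv_status_single "$OCF_RESKEY_vg_name/$lv" || return $OCF_ERR_GENERIC
	done
	return 0
}

# (b) check only that the VG is visible and tagged for this node; per-LV
#     checks would belong to a separate lvm_lv agent
vg_status_vg_only()
{
	vgs --noheadings -o vg_tags "$OCF_RESKEY_vg_name" 2>/dev/null | \
		grep -q "$(hostname)" || return $OCF_ERR_GENERIC
	return 0
}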

Users would have to be careful not to assign (or barriers would have to prevent them from assigning) different LVs in the same VG to different services. Otherwise, if a service fails (at the application level) and must be moved to a different machine, we would have to find a way to move all services associated with that VG to the next machine. I think there are ways to mandate this (that service A stick with service B), but we would need a way to enforce it.

Creating two new agents would also leave the current lvm.sh unchanged
and keep backward compatibility for anyone already using it.

Something like this (let's call the agents for the VG and the LV lvm_vg
and lvm_lv, respectively):

<service name="foo">
  <lvm_vg vgname="vg01">
     <lvm_lv lvname="lv01"/>
     <lvm_lv lvname="lv02"/>
     <script .... />
  </lvm_vg>
</service>

I'd have to be convinced of this... I'd hate to see two 'lvm_lv's that refer to the same VG in different services... or would lvm_lv somehow require a parent lvm_vg, and make lvm_vg have a 'unique=1'? In any case, we are forced to move an entire VG at a time, so I'm not sure it would make sense to break it up.

*) Another problem that exists right now is that LVM should be
changed to refuse any operation on a VG or LV that cannot be activated. Today
you cannot start a VG/LV because it isn't tagged with the
hostname, but you can remove/resize it without any problem. :D

Yes, the tag handling needs to be tightened up. You should never be able to alter LVM metadata from a machine that does not have permission.
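To illustrate the loophole with placeholder names (commands only; actual LVM output omitted):

vgchange -a y vg01          # refused: vg01 is tagged with another node's hostname
lvremove vg01/lv01          # succeeds anyway -- metadata changes are not gated by the tag
lvresize -L +1G vg01/lv01   # likewise succeeds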

Another improvement I'd like to see is automatic VG ownership. Basically, tighten up the tagging as stated above and add a unique machine-specific tag to new VGs by default. (For clusters, this tag could be the cluster name.) This does a number of things, but primarily:
1) restricts access to the machine (or cluster) that created the VG.
- Now if you move a hard drive to a different machine, it won't conflict with the other hard drives there (which might otherwise cause confusion over which device is the actual root volume).
2) simplifies HA LVM (lvm.sh + rgmanager).
- There would no longer be a need to edit lvm.conf to add the machine name, etc. One of the questions surrounding this is how you generate the unique ID, and when. You would probably need it upon installation...
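Expressed with today's commands, the idea is roughly this (a sketch with placeholder names; the proposal is that LVM itself would add the ownership tag at creation time):

vgcreate vg01 /dev/sdb1
vgchange --addtag my_cluster_name vg01   # or a machine-specific ID on a stand-alone box
# Activation and metadata changes would then be refused on any machine that
# is not part of "my_cluster_name", without per-host edits to lvm.conf.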

There are many kinds of fault scenarios. Thinking through each one of them will be important when trying to implement a by-VG model vs a by-LV model. I usually stumble across an issue that gives me pause before hastily adding a new feature, but I would welcome further thought/code.

 brassow

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
