Re: [Linux-cluster] DLM behavior after lockspace recovery

Daniel Phillips <phillips@xxxxxxxxxx> · Sat, 16 Oct 2004 20:50:04 -0400

On Saturday 16 October 2004 17:14, Jeff wrote:
> In your example of a counter which tracks the # of operations
> in progress, regenerating the LVB value during failover from
> the last known good value among the surviving nodes doesn't
> do any good. There is no way to avoid recalculating the correct
> value during the failover process.
>
> OTOH, in my example where the value in the lock value block
> is used as a block version # it makes perfect sense to use
> the last known value from the surviving nodes.

Going back over the early part of the thread, you weren't originally
advocating this; you just thought it might be the way VAXcluster did
it, and that RSB$L_VALSEQNUM might have something to do with it.

Let's keep hunting around for a way of handling this with flags alone, 
ok?  Sequencing the lvbs is a rather, ahem, heavyweight approach that 
consumes memory in every lvb user (more probably, every dlm user 
whether they use lvbs or not) and benefits only a small subset of lvb 
applications.  This extra sequence number has to travel over the net, 
through sockets and through various other interface bits.  More bloat 
in this department has to be seen as a bad thing.

It seems to me that the VALNOTVALID flag by itself isn't enough for you 
because one of your nodes might update the lvb, and consequently some 
other node may not ever see the VALNOTVALID flag, and therefore not 
know that it should reset its cached counter.  So how about an 
additional flag, say, INVALIDATED, that the lock master hands out to 
any lvb reader the first time it reads an lvb for which recovery was 
not possible, whether the lvb was subsequently written or not.  Your 
application looks at INVALIDATED to know that it has to reset its 
counter and ignores VALNOTVALID.  Does this work for you?
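
To make the proposed semantics concrete, here is a rough sketch in
Python. It is a toy model under stated assumptions, not gdlm code:
LockMaster, recover(), write_lvb() and read_lvb() are names invented
for illustration, and only the two flag names come from this thread.
It also assumes the master remembers, per node, whether INVALIDATED
has been delivered yet.

```python
# Toy model only: LockMaster and its methods are hypothetical names,
# not gdlm interfaces. Only the two flag names come from the thread.
VALNOTVALID = 0x1  # lvb was lost in recovery and not rewritten since
INVALIDATED = 0x2  # lvb was lost in recovery since this node last read it


class LockMaster:
    """State the lock master would keep for one resource's lvb."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.lvb = 0
        self.valnotvalid = False
        self.unnotified = set()  # nodes that have not yet seen INVALIDATED

    def recover(self, lvb_recovered):
        """Run at the end of lockspace recovery."""
        if not lvb_recovered:
            self.lvb = 0  # arbitrary reset value
            self.valnotvalid = True
            self.unnotified = set(self.nodes)

    def write_lvb(self, value):
        """A write makes the contents valid again but notifies nobody."""
        self.lvb = value
        self.valnotvalid = False

    def read_lvb(self, node):
        """Return (value, flags); INVALIDATED is delivered once per node."""
        flags = 0
        if self.valnotvalid:
            flags |= VALNOTVALID
        if node in self.unnotified:
            flags |= INVALIDATED
            self.unnotified.discard(node)
        return self.lvb, flags


master = LockMaster(["a", "b"])
master.recover(lvb_recovered=False)
master.write_lvb(7)                      # node a rewrites the lvb first
value, flags = master.read_lvb("b")      # b still learns about the loss
print(value, bool(flags & INVALIDATED), bool(flags & VALNOTVALID))
```

In this run node b reads the rewritten value with VALNOTVALID already
cleared, yet still gets INVALIDATED once, which is the cue for the
application to reset its cached counter.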

I agree that VALNOTVALID is a useful flag that we should have, but it
isn't useful here.  I also agree with your discomfort about setting the
lvb arbitrarily to zero, but having a flag to detect that reset reduces
the annoyance considerably, don't you think?

> Another example is a lock whose LVB doesn't change once it has
> been initialized. In this case it doesn't matter whether the
> value block is marked invalid or not. The contents are still
> useful.

But this value could still disappear if all the readers drop the lock, 
so presumably you have a way of recovering it, in which case the above 
flags would work for this problem as well.

Also, in this case, randomly picking a value from one of the surviving
nodes (and setting VALNOTVALID) will do as well as a sequence number
scheme.  Such an API change would require changes to the gfs harness
plugin, which isn't such a big deal at this point since gfs+gdlm is
still pre-alpha.
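
The "pick any surviving copy" fallback mentioned above can be sketched
as follows. This is illustrative only: rebuild_lvb() is a hypothetical
helper, the 32-byte default length is an assumption, and zeroing the
lvb when no copy survives is just one possible choice.

```python
import random

VALNOTVALID = 0x1  # contents are a best guess, not a recovered value


def rebuild_lvb(survivor_copies, lvb_len=32, rng=random):
    """Pick any lvb copy cached by a surviving node (hypothetical helper).

    Returns (lvb, flags). VALNOTVALID is always set, because the master
    cannot tell whether the chosen copy was the latest one.
    """
    copies = list(survivor_copies)
    if not copies:
        return bytes(lvb_len), VALNOTVALID  # nothing survived: zero it
    return rng.choice(copies), VALNOTVALID


# A write-once lvb: every surviving copy is identical, so any pick works.
copies = [b"layout-v1"] * 3
lvb, flags = rebuild_lvb(copies)
```

For an lvb that never changes after initialization, the random pick
always returns the right contents, so a sequence number adds nothing.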

Regards,

Daniel