On 8/20/21 4:35 AM, Steven Whitehouse wrote:
Hi,
On Thu, 2021-08-19 at 21:40 +0200, Andreas Gruenbacher wrote:
From: Bob Peterson <rpeterso@xxxxxxxxxx>
This patch introduces a new HIF_MAY_DEMOTE flag and infrastructure that
will allow glocks to be demoted automatically on locking conflicts.
When a locking request comes in that isn't compatible with the locking
state of a holder and that holder has the HIF_MAY_DEMOTE flag set, the
holder will be demoted automatically before the incoming locking
request is granted.
I'm not sure I understand what is going on here. When there are locking
conflicts we generate call backs and those result in glock demotion.
There is no need for a flag to indicate that I think, since it is the
default behaviour anyway. Or perhaps the explanation is just a bit
confusing...
I agree that the whole concept and explanation are confusing. Andreas
and I went through several heated arguments about the semantics,
comments, patch descriptions, etc. We played around with many different
flag name ideas, etc. We did not agree on the best way to describe the
whole concept. He didn't like my explanation and I didn't like his. So
yes, it is confusing.
My preferred terminology was "DOD" or "Dequeue On Demand" which makes
the concept more understandable to me. So basically a process can say
"I need to hold this glock, but for an unknown and possibly lengthy
period of time, but please feel free to dequeue it if it's in your way."
And bear in mind that several processes may do the same, simultaneously.
You can almost think of this as a performance enhancement. This concept
allows a process to hold a glock for much longer periods of time, at a
lower priority, for example, when gfs2_file_read_iter needs to hold the
glock for very long-running iterative reads.
A process requesting a holder with "Dequeue On Demand" must, after its
lengthy operation, determine whether its holder has been stolen away
(dequeued on demand), and if so, pick up the pieces of where it left
off, as in the sketch below.
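To make the pattern concrete, here is a rough sketch of what such a
long-running caller could look like. It uses the
gfs2_holder_allow_demote() / gfs2_holder_disallow_demote() helpers from
the patch set; gfs2_holder_queued(), perform_long_running_read() and
more_work_to_do() are stand-in names I'm assuming for illustration, not
code from the patch:

/* Sketch only; assumes the usual gfs2 in-kernel declarations. */
static ssize_t do_long_read(struct gfs2_inode *ip)
{
        struct gfs2_holder gh;
        ssize_t ret;

        gfs2_holder_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
retry:
        ret = gfs2_glock_nq(&gh);
        if (ret)
                goto out;

        /* From here on, a conflicting request may dequeue this holder. */
        gfs2_holder_allow_demote(&gh);

        ret = perform_long_running_read(ip);

        /* Stop demote-on-demand before checking what happened. */
        gfs2_holder_disallow_demote(&gh);

        if (!gfs2_holder_queued(&gh)) {
                /* The holder was dequeued on demand. */
                if (more_work_to_do(ret))
                        goto retry;     /* pick up where we left off */
        } else {
                gfs2_glock_dq(&gh);
        }
out:
        gfs2_holder_uninit(&gh);
        return ret;
}

The important detail is that a stolen holder only loses its queued
state; the holder structure itself stays initialized throughout.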
Meanwhile, another process may need to hold the glock. If its requested
mode is compatible, say SH and SH, the lock is simply granted with no
further delay. If the mode is incompatible, regardless of whether it's
on the local node or a different node in the cluster, these
longer-term/lower-priority holders may be dequeued or preempted by
another request to hold the glock. Note that although these holders are
dequeued-on-demand, they are never "uninitted" as part of the process.
Nor must they ever be, since they may be on another process's heap.
This differs from the normal glock demote process, in which the demote
bit is set ("requesting" that the glock be demoted) but the incoming
request still has to block until the holder does its actual dequeue.
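As a toy illustration of that decision (these names are made up for
this email; this is not the actual glock state machine code):

#include <stdbool.h>

enum toy_mode { TOY_SH, TOY_EX };

struct toy_holder {
        enum toy_mode mode;
        bool may_demote;        /* holder agreed to be dequeued on demand */
        bool queued;            /* still on the glock's holder queue */
};

static bool toy_compatible(enum toy_mode a, enum toy_mode b)
{
        return a == TOY_SH && b == TOY_SH;
}

/* What happens to an existing holder when a new request arrives. */
static bool toy_grant(struct toy_holder *h, enum toy_mode requested)
{
        if (toy_compatible(h->mode, requested))
                return true;            /* e.g. SH and SH: grant at once */

        if (h->may_demote) {
                /*
                 * Dequeue on demand: remove the holder from the queue,
                 * but never uninit it, since it may be on another
                 * process's stack or heap.
                 */
                h->queued = false;
                return true;
        }

        /* Normal path: request a demote and make the new request wait. */
        return false;
}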
Processes that allow a glock holder to be taken away indicate this by
calling gfs2_holder_allow_demote(). When they need the glock again,
they call gfs2_holder_disallow_demote() and then they check if the
holder is still queued: if it is, they're still holding the glock; if
it isn't, they need to re-acquire the glock.
This allows processes to hang on to locks that could become part of a
cyclic locking dependency. The locks will be given up when a (rare)
conflicting locking request occurs, and don't need to be given up
prematurely.
This seems backwards to me. We already have the glock layer cache the
locks until they are required by another node. We also have the min
hold time to make sure that we don't bounce locks too much. So what is
the problem that you are trying to solve here I wonder?
Again, this is simply allowing preemption of lengthy/low-priority holders
whereas the normal demote process will only demote when the glock is
dequeued after this potentially very-long period of time.
The minimum hold time solves a different problem, and Andreas and I
talked just yesterday about possibly revisiting how that all works. The
problem with minimum hold time is that in many cases the glock state
machine does not want to grant new holders if the demote bit is on, so
it ends up wasting more time than solving the actual problem.
But that's another problem for another day.
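For what it's worth, a toy picture of that interaction (again,
hypothetical names, not the glock state machine):

#include <stdbool.h>

struct toy_glock {
        bool demote_requested;          /* a callback set the demote bit */
        unsigned long hold_until;       /* end of the minimum hold time */
};

static bool toy_may_grant_new_holder(const struct toy_glock *gl)
{
        /*
         * Even while the minimum hold time keeps the lock on this node,
         * a pending demote request blocks new grants, so the remaining
         * hold time is spent waiting rather than doing useful work.
         */
        return !gl->demote_requested;
}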
Regards,
Bob Peterson