Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file

Jason Low <jason.low2@xxxxxx> · Thu, 26 Sep 2013 15:42:13 -0700

On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote:
> On Thu, 2013-09-26 at 14:09 -0700, Jason Low wrote:
> > On Thu, 2013-09-26 at 13:40 -0700, Davidlohr Bueso wrote:
> > > On Thu, 2013-09-26 at 13:23 -0700, Jason Low wrote:
> > > > On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote:
> > > > > On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote:
> > > > > > On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> wrote:
> > > > > > > We will need the MCS lock code for doing optimistic spinning for rwsem.
> > > > > > > Extracting the MCS code from mutex.c and putting it into its own file allows us
> > > > > > > to reuse this code easily for rwsem.
> > > > > > >
> > > > > > > Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> > > > > > > Signed-off-by: Davidlohr Bueso <davidlohr@xxxxxx>
> > > > > > > ---
> > > > > > >  include/linux/mcslock.h |   58 +++++++++++++++++++++++++++++++++++++++++++++++
> > > > > > >  kernel/mutex.c          |   58 +++++-----------------------------------------
> > > > > > >  2 files changed, 65 insertions(+), 51 deletions(-)
> > > > > > >  create mode 100644 include/linux/mcslock.h
> > > > > > >
> > > > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h
> > > > > > > new file mode 100644
> > > > > > > index 0000000..20fd3f0
> > > > > > > --- /dev/null
> > > > > > > +++ b/include/linux/mcslock.h
> > > > > > > @@ -0,0 +1,58 @@
> > > > > > > +/*
> > > > > > > + * MCS lock defines
> > > > > > > + *
> > > > > > > + * This file contains the main data structure and API definitions of MCS lock.
> > > > > > > + */
> > > > > > > +#ifndef __LINUX_MCSLOCK_H
> > > > > > > +#define __LINUX_MCSLOCK_H
> > > > > > > +
> > > > > > > +struct mcs_spin_node {
> > > > > > > +       struct mcs_spin_node *next;
> > > > > > > +       int               locked;       /* 1 if lock acquired */
> > > > > > > +};
> > > > > > > +
> > > > > > > +/*
> > > > > > > + * We don't inline mcs_spin_lock() so that perf can correctly account for the
> > > > > > > + * time spent in this lock function.
> > > > > > > + */
> > > > > > > +static noinline
> > > > > > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
> > > > > > > +{
> > > > > > > +       struct mcs_spin_node *prev;
> > > > > > > +
> > > > > > > +       /* Init node */
> > > > > > > +       node->locked = 0;
> > > > > > > +       node->next   = NULL;
> > > > > > > +
> > > > > > > +       prev = xchg(lock, node);
> > > > > > > +       if (likely(prev == NULL)) {
> > > > > > > +               /* Lock acquired */
> > > > > > > +               node->locked = 1;
> > > > > > 
> > > > > > If we don't spin on the local node, is it necessary to set this variable?
> > > > > 
> > > > > I don't follow, the whole idea is to spin on the local variable.
> > > > 
> > > > If prev == NULL, doesn't that mean it won't proceed to spin on the
> > > > variable because the lock is already free and we call return? In that
> > > > case where we directly acquire the lock, I was wondering if it is
> > > > necessary to set node->locked = 1.
> > > 
> > > Yes, that's true, but we need to flag the lock as acquired (the node's
> > > lock is initially set to unlocked), otherwise others trying to acquire
> > > the lock can spin forever:
> > > 
> > > 	/* Wait until the lock holder passes the lock down */
> > > 	while (!ACCESS_ONCE(node->locked))
> > > 		arch_mutex_cpu_relax();
> > > 
> > > The ->locked variable in this implementation indicates whether the lock is
> > > acquired, and *not* whether busy-waiting is necessary.
> > 
> > hmm, other threads acquiring the lock will be spinning on their own
> > local nodes, not this node's node->locked. And if prev == NULL, the
> > current thread won't be reading its node->locked either since we return.
> > So what other thread is going to be reading this node's node->locked?
> > 
> > Thanks,
> > Jason
> 
> I think setting node->locked = 1 for the prev==NULL case is not
> necessary functionally, but was done for semantic consistency.

Okay, that would make sense for consistency, because we always
first set node->locked = 0 at the top of the function.

If we prefer to optimize this a bit, though, perhaps we can
move the node->locked = 0 so that it gets executed after the
"if (likely(prev == NULL)) {}" code block, and then delete
"node->locked = 1" inside the code block.

static noinline
void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
{
       struct mcs_spin_node *prev;

       /* Init node */
       node->next   = NULL;

       prev = xchg(lock, node);
       if (likely(prev == NULL)) {
               /* Lock acquired */
               return;
       }
       node->locked = 0;
       ACCESS_ONCE(prev->next) = node;
       smp_wmb();
       /* Wait until the lock holder passes the lock down */
       while (!ACCESS_ONCE(node->locked))
               arch_mutex_cpu_relax();
}
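
For anyone who wants to experiment with this outside the kernel, here is
a rough user-space sketch of the same idea using C11 atomics and pthreads.
It is not the kernel code: struct mcs_node, mcs_lock(), mcs_unlock(),
demo_worker() and run_demo() are names made up for illustration,
sched_yield() stands in for arch_mutex_cpu_relax(), and the unlock path is
an assumption about how the matching unlock would look, since it is not
quoted above. The sketch only touches node->locked on the contended path.
It also makes one ordering requirement explicit: the store that clears
node->locked has to be visible before the node is published through
prev->next, so the publication uses release ordering.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

struct mcs_node {
	_Atomic(struct mcs_node *) next;
	atomic_int locked;	/* only read by a waiter that actually queued */
};

static void mcs_lock(_Atomic(struct mcs_node *) *lock, struct mcs_node *node)
{
	struct mcs_node *prev;

	atomic_store_explicit(&node->next, NULL, memory_order_relaxed);
	prev = atomic_exchange_explicit(lock, node, memory_order_acq_rel);
	if (prev == NULL)
		return;	/* uncontended: nobody ever reads node->locked */

	/* Contended: clear the flag before the predecessor can see us. */
	atomic_store_explicit(&node->locked, 0, memory_order_relaxed);
	/* Release ordering keeps the store above ahead of the publication. */
	atomic_store_explicit(&prev->next, node, memory_order_release);
	/* Wait until the lock holder passes the lock down. */
	while (!atomic_load_explicit(&node->locked, memory_order_acquire))
		sched_yield();	/* user-space stand-in for cpu_relax */
}

static void mcs_unlock(_Atomic(struct mcs_node *) *lock, struct mcs_node *node)
{
	struct mcs_node *next =
		atomic_load_explicit(&node->next, memory_order_acquire);

	if (next == NULL) {
		struct mcs_node *expected = node;

		/* No successor visible yet: try to mark the lock free. */
		if (atomic_compare_exchange_strong_explicit(lock, &expected,
				NULL, memory_order_acq_rel,
				memory_order_acquire))
			return;
		/* A successor swapped itself in; wait for it to link up. */
		while ((next = atomic_load_explicit(&node->next,
				memory_order_acquire)) == NULL)
			sched_yield();
	}
	/* Pass the lock down to the successor. */
	atomic_store_explicit(&next->locked, 1, memory_order_release);
}

static _Atomic(struct mcs_node *) demo_lock;
static long demo_counter;	/* protected by demo_lock */
static int demo_iters;

static void *demo_worker(void *arg)
{
	struct mcs_node node;	/* node.locked deliberately left unset */

	(void)arg;
	for (int i = 0; i < demo_iters; i++) {
		mcs_lock(&demo_lock, &node);
		demo_counter++;
		mcs_unlock(&demo_lock, &node);
	}
	return NULL;
}

/* Hypothetical helper: run nthreads workers, return the final count. */
static long run_demo(int nthreads, int iters)
{
	pthread_t tid[16];

	assert(nthreads > 0 && nthreads <= 16);
	demo_counter = 0;
	demo_iters = iters;
	atomic_store(&demo_lock, NULL);
	for (int i = 0; i < nthreads; i++) {
		int rc = pthread_create(&tid[i], NULL, demo_worker, NULL);

		assert(rc == 0);
	}
	for (int i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	return demo_counter;
}
```

If mutual exclusion holds, run_demo(nthreads, iters) called from a small
main() should return nthreads * iters. Each worker reuses its node across
iterations, so node->locked can hold a stale 1 from an earlier contended
acquisition, which is the case the store on the contended path covers.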
