memory barriers in flock (Re: [PATCH v3] locks: close potential race between setlease and open)

David Howells <dhowells@xxxxxxxxxx> · Thu, 15 Aug 2013 21:44:25 +0100

Bruce Fields <bfields@xxxxxxxxxxxx> wrote:

(Adding Paul McKenney who's good at this stuff)

> > v2:
> > - fix potential double-free of lease if second check finds conflict
> > - add smp_mb's to ensure that other CPUs see i_flock changes
> > 
> > v3:
> > - remove smp_mb calls. Partial ordering is unlikely to help here.
> 
> Forgive me here, I still don't understand.  So to simplify massively,
> the situation looks like:
>
> 	setlease		open
> 	------------		------
> 
> 	atomic_read		atomic_inc
> 	write i_flock		read i_flock
> 	atomic_read

Are the three atomic ops reading the same value?  If so, you can have smp_mb()
calls here:

	atomic_read		atomic_inc
				smp_mb()
	write i_flock		read i_flock
	smp_mb()
 	atomic_read

I *think* that memory accesses in one place need to be reverse-ordered wrt to
those in the other place, so:

	atomic_read		atomic_inc
	smp_mb()		smp_mb()
	write i_flock		read i_flock
 	atomic_read

doesn't achieve anything.

> And we want to be sure that either the setlease caller sees the result
> of the atomic_inc, or the opener sees the result of the write to
> i_flock.
> 
> As an example, suppose the above steps happen in the order:
> 
> 	atomic_read [A]
> 	write i_flock [B]
> 	atomic_read [C]
> 				atomic_inc [X]
> 				read i_flock [Y]

(I've labelled the operations for convenience)

> How do we know that the read of i_flock [Y] at the last step reflects the
> latest value of i_flock?  For example, couldn't the write still be stuck in
> first CPU's cache?

Putting in memory barriers doesn't help here.  If A, B and C are all performed
and committed to memory by the time X happens, then Y will see B, but C will
not see X.

The thing to remember is that smp_mb() is not a commit operation: it doesn't
cause a modification to be committed to memory.  The reason you use it is to
make sure the CPU actually does preceding memory ops - eg. makes the
modification in question - before it goes and does any following memory ops.

Linux requires the system to be cache-coherent, so if the write is actually
performed by a CPU then the result will be obtained by any other CPU, no
matter whether it's still lingering in the first CPU's caches or whether it's
been passed on.

-*-

However, I could be wrong.  Memory barriers are mind-bending to try and think
through, especially when it's the operations being ordered are R-W vs R-W
rather than having W-W on at least one side.

Hopefully Paul will be able to chime in

David
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html