Re: [RFC][PATCH 22/31] locking,tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()

Peter Zijlstra <peterz@xxxxxxxxxxxxx> · Tue, 26 Apr 2016 17:28:44 +0200

On Mon, Apr 25, 2016 at 04:54:34PM -0400, Chris Metcalf wrote:
> On 4/22/2016 5:04 AM, Peter Zijlstra wrote:

> >  static inline int atomic_add_return(int i, atomic_t *v)
> >  {
> >  	int val;
> >  	smp_mb();  /* barrier for proper semantics */
> >  	val = __insn_fetchadd4((void *)&v->counter, i) + i;
> >  	barrier();  /* the "+ i" above will wait on memory */
> >+	/* XXX smp_mb() instead, as per cmpxchg() ? */
> >  	return val;
> >  }
> 
> The existing code is subtle but I'm pretty sure it's not a bug.
> 
> The tilegx architecture will take the "+ i" and generate an add instruction.
> The compiler barrier will make sure the add instruction happens before
> anything else that could touch memory, and the microarchitecture will make
> sure that the result of the atomic fetchadd has been returned to the core
> before any further instructions are issued.  (The memory architecture is
> lazy, but when you feed a load through an arithmetic operation, we block
> issuing any further instructions until the add's operands are available.)
> 
> This would not be an adequate memory barrier in general, since other loads
> or stores might still be in flight, even if the "val" operand had made it
> from memory to the core at this point.  However, we have issued no other
> stores or loads since the previous memory barrier, so we know that there
> can be no other loads or stores in flight, and thus the compiler barrier
> plus arithmetic op is equivalent to a memory barrier here.
> 
> In hindsight, perhaps a more substantial comment would have been helpful
> here.  Unless you see something missing in my analysis, I'll plan to go
> ahead and add a suitable comment here :-)
> 
> Otherwise, though just based on code inspection so far:
> 
> Acked-by: Chris Metcalf <cmetcalf@xxxxxxxxxxxx> [for tile]

Thanks!

Just to verify: the new fetch-op thingies _do_ indeed need the extra
smp_mb() as per my patch, because there is no trailing instruction
that depends on the completion of the load?
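
To make the question concrete, here is a rough user-space sketch of the
two cases using C11 atomics. It is only an analogy, not the tile code:
the model_* names are made up for illustration, the seq_cst fences stand
in for smp_mb(), and portable C cannot express the tilegx "arithmetic op
stalls issue" property, so both variants carry an explicit trailing fence.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for atomic_t; not the kernel type. */
typedef struct { atomic_int counter; } model_atomic_t;

/*
 * Models atomic_add_return(): returns the NEW value.
 * On tilegx the "+ i" consumes the fetched value, so (per Chris's
 * explanation) issue stalls until the fetchadd result has returned and
 * a compiler barrier is enough. Here a full fence is used instead,
 * because C11 gives no such guarantee.
 */
static inline int model_add_return(int i, model_atomic_t *v)
{
	int val;

	atomic_thread_fence(memory_order_seq_cst);	/* leading smp_mb() */
	val = atomic_fetch_add_explicit(&v->counter, i,
					memory_order_relaxed) + i;
	atomic_thread_fence(memory_order_seq_cst);	/* trailing ordering */
	return val;
}

/*
 * Models atomic_fetch_add(): returns the OLD value.
 * Nothing consumes the fetched value before returning, so there is no
 * dependent instruction to wait on the load; the trailing barrier has
 * to be explicit.
 */
static inline int model_fetch_add(int i, model_atomic_t *v)
{
	int val;

	atomic_thread_fence(memory_order_seq_cst);	/* leading smp_mb() */
	val = atomic_fetch_add_explicit(&v->counter, i,
					memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* explicit smp_mb() */
	return val;
}
```

The only difference between the two is whether the return value passes
through an arithmetic op; that is the distinction the question above
hinges on.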