Re: [RFC][PATCH 0/5] arch: atomic rework

Torvald Riegel <triegel@xxxxxxxxxx> · Fri, 07 Feb 2014 00:25:17 +0100

On Thu, 2014-02-06 at 22:13 +0000, Joseph S. Myers wrote:
> On Thu, 6 Feb 2014, Torvald Riegel wrote:
> 
> > > It seems that nobody really
> > > agrees on exactly how the C11 atomics map to real architectural
> > > instructions on anything but the trivial architectures.
> > 
> > There's certainly different ways to implement the memory model and those
> > have to be specified elsewhere, but I don't see how this differs much
> > from other things specified in the ABI(s) for each architecture.
> 
> It is not clear to me that there is any good consensus understanding of 
> how to specify this in an ABI, or how, given an ABI, to determine whether 
> an implementation is valid.
> 
> For ABIs not considering atomics / concurrency, it's well understood, for 
> example, that the ABI specifies observable conditions at function call 
> boundaries ("these arguments are in these registers", "the stack pointer 
> has this alignment", "on return from the function, the values of these 
> registers are unchanged from the values they had on entry").  It may 
> sometimes specify things at other times (e.g. presence or absence of a red 
> zone - whether memory beyond the stack pointer may be overwritten on an 
> interrupt).  But if it gives a code sequence, it's clear this is just an 
> example rather than a requirement for particular code to be generated - 
> any code sequence suffices if it meets the observable conditions at the 
> points where code generated by one implementation may pass control to code 
> generated by another implementation.
> 
> When atomics are involved, you no longer have a limited set of 
> well-defined points where control may pass from code generated by one 
> implementation to code generated by another - the code generated by the 
> two may be running concurrently.

Agreed.

> We know of certain cases 
> <http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html> where there are 
> choices of the mapping of atomic operations to particular instructions.  
> But I'm not sure there's much evidence that these are the only ABI issues 
> arising from concurrency - that there aren't any other ways in which an 
> implementation may transform code, consistent with the as-if rule of ISO 
> C, that may run into incompatibilities of different choices.

I can't think of other incompatibilities with high likelihood --
provided we ignore consume memory order and the handling of dependencies
(see below).  But I would doubt there is a high risk of such because (1)
the data race definition should hopefully not cause subtle
incompatibilities and (2) there is  clear "hand-off point" from the
compiler to a specific instruction representing an atomic access.  For
example, if we have a release store, and we agree on the instructions
used for that, then compilers will have to ensure happens-before for
anything before the release store; for example, as long as stores
sequenced-before the release store are performed, it doesn't matter in
which order that happens.  Subsequently, an acquire-load somewhere else
can pick this sequence of events up just by using the agreed-upon
acquire-load; like with the stores, it can order subsequent loads in any
way it sees fit, including different optimizations.  That's obviously
not a formal proof, though.  But it seems likely to me, at least :)

I'm more concerned about consume and dependencies because as far as I
understand the standard's requirements, dependencies need to be tracked
across function calls.  Thus, we might have several compilers involved
in that, and we can't just "condense" things to happens-before, but it's
instead how and that we keep dependencies intact.  Because of that, I'm
wondering whether this is actually implementable practically.  (See
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59448#c10)

> And even if 
> those are the only issues, it's not clear there are well-understood ways 
> to define the mapping from the C11 memory model to the architecture's 
> model, which provide a good way to reason about whether a particular 
> choice of instructions is valid according to the mapping.

I think that if we have different options, there needs to be agreement
on which to choose across the compilers, at the very least.  I don't
quite know how this looks like for GCC and LLVM, for example.

> > Are you familiar with the formalization of the C11/C++11 model by Batty
> > et al.?
> > http://www.cl.cam.ac.uk/~mjb220/popl085ap-sewell.pdf
> > http://www.cl.cam.ac.uk/~mjb220/n3132.pdf
> 
> These discuss, as well as the model itself, proving the validity of a 
> particular choice of x86 instructions.  I imagine that might be a starting 
> point towards an understanding of how to describe the relevant issues in 
> an ABI, and how to determine whether a choice of instructions is 
> consistent with what an ABI says.  But I don't get the impression this is 
> yet at the level where people not doing research in the area can routinely 
> read and write such ABIs and work out whether a code sequence is 
> consistent with them.

It's certainly complicated stuff (IMHO).  On the positive side, it's
just a few compilers, the kernel, etc. that have to deal with this, if
most programmers use the languages' memory model.

> (If an ABI says "use instruction X", then you can't use a more efficient 
> X' added by a new version of the instruction set.  But it can't 
> necessarily be as loose as saying "use X and Y, or other instructions that 
> achieve semantics when the other thread is using X or Y", because it might 
> be the case that Y' interoperates with X, X' interoperates with Y, but X' 
> and Y' don't interoperate with each other.  I'd envisage something more 
> like mapping not to instructions, but to concepts within the 
> architecture's own memory model - but that requires the architecture's 
> memory model to be as well defined, and probably formalized, as the C11 
> one.)

Yes.  The same group of researchers have also worked on formalizing the
Power model, and use this as base for a proof of the correctness of the
proposed mappings: http://www.cl.cam.ac.uk/~pes20/cppppc/

The formal approach to all this might be a complex task, but it is more
confidence-inspiring than making guesses about one standard's prose vs.
another specification's prose.

--
To unsubscribe from this list: send the line "unsubscribe linux-arch" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html