Re: [RFC][PATCH 0/5] arch: atomic rework

Hi,

On Fri, 21 Feb 2014, Paul E. McKenney wrote:

> > And with conservative I mean "everything is a source of a dependency, and 
> > hence can't be removed, reordered or otherwise fiddled with", and that 
> > includes code sequences where no atomic objects are anywhere in sight [1].
> > In the light of that the only realistic way (meaning to not have to 
> > disable optimization everywhere) to implement consume as currently 
> > specified is to map it to acquire.  At which point it becomes pointless.
> 
> No, only memory_order_consume loads and [[carries_dependency]]
> function arguments are sources of dependency chains.

I don't see [[carries_dependency]] in the C11 final draft (yeah, should 
get a real copy, I know, but let's assume it's the same language as the 
standard).  Therefore, yes, only consume loads are sources of 
dependencies.  The problem with the definition of the "carries a 
dependency" relation is not the sources, but rather where it stops.  
It's transitively closed over "value of evaluation A is used as operand in 
evaluation B", with very few exceptions as per 5.1.2.4#14.  Evaluations 
can contain function calls, so if there's _any_ chance that an operand of 
an evaluation might even indirectly use something resulting from a consume 
load then that evaluation must be compiled in a way to not break 
dependency chains.

I don't see a way to generally assume that e.g. the value of a function 
argument cannot possibly result from a consume load, therefore the compiler 
must assume that all function arguments _can_ result from such loads, and 
so must disable all depchain-breaking optimizations (of which there are many).
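
To make that concrete, here is a minimal sketch (the names table, shared, 
lookup() and the two callers are made up for illustration, they are not from 
this thread).  The same function is reached once with a value derived from a 
consume load and once with a plain constant, and nothing in its signature 
tells the two cases apart:

```c
#include <assert.h>
#include <stdatomic.h>

struct node { int index; };

static int table[4] = { 10, 20, 30, 40 };
static struct node first = { 2 };
static _Atomic(struct node *) shared = &first;

/* Nothing in this signature says whether i stems from a consume load,
 * so a compiler honouring the C11 wording has to assume that it might. */
static int lookup(int i)
{
    assert(i >= 0 && i < 4);
    return table[i];
}

/* Here the argument does result from a consume load ... */
int lookup_shared(void)
{
    struct node *p = atomic_load_explicit(&shared, memory_order_consume);
    return lookup(p->index);
}

/* ... and here it does not, yet both callers reach the same lookup(). */
int lookup_plain(void)
{
    return lookup(1);
}
```

If lookup() lived in a different translation unit the compiler could not 
even inspect its callers, which is the situation I mean above.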

> > [1] Simple example of what type of transformations would be disallowed:
> > 
> > int getzero (int i) { return i - i; }
> 
> This needs to be as follows:
> 
> [[carries_dependency]] int getzero(int i [[carries_dependency]])
> {
> 	return i - i;
> }
> 
> Otherwise dependencies won't get carried through it.

So, with the above do you agree that in the absence of any other magic (see 
below) the compiler is not allowed to transform my initial getzero() 
implementation (without the carries_dependency markers) into "return 0;" 
because of the C11 rules for "carries-a-dependency"?

If so, do you then also agree that the specification of "carries a 
dependency" is somewhat, err, shall we say, overbroad?
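
For illustration, a sketch of a caller for which the transformation would 
matter under the C11 wording (payload, published_index and reader() are 
hypothetical names, only getzero() is from my example):

```c
#include <assert.h>
#include <stdatomic.h>

static int payload[1] = { 42 };
static _Atomic int published_index;   /* hypothetical shared variable */

static int getzero(int i) { return i - i; }

int reader(void)
{
    /* The consume load is the source of the dependency chain. */
    int i = atomic_load_explicit(&published_index, memory_order_consume);

    /* The value is always zero, but per 5.1.2.4#14 it still carries a
     * dependency from the load above into the array index below. */
    int z = getzero(i);
    assert(z == 0);
    return payload[z];
}
```

Folding getzero() to "return 0;" leaves the result unchanged, but the 
address of payload[z] would then no longer be computed from i, and that 
address dependency is presumably what an implementation relying on hardware 
dependency ordering would need to keep.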

> > depchains don't matter, could _then_ optimize it to zero.  But that's 
> > insane, especially considering that it's hard to detect if a given context 
> > doesn't care for depchains, after all the depchain relation is constructed 
> > exactly so that it bleeds into nearly everywhere.  So we would most of 
> > the time have to assume that the ultimate context will be depchain-aware 
> > and therefore disable many transformations.
> 
> Any function that does not contain a memory_order_consume load and that 
> doesn't have any arguments marked [[carries_dependency]] can be 
> optimized just as before.

And as no such marker exists in C11, we must conservatively assume that it's 
on _all_ parameters, so I'll stand by my claim.

> > Then inlining getzero would merely add another "# j.dep = i.dep" 
> > relation, so depchains are still there but the value optimization can 
> > happen before inlining.  Having to do something like that I'd find 
> > disgusting, and rather rewrite consume into acquire :)  Or make the 
> > depchain relation somehow realistically implementable.
> 
> I was actually OK with arithmetic cancellation breaking the dependency 
> chains.  Others on the committee felt otherwise, and I figured that (1) 
> I wouldn't be writing that kind of function anyway and (2) they knew 
> more about writing compilers than I.  I would still be OK saying that 
> things like "i-i", "i*0", "i%1", "i&0", "i|~0" and so on just break the 
> dependency chain.

Exactly.  I can see the problem that people had with that, though.  There 
are very many ways to write concealed zeros (or generally neutral elements 
of the function in question).  My getzero() function is one (it could e.g. 
be an assembler implementation).  The allowance to break dependency chains 
would have to apply to such cancellation as well, so the standard couldn't 
simply itemize all cases in which cancellation is allowed.  Rather it would 
have had to reason about something like "value dependency", à la 
"evaluation B depends on A, if there exist at least two different values A1 
and A2 (results from A), for which evaluation B (with otherwise same 
operands) yields different values B1 and B2".
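
As a rough sketch of that definition (value_depends() and successor() are 
names I'm making up here), one can test it by brute force over a small range 
of inputs.  A compiler would have to prove the property symbolically, and a 
finite range can only show that a dependency exists, never that none does, so 
take this as an illustration of the definition only:

```c
#include <assert.h>
#include <stdbool.h>

typedef int (*unary_fn)(int);

/* "B depends on A" if two different values of A give different values
 * of B.  Only the inputs lo..hi are tried, so "false" merely means that
 * no dependency showed up within that range. */
static bool value_depends(unary_fn f, int lo, int hi)
{
    assert(lo <= hi);
    int first = f(lo);
    for (int a = lo + 1; a <= hi; a++)
        if (f(a) != first)
            return true;
    return false;
}

static int getzero(int i)   { return i - i; }   /* a concealed zero */
static int successor(int i) { return i + 1; }   /* really uses i    */
```

With this, value_depends(getzero, -8, 8) is false while 
value_depends(successor, -8, 8) is true.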

Alas, it doesn't, except if you want to understand the term "the value of 
A is used as an operand of B" in that way.  Even then you'd still have the 
second case of the depchain definition, where intermediate memory stores 
and loads, which need not even be atomic, make two evaluations ordered per 
carries-a-dependency.

And even that understanding of "is used" wouldn't be enough, because there 
are cases where the cancellation happens in steps, and where it interacts 
with the third clause (transitiveness):  Assume this:

  a = something()  // evaluation A
  b = 1 - a        // evaluation B
  c = a - 1 + b    // evaluation C

Now, clearly B depends on A.  Also C depends on B (because with otherwise 
same operands changing just B also changes C), and so by transitiveness C 
then also depends on A.  But equally clearly C is just an elaborate way to 
write "0", and so depends on nothing.  The problem is of course that A 
and B aren't independent when determining the dependencies of C.  But 
allowing cancellation to break dependency chains would have to allow for 
these cases as well.
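
Spelled out as C (eval_c() and eval_c_given_b() are hypothetical helpers, 
and I assume small values so that the signed arithmetic can't overflow):

```c
#include <assert.h>

/* Evaluations B and C from above, with A passed in as the argument. */
static int eval_c(int a)
{
    assert(a > -1000 && a < 1000);   /* keep far away from overflow */
    int b = 1 - a;                   /* evaluation B */
    int c = a - 1 + b;               /* evaluation C */
    return c;
}

/* Evaluation C alone, with b treated as a free operand.  Changing only
 * b changes the result, which is why C "depends on B" taken in isolation. */
static int eval_c_given_b(int a, int b)
{
    return a - 1 + b;
}
```

eval_c() returns 0 for every a in the asserted range, while 
eval_c_given_b(5, -4) is 0 and eval_c_given_b(5, -3) is 1.  So each step 
looks value-dependent on its own, and only the composition cancels.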

So, now, that leaves us basically with either depchains forcing us to 
disable many useful transformations, or having to find some other magic.  
One would be to just regard all consume loads as acquire loads and be done 
(and effectively remove the ill-advised "carries a dependency" relation 
from consideration).
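
In source terms that mapping would look roughly like the following sketch 
(struct msg, head, publish(), load_consume() and read_published() are 
invented for the example, this is not how any particular compiler spells it):

```c
#include <stdatomic.h>
#include <stddef.h>

struct msg { int value; };

static struct msg slot;
static _Atomic(struct msg *) head;   /* NULL until something is published */

void publish(int v)
{
    slot.value = v;
    atomic_store_explicit(&head, &slot, memory_order_release);
}

/* What the caller writes as a consume load is simply performed as an
 * acquire load, so no dependency chain has to be tracked at all. */
static inline struct msg *load_consume(_Atomic(struct msg *) *p)
{
    return atomic_load_explicit(p, memory_order_acquire);
}

int read_published(void)
{
    struct msg *m = load_consume(&head);
    return m ? m->value : -1;        /* -1: nothing published yet */
}
```

The acquire orders everything after the load, not just the dependent 
accesses, so the code following it can be optimized as usual.  The price is 
whatever the acquire costs on the target.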

You say downthread that it'd also be possible to just emit barriers before 
all function calls (I say "all" because the compiler will generally 
have applied some transformation that broke depchains if they existed).  
That seems to me to be a bigger hammer than just ignoring depchains and 
emitting acquires instead of consumes (because the latter changes only 
exactly the places where atomics are used, while the former seems to me to 
have unbounded effect).

So, am I still missing something, or is my understanding of the 
carries-a-dependency relation correct and my conclusions are merely too 
pessimistic?


Ciao,
Michael.