On Tue, Dec 11, 2018 at 4:17 AM Paul E. McKenney <paulmck@xxxxxxxxxxxxx> wrote: > > On Tue, Dec 11, 2018 at 12:29:08AM +0900, Akira Yokosawa wrote: > > Hi Paul and Junchang, > > > > This is a continuation of the discussion on the definition of > > cmpxchg() in CodeSamples/api-pthreads/api-gcc.h. > > > > Current definition is as follows: > > > > #define cmpxchg(ptr, o, n) \ > > ({ \ > > typeof(*ptr) _____actual = (o); \ > > \ > > __atomic_compare_exchange_n(ptr, (void *)&_____actual, (n), 1, \ > > __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? (o) : (o)+1; \ > > }) > > > > As was already pointed out by Paul, this definition has potential issue > > when an argument with side-effect is given to "o". Also, (o)+1 can overflow. > > > > As a matter of fact, there is another definition of cmpxchg() in > > CodeSamples/formal/litmus/api.h. Currently it is defined as follows: > > > > #define cmpxchg(x, o, n) ({ \ > > typeof(o) __old = (o); \ > > __atomic_compare_exchange_n((x), &__old, (n), 1, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \ > > __old; \ > > }) > > > > Apparently, this definition is not good on PPC or ARM. > > > > Following is a simple litmus test of cmpxchg() (to be executed by litmus7): > > -------------------------------------------- > > C C-cmpxchg > > { > > } > > > > { > > #include "api.h" > > } > > > > P0(int *x) > > { > > int r1; > > > > r1 = cmpxchg(x, 1, 2); > > } > > > > P1(int *x) > > { > > int r1; > > > > r1 = cmpxchg(x, 0, 1); > > } > > > > locations [1:r1] > > exists (0:r1=1 /\ ~x=2) > > --------------------------------- > > > > Testing on PPC with the current api.h yields the following: > > --------------------------------- > > Test C-cmpxchg Allowed > > Histogram (4 states) > > 2789 :>0:r1=0; 1:r1=0; x=0; > > 50452053:>0:r1=0; 1:r1=0; x=1; > > 36003 *>0:r1=1; 1:r1=0; x=1; > > 49509155:>0:r1=1; 1:r1=0; x=2; > > Ok > > > > Witnesses > > Positive: 36003, Negative: 99963997 > > Condition exists (0:r1=1 /\ not (x=2)) is validated > > Hash=3f625d680407ff822fb56f1a80834291 > > Observation C-cmpxchg Sometimes 36003 99963997 > > Time C-cmpxchg 43.37 > > ---------------------------------- > > > > If we change the definition to suppress spurious failure in > > __atomic_compare_exchange_n(), i.e.: > > > > #define cmpxchg(x, o, n) ({ \ > > typeof(o) __old = (o); \ > > __atomic_compare_exchange_n((x), &__old, (n), 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \ > > __old; \ > > }) > > > > , litmus7 yields the following: > > ---------------------------------- > > Test C-cmpxchg Allowed > > Histogram (2 states) > > 50766497:>0:r1=0; 1:r1=0; x=1; > > 49233503:>0:r1=1; 1:r1=0; x=2; > > No > > > > Witnesses > > Positive: 0, Negative: 100000000 > > Condition exists (0:r1=1 /\ not (x=2)) is NOT validated > > Hash=3f625d680407ff822fb56f1a80834291 > > Observation C-cmpxchg Never 0 100000000 > > Time C-cmpxchg 35.20 > > ---------------------------------- > > > > This matches what herd7 says (with the 2nd pair of {} removed): > > ---------------------------------- > > Test C-cmpxchg Allowed > > States 2 > > 0:r1=0; 1:r1=0; x=1; > > 0:r1=1; 1:r1=0; x=2; > > No > > Witnesses > > Positive: 0 Negative: 2 > > Condition exists (0:r1=1 /\ not (x=2)) > > Observation C-cmpxchg Never 0 2 > > Time C-cmpxchg 0.01 > > Hash=1720691890ea69e35615997626680d6f > > ---------------------------------- > > > > Also, klitmus7 on PPC says: > > ---------------------------------- > > Test C-cmpxchg Allowed > > Histogram (2 states) > > 2019624 :>0:r1=0; 1:r1=0; x=1; > > 1980376 :>0:r1=1; 1:r1=0; x=2; > > No > > > > Witnesses > > Positive: 0, Negative: 4000000 > > Condition exists (0:r1=1 /\ not (x=2)) is NOT validated > > Hash=1720691890ea69e35615997626680d6f > > Observation C-cmpxchg Never 0 4000000 > > Time C-cmpxchg 0.76 > > ---------------------------------- > > > > If we use the definition in api-gcc.h, litmus7 says: > > ---------------------------------- > > Test C-cmpxchg Allowed > > Histogram (3 states) > > 11388 :>0:r1=2; 1:r1=1; x=0; > > 50373528:>0:r1=2; 1:r1=0; x=1; > > 49615084:>0:r1=1; 1:r1=0; x=2; > > No > > > > Witnesses > > Positive: 0, Negative: 100000000 > > Condition exists (0:r1=1 /\ not (x=2)) is NOT validated > > Hash=3f625d680407ff822fb56f1a80834291 > > Observation C-cmpxchg Never 0 100000000 > > Time C-cmpxchg 52.06 > > ---------------------------------- > > The result is still "Never", but the statistics look quite different. > > There is no "0:r1=0" observed! > > > > So, at least for litmus test, cmpxchg() should be defined as follows: > > > > #define cmpxchg(x, o, n) ({ \ > > typeof(o) __old = (o); \ > > __atomic_compare_exchange_n((x), &__old, (n), 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED); \ > > __old; \ > > }) > > > > If this is the way to go, I think the definition in api-gcc.h should > > be the same way for consistency, even if it would cause a little > > performance degradation. Then there would be no worry about overflows. > > Makes sense to me! Longer term, it would be good to characterize > weak mode, but having things work is a very good thing. ;-) > > Thanx, Paul > Hi Akira, Thanks for the detailed analysis. It a very interesting topic. For correctness, I agree with you on adopting strong variation in __atomic_compare_exchange_n. But for long-term, the following is my thinking, please take a look. For the current implementation (the one in api-gcc.h), the issue of the side-effect in argument (o) could be removed by adding one another local variable. The following is one example (It diagrams the idea, though I didn't test the code yet. But I can test it this night.) #define cmpxchg(ptr, o, n) \ ({ \ typeof(*ptr) _____actual = (o); \ typeof(*ptr) __old = _____actual; \ __atomic_compare_exchange_n(ptr, (void *)&_____actual, (n), 1, \ __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST) ? __old : __old+1; \ }) The reason that I stick to enabling weak variation on PPC and ARM is that it gives us about 10% performance improvement when we ran count_lim_atomic on a PPC server. When I was trying to collect some performance data of stress tests, this 10% benefit really means a lot. For the overflow issue, maybe we can figure out some better ways. Please let me know what do you think. :-) Regards, --Junchang