On 2017/05/13 21:20:23 -0700, Paul E. McKenney wrote: > On Sun, May 14, 2017 at 10:31:51AM +0900, Akira Yokosawa wrote: >> On 2017/05/14 9:58, Akira Yokosawa wrote: >>> On 2017/05/14 7:56, Paul E. McKenney wrote: >>>> On Sat, May 13, 2017 at 11:37:06PM +0900, Akira Yokosawa wrote: >>>>> On 2017/05/13 05:45:56 -0700, Paul E. McKenney wrote: >>>>>> On Sat, May 13, 2017 at 09:04:38PM +0900, Akira Yokosawa wrote: >>>>>>> Hi Jason & Paul, >>>>>>> >>>>>>> although this has already been applied, I have a comment. >>>>>>> >>>>>>> On 2017/05/11 23:03:41 +0800, Junchang Wang wrote: >>>>>>>> Signed-off-by: Junchang Wang <junchangwang@xxxxxxxxx> >>>>>>>> --- >>>>>>>> CodeSamples/count/count_stat_eventual.c | 8 ++++---- >>>>>>>> 1 file changed, 4 insertions(+), 4 deletions(-) >>>>>>>> >>>>>>>> diff --git a/CodeSamples/count/count_stat_eventual.c b/CodeSamples/count/count_stat_eventual.c >>>>>>>> index 059ab8b..cbde4aa 100644 >>>>>>>> --- a/CodeSamples/count/count_stat_eventual.c >>>>>>>> +++ b/CodeSamples/count/count_stat_eventual.c >>>>>>>> @@ -27,12 +27,12 @@ int stopflag; >>>>>>>> >>>>>>>> void inc_count(void) >>>>>>>> { >>>>>>>> - ACCESS_ONCE(__get_thread_var(counter))++; >>>>>>>> + READ_ONCE(__get_thread_var(counter))++; >>>>>>> >>>>>>> This is OK because READ_ONCE() is defined as the same as ACCESS_ONCE() >>>>>>> in CodeSamples. However, the definition in the current Linux kernel >>>>>>> would not permit this. >>>>>>> >>>>>>> A read-modify-write access would need both READ_ONCE() and WRITE_ONCE(). >>>>>>> However, since "counter" is thread local and updated only by its owner, >>>>>>> we don't need READ_ONCE() here. So: >>>>>>> >>>>>>> + WRITE_ONCE(__get_thread_var(counter), __get_thread_var(counter) + 1); >>>>>>> >>>>>>> should have been sufficient. So, I measured the performance after this change. Unfortunately, this change doubles the overhead of updater. By inspecting the generated code, it looks like GCC (v5.4 for x86_64 with -O3 option) can't figure out two "__get_thread_var(counter)"s are identical, and uses two registers as pointers to access it. __get_thread_var() seems complex enough to obfuscate the compiler. To avoid this performance regression, we can use a temporary pointer as a hint for optimization: void inc_count(void) { unsigned long *p_cnt = &__get_thread_var(counter); WRITE_ONCE(*p_cnt, *p_cnt + 1); } Another idea is to restore ACCESS_ONCE(). As long as the argument is properly aligned scalar type, ACCESS_ONCE() should be OK. ACCESS_ONCE(__get_thread_var(counter))++; will emit both load and store instructions, but in this short function, the compiler has no way to optimize away either access even if we don't use ACCESS_ONCE(). Thoughts? Thanks, Akira >>>>>>> >>>>>>> Problem with this change is that the line gets too wide when applied to >>>>>>> the code snippet in 2-column layout. >>>>>> >>>>>> Good point -- though renumbering the code is not all -that- hard. >>>>>> >>>>>> I clearly should have made a better READ_ONCE() that enforced the same >>>>>> constraints as does the Linux kernel, perhaps something like this: >>>>>> >>>>>> #define READ_ONCE(x) ({ ACCESS_ONCE(x) }) >>>>>> >>>>>> Thoughts? >>>>> >>>>> I assume you meant: >>>>> >>>>> #define READ_ONCE(x) ({ ACCESS_ONCE(x); }) >>>> >>>> Indeed! Good catch!!! >>>> >>>>> I'm afraid this still permits uses such as: >>>>> >>>>> READ_ONCE(y)++; >>>>> >>>>> Looks like we need a complex definition which resembles that of >>>>> include/linux/compiler.h. Hmm??? >>>> >>>> OK, how about this? >>>> >>>> #define READ_ONCE(x) ({ typeof(x) ___x = ACCESS_ONCE(x); ___x; }) >>> >>> Ah, this resulted in an error as expected! >>> >>> main.c: In function ‘inc_test’: >>> main.c:11:18: error: lvalue required as increment operand >>> READ_ONCE(*p)++; >>> >> >> But the definition above will conflict with the argument "___x": >> >> y = READ_ONCE(___x); >> >> This won't work as expected. >> For CodeSamples, I guess we can safely reserve the name "___x". >> >> Thoughts? > > It is already reserved. User programs are not supposed to start > variables with "__". Which means that the "___x" name is of dubious > legality as far as the C standard is concerned, but so it goes. > If the compiler starts complaining about it, it can be changed. ;-) > > Thanx, Paul > >> Thanks, Akira >> >>> Glad to know we can avoid the complexity of the kernel. >>> >>> Thanks, Akira >>> >>>> >>>>> And I have another pending question regarding 2/2 of this patch set. >>>>> That might result in other addition of line to the code. I think >>>>> I can send it tomorrow. >>>> >>>> Looking forward to seeing it! ;-) >>>> >>>> Thanx, Paul >>>> >>>>>>> Hmm... >>>>>>> >>>>>>> Thanks, Akira >>>>>>> >>>>>>>> } >>>>>>>> >>>>>>>> unsigned long read_count(void) >>>>>>>> { >>>>>>>> - return ACCESS_ONCE(global_count); >>>>>>>> + return READ_ONCE(global_count); >>>>>>>> } >>>>>>>> >>>>>>>> void *eventual(void *arg) >>>>>>>> @@ -43,8 +43,8 @@ void *eventual(void *arg) >>>>>>>> while (stopflag < 3) { >>>>>>>> sum = 0; >>>>>>>> for_each_thread(t) >>>>>>>> - sum += ACCESS_ONCE(per_thread(counter, t)); >>>>>>>> - ACCESS_ONCE(global_count) = sum; >>>>>>>> + sum += READ_ONCE(per_thread(counter, t)); >>>>>>>> + WRITE_ONCE(global_count, sum); >>>>>>>> poll(NULL, 0, 1); >>>>>>>> if (stopflag) { >>>>>>>> smp_mb(); >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>>> . >>>> >>> >>> >> > > -- To unsubscribe from this list: send the line "unsubscribe perfbook" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html