2017-10-06 20:03 GMT+08:00 Akira Yokosawa <akiyks@xxxxxxxxx>:
> Hi Yubin,
>
> On 2017/10/06 14:52, Yubin Ruan wrote:
>> Hi,
>> I saw lots of discussions on the web about possible races when doing
>> synchronization between multiple threads/processes with locks or atomic
>> operations[1][2]. From my point of view, most of them are over-worrying.
>> But I want to point out one particular issue here to see whether
>> anyone has anything to say.
>>
>> Imagine two processes communicating using only a uint32_t variable in
>> shared memory, like this:
>>
>>     // uint32_t variable in shared memory
>>     uint32_t flag = 0;
>>
>>     //process 1
>>     while(1) {
>>         if(READ_ONCE(flag) == 0) {
>>             do_something();
>>             WRITE_ONCE(flag, 1); // let the other process run
>>         } else {
>>             continue;
>>         }
>>     }
>>
>>     //process 2
>>     while(1) {
>>         if(READ_ONCE(flag) == 1) {
>>             printf("process 2 running...\n");
>>             WRITE_ONCE(flag, 0); // let the other process run
>>         } else {
>>             continue;
>>         }
>>     }
>>
>> On X86 or X64, I expect this code to run correctly, that is, I will
>> get the two `printf' outputs printed one after the other.
>
> Well, I see only one printf() above.
> Do you mean:

Yes, sorry about the typo.

>     //process 1
>     while(1) {
>         if(READ_ONCE(flag) == 0) {
>             printf("process 1 running...\n");
>             WRITE_ONCE(flag, 1); // let the other process run
>         } else {
>             continue;
>         }
>     }
>
>     //process 2
>     while(1) {
>         if(READ_ONCE(flag) == 1) {
>             printf("process 2 running...\n");
>             WRITE_ONCE(flag, 0); // let the other process run
>         } else {
>             continue;
>         }
>     }
>
> ?
>
> Then printf()s can be a problem.
> It partially negates your claim 3).
> Without using a memory barrier, there is no guarantee that the results of
> WRITE_ONCE() are visible to the other thread after the printf()'s
> memory accesses complete.

But on X86/X64, where we have cache coherence, the result of
WRITE_ONCE() should be visible to the other thread (maybe not
immediately, but eventually it will be visible).
> I/O operations in printf() might make the situation trickier.

printf(3) is claimed to be thread-safe, so I think this issue will not
concern us.

> In a more realistic case where you do something meaningful in
> do_something() in both threads:
>
>     //process 1
>     while(1) {
>         if(READ_ONCE(flag) == 0) {
>             do_something();
>             WRITE_ONCE(flag, 1); // let the other process run
>         } else {
>             continue;
>         }
>     }
>
>     //process 2
>     while(1) {
>         if(READ_ONCE(flag) == 1) {
>             do_something();
>             WRITE_ONCE(flag, 0); // let the other process run
>         } else {
>             continue;
>         }
>     }
>
> and if do_something() uses some shared variables other than "flag",
> you need a couple of memory barriers to ensure the ordering of
> READ_ONCE(), do_something(), and WRITE_ONCE(), something like:
>
>     //process 1
>     while(1) {
>         if(READ_ONCE(flag) == 0) {
>             smp_rmb();
>             do_something();
>             smp_wmb();
>             WRITE_ONCE(flag, 1); // let the other process run
>         } else {
>             continue;
>         }
>     }
>
>     //process 2
>     while(1) {
>         if(READ_ONCE(flag) == 1) {
>             smp_rmb();
>             do_something();
>             smp_wmb();
>             WRITE_ONCE(flag, 0); // let the other process run
>         } else {
>             continue;
>         }
>     }
>
> In the Linux kernel memory model, you can use acquire/release APIs instead:
>
>     //process 1
>     while(1) {
>         if(smp_load_acquire(&flag) == 0) {
>             do_something();
>             smp_store_release(&flag, 1); // let the other process run
>         } else {
>             continue;
>         }
>     }
>
>     //process 2
>     while(1) {
>         if(smp_load_acquire(&flag) == 1) {
>             do_something();
>             smp_store_release(&flag, 0); // let the other process run
>         } else {
>             continue;
>         }
>     }

Yes, it could be tricky when `do_something()' really does something that
involves other shared variables.

Yubin

> The intention of the code is easier to see when you use well-defined APIs.
> Just my two cents.
>
>         Thanks, Akira
>
>> That is because:
>>
>> 1) on X86/X64, loads and stores of 32-bit variables are atomic
>> 2) I use READ_ONCE/WRITE_ONCE to prevent possibly harmful compiler
>>    optimization on `flag'.
>> 3) I use only one variable to communicate between the two processes,
>>    so there is no need for any kind of barrier.
>>
>> Does anyone have any objection to that?
>>
>> I know using a lock or atomic operation will save me a lot of
>> argument, but I think those things are unnecessary in this
>> circumstance, and it matters where performance matters, so I am picky
>> here...
>>
>> Yubin
>>
>> [1]: https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
>> [2]: https://www.usenix.org/conference/osdi10/ad-hoc-synchronization-considered-harmful
>> --
>> To unsubscribe from this list: send the line "unsubscribe perfbook" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
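For anyone who wants to try the acquire/release handoff outside the kernel, here is a minimal user-space sketch. It assumes C11 atomics and POSIX threads rather than the kernel's smp_load_acquire()/smp_store_release(), and uses two threads in place of two processes. All names in it (shared_data, wait_for(), pass_to(), run_handoff()) are hypothetical and not from this thread. The plain variable shared_data stands in for whatever do_something() touches besides "flag".

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical names throughout; a sketch, not kernel code. */
static _Atomic uint32_t flag;  /* 0: thread 1's turn, 1: thread 2's turn */
static uint32_t shared_data;   /* plain data touched by "do_something()" */
static int rounds;             /* set before the worker thread is created */

/* Spin until it is our turn. The acquire load pairs with the other
 * side's release store, so its earlier plain writes are visible. */
static void wait_for(uint32_t turn)
{
    while (atomic_load_explicit(&flag, memory_order_acquire) != turn)
        sched_yield();
}

/* Hand the turn over. The release store orders our earlier plain
 * accesses before the flag update. */
static void pass_to(uint32_t turn)
{
    atomic_store_explicit(&flag, turn, memory_order_release);
}

/* "process 1": produces a value, then lets the other side run. */
static void *thread1(void *arg)
{
    (void)arg;
    for (int i = 1; i <= rounds; i++) {
        wait_for(0);
        shared_data = (uint32_t)i;
        pass_to(1);
    }
    return NULL;
}

/* Runs "process 2" on the calling thread. Returns the number of rounds
 * in which stale data was observed, or -1 if the thread cannot start. */
int run_handoff(int n)
{
    pthread_t t1;
    int mismatches = 0;

    assert(n >= 0);
    atomic_store(&flag, 0);
    shared_data = 0;
    rounds = n;
    if (pthread_create(&t1, NULL, thread1, NULL) != 0)
        return -1;

    for (int i = 1; i <= n; i++) {
        wait_for(1);
        if (shared_data != (uint32_t)i)  /* consume what thread 1 wrote */
            mismatches++;
        pass_to(0);
    }
    pthread_join(t1, NULL);
    return mismatches;
}
```

Calling run_handoff(100000) from main() and building with something like `cc -std=c11 -pthread` should exercise it. A return value of 0 only means that no stale read was observed in that run. It is the acquire/release pairing, not the test, that rules stale reads out.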