Re: synchronize with a non-atomic flag

2017-10-07 19:40 GMT+08:00 Akira Yokosawa <akiyks@xxxxxxxxx>:
> On 2017/10/07 15:04:50 +0800, Yubin Ruan wrote:
>> Thanks Paul and Akira,
>>
>> 2017-10-07 3:12 GMT+08:00 Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>:
>>> On Fri, Oct 06, 2017 at 08:35:00PM +0800, Yubin Ruan wrote:
>>>> 2017-10-06 20:03 GMT+08:00 Akira Yokosawa <akiyks@xxxxxxxxx>:
>>>>> On 2017/10/06 14:52, Yubin Ruan wrote:
>>>
>>> [ . . . ]
>>>
>>>>> I/O operations in printf() might make the situation trickier.
>>>>
>>>> printf(3) is claimed to be thread-safe, so I think this issue will not
>>>> concern us.
>>
>> so now I can pretty much confirm this.
>
> Yes. Now I recognize that POSIX.1c requires stdio functions to be MT-safe.
> By MT-safe, I mean that one call to printf() won't be disturbed by other
> concurrent calls that also write to stdout.
>
> I was disturbed by the following description of MT-Safe in the attributes(7)
> man page:
>
>     Being MT-Safe does not imply a function is atomic, nor that it
>     uses any of the memory synchronization mechanisms POSIX exposes
>     to users. [...]
>
> Excerpt from a white paper at http://www.unix.org/whitepapers/reentrant.html:
>
>     The POSIX.1 and C-language functions that operate on character streams
>     (represented by pointers to objects of type FILE) are required by POSIX.1c
>     to be implemented in such a way that reentrancy is achieved (see ISO/IEC
>     9945:1-1996, §8.2). This requirement has a drawback; it imposes
>     substantial performance penalties because of the synchronization that
>     must be built into the implementations of the functions for the sake of
>     reentrancy. [...]
>
> Yubin, thank you for giving me the chance to realize this.
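
A side note on "MT-Safe does not imply a function is atomic": each
printf() call takes the stream lock internally, but a *sequence* of
calls from one thread can still interleave with another thread's
output. Just an illustrative userspace sketch, using POSIX
flockfile()/funlockfile() to hold stdout's lock across several calls:

    #include <stdio.h>
    #include <pthread.h>

    static void *report(void *arg)
    {
        const char *name = arg;

        flockfile(stdout);              /* take stdout's internal lock      */
        printf("[%s] step 1\n", name);
        printf("[%s] step 2\n", name);  /* stays adjacent to step 1         */
        funlockfile(stdout);            /* release the lock                 */
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;

        pthread_create(&t1, NULL, report, (void *)"A");
        pthread_create(&t2, NULL, report, (void *)"B");
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }

With the lock held, thread A's two lines always come out adjacent,
although whether A's or B's pair is printed first is still unspecified.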
>
>>
>>>>> In a more realistic case where you do something meaningful in
>>>>> do_something() in both threads:
>>>>>
>>>>>     //process 1
>>>>>     while(1) {
>>>>>         if(READ_ONCE(flag) == 0) {
>>>>>             do_something();
>>>>>             WRITE_ONCE(flag, 1); // let another process to run
>>>>>         } else {
>>>>>             continue;
>>>>>         }
>>>>>     }
>>>>>
>>>>>     //process 2
>>>>>     while(1) {
>>>>>         if(READ_ONCE(flag) == 1) {
>>>>>             do_something();
>>>>>             WRITE_ONCE(flag, 0); // let another process to run
>>>>>         } else {
>>>>>             continue;
>>>>>         }
>>>>>     }
>>>
>>> In the Linux kernel, there is control-dependency ordering between
>>> the READ_ONCE(flag) and any stores in either the then-clause or
>>> the else-clause.  However, I see no ordering between do_something()
>>> and the WRITE_ONCE().
>>
>> I was not aware of the "control-dependency" ordering issue in the
>> Linux kernel before. Does it hold on all architectures?
>>
>> But anyway, the ordering between READ_ONCE(flag) and any subsequent
>> stores is guaranteed on x86/x64, so we don't need any memory barrier
>> here.
>>
>>>>> and if do_something() uses some shared variables other than "flag",
>>>>> you need a couple of memory barriers to ensure the ordering of
>>>>> READ_ONCE(), do_something(), and WRITE_ONCE() something like:
>>>>>
>>>>>     //process 1
>>>>>     while(1) {
>>>>>         if(READ_ONCE(flag) == 0) {
>>>>>             smp_rmb();
>>>>>             do_something();
>>>>>             smp_wmb();
>>>>>             WRITE_ONCE(flag, 1); // let another process to run
>>>>>         } else {
>>>>>             continue;
>>>>>         }
>>>>>     }
>>>>>
>>>>>     //process 2
>>>>>     while(1) {
>>>>>         if(READ_ONCE(flag) == 1) {
>>>>>             smp_rmb();
>>>>>             do_something();
>>>>>             smp_wmb();
>>>>>             WRITE_ONCE(flag, 0); // let another process to run
>>>>>         } else {
>>>>>             continue;
>>>>>         }
>>>>>     }
>>>
>>> Here, the control dependency again orders the READ_ONCE() against later
>>> stores, and the smp_rmb() orders the READ_ONCE() against any later
>>> loads.
>>
>> Understand and agree.
>>
>>> The smp_wmb() orders do_something()'s writes (but not its reads!)
>>> against the WRITE_ONCE().
>>
>> Understand and agree. But do we really need the smp_rmb() and smp_wmb()
>> on x86/64? As far as I know, on x86/64 loads are not reordered with other
>> loads, and stores are not reordered with other stores...[1]
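
To spell out the "(but not its reads!)" caveat to myself: if
do_something()'s loads also have to complete before the flag is
flipped, the smp_wmb() would have to be promoted to a full smp_mb().
A sketch of process 1 only, under that assumption:

    //process 1, assuming do_something() both reads and writes shared data
    while(1) {
        if(READ_ONCE(flag) == 0) {
            smp_rmb();           // order the READ_ONCE() before do_something()'s loads
            do_something();
            smp_mb();            // order do_something()'s loads *and* stores before the WRITE_ONCE()
            WRITE_ONCE(flag, 1); // let another process to run
        } else {
            continue;
        }
    }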
>>
>>>>> In Linux kernel memory model, you can use acquire/release APIs instead:
>>>>>
>>>>>     //process 1
>>>>>     while(1) {
>>>>>         if(smp_load_acquire(&flag) == 0) {
>>>>>             do_something();
>>>>>             smp_store_release(&flag, 1); // let another process to run
>>>>>         } else {
>>>>>             continue;
>>>>>         }
>>>>>     }
>>>>>
>>>>>     //process 2
>>>>>     while(1) {
>>>>>         if(smp_load_acquire(&flag) == 1) {
>>>>>             do_something();
>>>>>             smp_store_release(&flag, 0); // let another process to run
>>>>>         } else {
>>>>>             continue;
>>>>>         }
>>>>>     }
>>>
>>> This is probably the most straightforward of the above approaches.
>>>
>>> That said, if you really want a series of things to execute in a
>>> particular order, why not just put them into the same process?
>>
>> I will be very happy if I can. But sometimes we just have to deal with
>> issues concerning multiple processes...
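
Incidentally, for the userspace multi-process case, a rough equivalent
of the acquire/release snippet above can be written with C11
<stdatomic.h>. Just a sketch, assuming "flag" is an _Atomic int living
in memory shared between the processes (e.g. set up with mmap()):

    #include <stdatomic.h>

    extern void do_something(void); /* placeholder for the real work       */
    extern _Atomic int *flag;       /* assumed to point into shared memory */

    //process 1
    void process1(void)
    {
        while (1) {
            if (atomic_load_explicit(flag, memory_order_acquire) == 0) {
                do_something();
                atomic_store_explicit(flag, 1, memory_order_release); // let process 2 run
            }
        }
    }

    //process 2: the same loop with the tested and stored values swapped

Here atomic_load_explicit(..., memory_order_acquire) plays the role of
smp_load_acquire(), and atomic_store_explicit(..., memory_order_release)
that of smp_store_release().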
>>
>> [1]: One thing I am a little confused about is that some people claim
>> that on x86/64 there are several guarantees[2]:
>>     1) Loads are not reordered with other loads.
>>     2) Stores are not reordered with other stores.
>>     3) Stores are not reordered with older loads.
>> (note that Loads may still be reordered with older stores to different
>> locations)
>>
>> So, if 1) and 2) are true, why do we have "lfence" and "sfence"
>> instructions at all?
>
> Excerpt from the Intel 64 and IA-32 Architectures Software Developer's Manual, Vol. 3A,
> Section 8.2.5
>
>     [...] Despite the fact that Pentium 4, Intel Xeon, and P6 family
>     processors support processor ordering, Intel does not guarantee
>     that future processors will support this model. To make software
>     portable to future processors, it is recommended that operating systems
>     provide critical region and resource control constructs and API's
>     (application program interfaces) based on I/O, locking, and/or
>     serializing instructions be used to synchronize access to shared
>     areas of memory in multiple-processor systems. [...]
>
> So the answer seems "to make software portable to future processors".

Hmm... so currently these instructions are effectively nops?
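
Partially answering my own question: sfence, at least, is not a nop
when non-temporal (streaming) stores are involved, since those bypass
the usual x86 store-ordering guarantee. A minimal, purely illustrative
sketch using the SSE2 intrinsics from <emmintrin.h>:

    #include <emmintrin.h>   /* _mm_stream_si32(), _mm_sfence() */

    int buf[16];
    volatile int ready;      /* polled by the consumer; illustrative only */

    void producer(void)
    {
        int i;

        for (i = 0; i < 16; i++)
            _mm_stream_si32(&buf[i], i); /* non-temporal stores are weakly ordered */
        _mm_sfence();                    /* drain them before publishing           */
        ready = 1;                       /* consumer then reads buf[]              */
    }

Without the _mm_sfence(), the streaming stores to buf[] could become
visible to another processor after the store to ready.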

Yubin

>
>>
>> [2]: I found those claims here, but I am not so sure whether they
>> are true: https://bartoszmilewski.com/2008/11/05/who-ordered-memory-fences-on-an-x86/
>>
>
--
To unsubscribe from this list: send the line "unsubscribe perfbook" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



