Re: asm volatile statement reordering

On 17/10/17 11:55, Xi Ruoyao wrote:
> On 2017-10-17 11:13 +0200, David Brown wrote:
>> On 17/10/17 09:58, Andrew Haley wrote:
>>> On 17/10/17 08:32, Jeffrey Walton wrote:
>>>> GCC guesses wrong on occasion. It will remove code that has value
>>>> but does not produce an output, because the language does not allow
>>>> us to express it.
>>>>
>>>> The C language lacks the means to express what we need to do. It's a
>>>> failure of the C (and C++) committees. It's not a GCC failure.
>>
>> gcc is a tool that goes above and beyond C and C++.  It is not /just/ a
>> C and C++ compiler.  After all, the language standards don't have "asm"
>> at all.
>>
>> It is important to remember that gcc is used for many types of
>> programming - in particular, it is the most used tool for embedded
>> programming.  And the point of gcc is to be a useful tool for its users.
>> So if users have trouble getting the ordering they need from gcc, then
>> it /is/ a gcc failure.  Sure, it would be nice if the C and C++
>> languages gave the features needed (and C11 and C++11 atomics and fences
>> go some way towards that), but it is up to gcc to handle things that are
>> clearly implementation-dependent behaviour, and compiler extensions.
>> And users use /gcc/ - they don't use a standards document.
>>
>> As gcc gets smarter with its optimisations, and does more code
>> re-organisation, I see more and more problems in embedded code.  We need
>> guarantees about code order - either rules that we /know/ gcc will
>> follow, or additional features to fill the gaps.  gcc developers can't
>> just tell embedded users "You see that code you had that worked fine
>> with gcc 3 to gcc 7?  Well, it won't work now with gcc 8.  You should
>> have studied ISO 9899:201x section 5.1.2.3p6 better".
>>
>> Don't get me wrong here - I am not trying to demand features from gcc.
>> Nor am I suggesting that users (embedded developers or otherwise) should
>> not learn the details of the C language, or that they should not have to
>> read the gcc documentation.
> 
> Yes.  So we need a low-level routine library for embedded environments.
> Either reuse an existing one (for example avr-libc), or customize one for
> your platform.  If it's really necessary, write routines in pure assembly.

Libraries such as avr-libc are written with gcc, in C with gcc
extensions and possibly inline assembly.  Those developers suffer from
exactly the same issues.  And pure assembly does not help - it is the
interaction between assembly and C that is at issue here.

> 
> We can't forbid GCC from performing new optimizations.  The users who
> don't care about embedded development would be angry.

I don't want to forbid new optimisations in gcc - I am very happy to see
steadily more optimisations, as are most embedded developers.  Better
optimisations mean we embedded developers can do more with less
hardware - smaller code means cheaper chips, and faster code means lower
power.

What I want is to be sure that I can use these optimisations without
risking incorrect code.

> 
>> I am trying to establish exactly what gcc does, and what guarantees it
>> may or may not give.  Based on that, I can see if current solutions in
>> source code are strong enough, or if they need workarounds, and if I
>> should be filing an enhancement request.
>>
>>>
>>> I disagree.  If you want a bunch of asms to execute in a particular
>>> order, add a memory clobber or some dependencies.  It's not difficult
>>> once you have the understanding.  In this particular case, fixing it
>>> is trivial, and there are many ways to do it.
>>>
>>
>> I understand about dependencies.  (I also know about memory clobbers,
>> but these tend to be a fairly blunt instrument and can lead to wasted
>> cpu cycles, especially on RISC cpus with lots of registers.  When you
>> are dealing with things like critical sections with interrupts disabled,
>> you are often trying to minimise the cycle time.)  My solution to this
>> particular ordering problem is:
> 
> I performed a simple test on x86_64.  With -O, GCC kept "status" in a
> register.  The clobber just tells GCC there is a dependency and won't
> slow down the program.

The cost of memory clobbers only appears when you have data that could
have been kept in registers, but is forced out (with extra loads and/or
extra stores) due to the clobber.  You don't see it on simple tests, and
you get it much more often on devices with more registers.
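
As a rough illustration of that cost (a sketch of my own, not code from any real project; "compiler_barrier", "counter" and the two "bump_" functions are made-up names): the empty asm below emits no instructions, but its "memory" clobber means gcc has to assume that "counter" may have been read or changed, so it cannot simply keep the running total in a register across the barrier.

```c
#include <stdint.h>

/* Hypothetical helper: a compiler-only barrier.  The asm body is empty,
 * so no instruction is emitted; only the "memory" clobber matters. */
static inline void compiler_barrier(void)
{
    __asm__ volatile ("" ::: "memory");
}

/* A global, so the memory clobber applies to it. */
uint32_t counter;

uint32_t bump_with_clobber(uint32_t n)
{
    for (uint32_t i = 0; i < n; i++) {
        counter += i;
        /* gcc must assume counter is read or written here, so it has to
         * store it before the barrier and reload it afterwards. */
        compiler_barrier();
    }
    return counter;
}

uint32_t bump_without_clobber(uint32_t n)
{
    for (uint32_t i = 0; i < n; i++) {
        /* With no barrier, gcc is free to accumulate in a register and
         * store to counter once after the loop. */
        counter += i;
    }
    return counter;
}
```

Both functions return the same value; the difference is only in the generated code, so the way to see it is to compare the output of "gcc -O2 -S" for the two loops on the target in question.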

> 
>>     uint32_t status;
>>     asm volatile ("mrs %0, PRIMASK" : "=r" (status) :: );
>>     asm volatile ("cpsid i" :: "" (status) :);
>>
>>     foo();
>>
>>     asm volatile ("msr PRIMASK, %0" :: "r" (status) : );
>>
>>
>> The dependency on "status" in the "cpsid i" assembly line ensures that
>> it must be ordered after the "save PRIMASK" line.  However, I know of no
>> clear way - other than a potentially costly memory clobber - of forcing
>> the ordering of the "cpsid i" assembly before foo(), and the "restore
>> PRIMASK" line after foo().
>>
>> But all my testing suggests that asm volatile statements with no outputs
>> are not moved past each other, or past other observable behaviour
>> (volatile memory accesses, external functions that might have such
>> accesses, etc.).
>>
>> If that suggestion is true, then the solution above is not only a
>> correct fix, but it also results in optimal code - while a memory
>> clobber is definitely sub-optimal.
>>
>>
>>
>>
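
For reference, here is the quoted sequence wrapped up as a pair of helpers.  This is a minimal sketch under assumptions: "irq_save", "irq_restore", "fake_primask", "shared_counter" and "critical_increment" are hypothetical names, the sketch uses a plain "r" constraint on the dummy input, and the non-ARM branch is only a stand-in so that the code compiles on a host machine (it does nothing to real interrupts).  It also relies on the same observation from testing as above, that asm volatile statements are not moved past each other or past volatile accesses; that is not something I can point to as a documented guarantee.

```c
#include <stdint.h>

#if defined(__ARM_ARCH_6M__) || defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7EM__)
/* Cortex-M: save PRIMASK, then disable interrupts. */
static inline uint32_t irq_save(void)
{
    uint32_t status;
    __asm__ volatile ("mrs %0, PRIMASK" : "=r" (status));
    /* status is not used by the instruction; passing it as an input only
     * creates a data dependency on the previous asm statement. */
    __asm__ volatile ("cpsid i" :: "r" (status));
    return status;
}

static inline void irq_restore(uint32_t status)
{
    __asm__ volatile ("msr PRIMASK, %0" :: "r" (status));
}
#else
/* Host stand-in (assumption, for illustration only): a volatile variable
 * plays the role of PRIMASK. */
static volatile uint32_t fake_primask;

static inline uint32_t irq_save(void)
{
    uint32_t status = fake_primask;
    fake_primask = 1;
    return status;
}

static inline void irq_restore(uint32_t status)
{
    fake_primask = status;
}
#endif

/* Volatile, so accesses to it count as observable behaviour. */
static volatile uint32_t shared_counter;

uint32_t critical_increment(void)
{
    uint32_t status = irq_save();
    shared_counter++;              /* the protected work, in place of foo() */
    irq_restore(status);
    return shared_counter;
}
```

The counter is volatile on purpose: without a memory clobber, nothing here stops gcc from moving ordinary non-volatile accesses out of the critical section.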



