On 17/10/17 11:55, Xi Ruoyao wrote:
> On 2017-10-17 11:13 +0200, David Brown wrote:
>> On 17/10/17 09:58, Andrew Haley wrote:
>>> On 17/10/17 08:32, Jeffrey Walton wrote:
>>>> GCC guesses wrong on occasion. It will remove code that has value that
>>>> but does not produce an output because the language does not allow us
>>>> to express it.
>>>>
>>>> The C language lacks what we need to express what we need to do. Its a
>>>> failure of the C (and C++) committees. Its not a GCC failure.
>>
>> gcc is a tool that goes above and beyond C and C++. It is not /just/ a
>> C and C++ compiler. After all, the language standards don't have "asm"
>> at all.
>>
>> It is important to remember that gcc is used for many types of
>> programming - in particular, it is the most used tool for embedded
>> programming. And the point of gcc is to be a useful tool for its users.
>> So if users have trouble getting the ordering they need from gcc, then
>> it /is/ a gcc failure. Sure, it would be nice if the C and C++
>> languages gave the features needed (and C11 and C++11 atomics and fences
>> go some way towards that), but it is up to gcc to handle things that are
>> clearly implementation-dependent behaviour, and compiler extensions.
>> And users use /gcc/ - they don't use a standards document.
>>
>> As gcc gets smarter with its optimisations, and doing more code
>> re-organisation, I see more and more problems in embedded code. We need
>> guarantees about code order - either rules that we /know/ gcc will
>> follow, or additional features to fill the gaps. gcc developers can't
>> just tell embedded users "You see that code you had that worked fine
>> with gcc 3 to gcc 7? Well, it won't work now with gcc 8. You should
>> have studied ISO 9899:201x section 5.1.2.3p6 better".
>>
>> Don't get me wrong here - I am not trying to demand features from gcc.
>> Nor am I suggesting that users (embedded developers or otherwise) should
>> not learn the details of the C language, or that they should not have to
>> read the gcc documentation.
>
> Yes. So we need a low-level routine library for embedded environment.
> Either reuse an existing one (for example avr-libc), or customize one for
> your platform. If it's really necessary, write routines in pure assembly.

Libraries such as avr-libc are written with gcc, in C with gcc
extensions and possibly inline assembly. Those developers suffer from
exactly the same issues. And pure assembly does not help - it is the
interaction between assembly and C that is at issue here.

>
> We can't forbid GCC to perform new optimizations. The users who don't
> care about embedded developing would be angry.

I don't want to forbid new optimisations in gcc - I am very happy to see
steadily more optimisations, as are most embedded developers. Better
optimisations mean we embedded developers can do more with less
hardware - smaller code means cheaper chips, and faster code means lower
power. What I want is to be sure that I can use these optimisations
without risking incorrect code.

>
>> I am trying to establish exactly what gcc does, and what guarantees it
>> may or may not give. Based on that, I can see if current solutions in
>> source code are strong enough, or if they need workarounds, and if I
>> should be filing an enhancement request.
>>
>>>
>>> I disagree. If you want a bunch of asms to execute in a particular
>>> order, add a memory clobber or some dependencies. It's not difficult
>>> once you have the understanding. In this particular case, fixing it
>>> is trivial, and there are many ways to do it.
>>>
>>
>> I understand about dependencies. (I also know about memory clobbers,
>> but these tend to be a fairly blunt instrument and can lead to wasted
>> cpu cycles, especially on RISC cpus with lots of registers.
>> When you are dealing with things like critical sections with interrupts
>> disabled, you are often trying to minimise the cycle time.) My solution
>> to this particular ordering problem is:
>
> I performed a simple test on x86_64. With -O GCC kept "status" in
> register. The clobber just tells GCC there is a dependency and won't
> slow down the program.

The cost of memory clobbers only appears when you have data that could
have been kept in registers, but is forced out (with extra loads and/or
extra stores) due to the clobber. You don't see it on simple tests, and
you get it much more often on devices with more registers.

>
>> 	uint32_t status;
>> 	asm volatile ("mrs %0, PRIMASK" : "=r" (status) :: );
>> 	asm volatile ("cpsid i" :: "" (status) :);
>>
>> 	foo();
>>
>> 	asm volatile ("msr PRIMASK, %0" :: "r" (status) : );
>>
>>
>> The dependency on "status" in the "cpsid i" assembly line ensures that
>> it must be ordered after the "save PRIMASK" line. However, I know of no
>> clear way - other than a potentially costly memory clobber - of forcing
>> the ordering of the "cpsid i" assembly before foo(), and the "restore
>> PRIMASK" line after foo().
>>
>> But all my testing suggests that asm volatile statements with no outputs
>> are not moved past each other, or past other observable behaviour
>> (volatile memory accesses, external functions that might have such
>> accesses, etc.).
>>
>> If that suggestion is true, then the solution above is not only a
>> correct fix, but it also results in optimal code - while a memory
>> clobber is definitely sub-optimal.
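
To make the cost argument about memory clobbers a little more concrete,
here is a minimal sketch. It is not code from this thread: the helper
names (barrier_memory, barrier_value, sum_clobber, sum_dependency) are
hypothetical, and the asm bodies are deliberately empty so that it builds
on any gcc target rather than only on Cortex-M. It assumes gcc or a
compatible compiler. Both loops compute the same result; the difference,
if any, is only in how often the running total has to be written to and
read back from memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: empty asm with a "memory" clobber.  The compiler
   must assume the asm may read or write any memory. */
static inline void barrier_memory(void)
{
    __asm__ volatile ("" ::: "memory");
}

/* Hypothetical helper: empty asm that only consumes and redefines v.
   It creates a dependency on this one value and says nothing about
   memory, so it does NOT order ordinary loads and stores. */
static inline uint32_t barrier_value(uint32_t v)
{
    __asm__ volatile ("" : "+r" (v));
    return v;
}

/* With the clobber in the loop, *total is expected to be stored and
   reloaded on every iteration, since the asm might touch it. */
uint32_t sum_clobber(const uint32_t *restrict data, size_t n,
                     uint32_t *restrict total)
{
    assert(total != NULL);
    *total = 0;
    for (size_t i = 0; i < n; i++) {
        *total += data[i];
        barrier_memory();
    }
    return *total;
}

/* With only a value dependency, the compiler is free to keep the
   running total in a register and store it once after the loop. */
uint32_t sum_dependency(const uint32_t *restrict data, size_t n,
                        uint32_t *restrict total)
{
    assert(total != NULL);
    *total = 0;
    for (size_t i = 0; i < n; i++) {
        *total += barrier_value(data[i]);
    }
    return *total;
}
```

Compiling this with something like "gcc -O2 -S" and comparing the two
loops should show whether the clobber costs anything on a given target;
the exact output depends on the architecture and compiler version. Note
that the value-dependency form gives no guarantee about ordering against
a call such as foo(), which is exactly the open question above.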