Re: Optimisations and undefined behaviour

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/11/15 09:20, David Brown wrote:
> On 09/11/15 17:31, Andrew Haley wrote:
>> On 11/09/2015 03:56 PM, David Brown wrote:
>>
>>> We typically cannot use "sanatize" options, nor can we accept that a
>>> bug in one part of the program causes undue and unnecessarily
>>> damaging side-effects in other parts.
>>
>> Well, you have to get used to that.  It is reality: computers work
>> that way.  I'm sure you know that if you hit the wrong I/O port
>> with a wild write odd things will happen.  Whether that's "undue" or
>> "unnecessary" I couldn't say: it just is.
> 
> There is no doubt that with C, it is easy to make mistakes that can lead
> to disaster.  One little typo, and you access an array outside of its
> bounds - with the result of corrupting a completely independent piece of
> data, showing up as a problem in a different part of the program that
> was already tested and working fine.  These things are part of the risk
> of C programming - it's the cost you pay for maximum speed and efficiency.
> 
> But let's look back at Annex L in the C11 standards, and ask what it is
> /for/.  It is labelled "Analyzability" - the aim is to help developers
> (and code reviewers, and static analysers) figure out if code is correct
> or not.  Part of that is to place bounds on the effects of some types of
> undefined behaviour.  Now, you can certainly argue that Annex L doesn't
> go far enough to be useful, or it is too vague, or that it is
> impractical or limiting to implement, or that knock-on effects can
> quickly turn bounded undefined behaviour into unbounded undefined
> behaviour.  But the very existence of this Annex in the C standards
> shows how important these things are.
> 
> To my mind, if execution hits UB, then it's a bug in the program.  It
> doesn't really matter if the bug is UB, or an error in the algorithm, or
> a coding mistake, or even a mistake in the customer's requirement
> specifications.  It's a point where the computer does not do what the
> user expects it to do.  Is it so unreasonable to want code that is
> correct up to that bug, to run correctly until the bug is hit?  Is it
> unreasonable to want the compiler to warn if it sees such a bug, and
> uses it to change the past action of the code?
> 
> I know that once code hits the bug, all sorts of things can go wrong.
> All I ask is that the compiler should not make things worse - at least
> not without informing the developer.
> 
> 
>>
>> C definitely works that way.  Maybe there should be a nice small
>> language which is useful for embedded developers and doesn't have
>> all the interesting UB properties that C has.  (Ada, maybe?  Probably
>> not.)  Maybe you could define a language compatible with C with the UB
>> removed.  But defining the semantics of such a language would not be
>> easy.  And I don't think it makes much sense to change GCC without
>> such a rigorous language definition.
>>
> 
> I don't believe it is possible to make a general and efficient
> programming language without UB.  And even if it could be eliminated at
> the language level, it would still exist at a higher level - a square
> root function could well have undefined behaviour for negative inputs.
> I also am happy that gcc can make many small optimisations based on its
> knowledge of UB - strength reduction and constant folding in integer
> expressions is a key example.
> 
> But I can also see the potential for optimisations based on UB to make
> developers' lives much more difficult by working logically backwards in
> time.
> 

You have to remember that a lot (but by no means all) of UB in C comes
from the desire to have an efficient language that is portable to many
architectures.  C has to support 'int' of unspecified maximum size,
machines that have two's complement, one's complement and sign-magnitude
arithmetic, multiple different floating point formats.  The way the
language committee has enabled such support without requiring the tools
to produce grossly inefficient code is to say that if you step outside
the limits of the machine you are running on, then the behaviour is
undefined.  However, it didn't specifically qualify what had to happen
in those cases and didn't distinguish them from other cases where UB can
occur; it just said, if this happens your program is broken.
Furthermore, it said quite clearly that compilers do not HAVE to detect
such brokenness - because in the extreme cases that would be very
expensive or even impossible to prove.

(I would add at this point, that it is also very hard at times to prove
something would invoke UB, so the ability to emit a diagnostic could
depend on previous optimizations done on the code before the check is
performed -- you only have to look at the history of
-Wmaybe-uninitialized to realize how hard it is to get things like this
right; and false positives can be as painful for developers as missing
warnings or warnings that seem to appear and disappear at random).

> (Note that in all my testing, I have not found gcc to perform any of
> these unwanted "optimisations" that I fear - but I haven't found good
> reason to be sure that they won't be performed in the future.)
> 

And I doubt we would ever make such a guarantee.  It might severely
hamper GCC's stated goal of being a production-quality optimizing compiler.

> 
> And yes, I realise that taken to extremes, this is ultimately an
> impossible task.  The point is merely to aim in that direction.  I don't
> expect gcc to warn on every possible error - but I am happy to see it
> warning on more and more possible errors.  What I don't want to see is
> is that since it is impossible to know /exactly/ what the programmer
> wanted when there is UB in the code, that means the compiler can refuse
> all responsibility for trying.  That would IMHO be a disservice to the
> user, making it harder to find and fix bugs.






[Index of Archives]     [Linux C Programming]     [Linux Kernel]     [eCos]     [Fedora Development]     [Fedora Announce]     [Autoconf]     [The DWARVES Debugging Tools]     [Yosemite Campsites]     [Yosemite News]     [Linux GCC]

  Powered by Linux