On 2018-07-12 15:38 -0400, Ignitus Boyone wrote:
> I believe the definition of undefined behavior is simply: not defined
> in the C/C++ specification.
>
> https://en.cppreference.com/w/cpp/language/ub
>
> This means that the implementation is responsible for deciding what
> should be done instead of the spec.  Often undefined behavior is just
> where the road ends, because you shouldn't do it in the first place.

From ISO/IEC 9899:1999 3.4.3 para 1:

> *undefined behavior*
> behavior, upon use of a nonportable or erroneous program construct
> or of erroneous data, for which this International Standard imposes
> no requirements

There are "no requirements".  The implementation can just assume that
there is no undefined behavior in the program.

> I feel a major point of this thread is to erase the idea that
> undefined means unpredictable.

No.  Undefined behavior is *totally* unpredictable.

> Writing the variable using ptr arithmetic is very predictable, but
> when you go past the bounds you might overwrite any number of things.

No.  It is *not* predictable.  When you see a snippet of code like this:

#include <ctype.h>

int bar(char *y);

int foo(const char *y)
{
    int i;
    char x[4] = {0, 0, 0, 0};
    for (i = 0; islower(y[i]); i++)
        x[i] = y[i];
    return bar(x);
}

you may think "well, this is undefined if y is `abcde` because it will
overwrite the memory beyond the array x".  This is *wrong*.

The reason is: the compiler may notice that if i >= 4, there will be
undefined behavior.  So the compiler can assume i < 4.  Then the
compiler may decide to unroll the loop to optimize the program:

    if (!islower(y[0])) goto ret;
    x[0] = y[0];
    if (!islower(y[1])) goto ret;
    x[1] = y[1];
    if (!islower(y[2])) goto ret;
    x[2] = y[2];
    if (!islower(y[3])) goto ret;
    x[3] = y[3];
ret:
    return bar(x);

Note that this optimized program will *never* overwrite the memory
beyond x.  So you can't even predict whether the overwrite will happen.

If someone believes that "int y[10]; y[-1] = 42;" is undefined because
"it would overwrite the memory before y, and mess up the data", what
would he do next?  Well, he may use a linker script to assign the
memory like:

    0xffff8000 - 0xffff8027: int x[10];
    0xffff8028 - 0xffff804f: int y[10];

Then he will say "Well, I can use y[-10] to y[9] now, because doing
that will just overwrite array x, with no other side effect!"

It's totally wrong.  If you write something like

    if (a != 42)
        y[-5] = 1;
    else
        printf("%d\n", y[5]);
    printf("%d\n", a);

then the compiler could say "What?  y[-5] = 1 is ridiculous because it
invokes undefined behavior.  So I can assume a == 42."  Then the entire
snippet becomes:

    printf("%d\n", y[5]);
    puts("42");

So y[-5] = 1 is undefined behavior, not because it will mess up the
memory, but because *the standard says so and the compiler can assume
it won't happen*.

My example above is based on [1].  In [1] Steve Summit explained that
we should *not* guess "how an undefined behavior will behave in
practice".  Undefined behavior is just not predictable at all.

In Mahmood's example he is trying to overflow a program's buffer.  It
definitely invokes undefined behavior.  So he has to analyze the
(maybe disassembled) object code of the program, *not* its C source
code.  We can't predict "what will happen on buffer overflow" by
reading C code, because GCC (and other C compilers) actually assume
there is no such thing.

[1] http://www.eskimo.com/~scs/readings/undef.950321.html

-- 
Xi Ruoyao <ryxi@xxxxxxxxxxxxxxxxx>
School of Aerospace Science and Technology, Xidian University
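
For anyone who wants to see this for themselves, below is a compilable
sketch of the foo() example above.  The bar() body, the main() harness,
the "%.4s" format, and the test string "abcde" are assumptions added
only so the file builds; compiling it with something like
gcc -O2 -S and reading the generated assembly shows what the compiler
actually did with the loop.

/* Compilable sketch of the foo() example.  bar(), main(), and the
   "%.4s" format are additions made only so this builds; they are not
   part of the original snippet. */
#include <ctype.h>
#include <stdio.h>

int bar(char *y)
{
    /* Print at most 4 bytes, since x carries no terminator. */
    return printf("%.4s\n", y);
}

int foo(const char *y)
{
    int i;
    char x[4] = {0, 0, 0, 0};
    for (i = 0; islower(y[i]); i++)
        x[i] = y[i];        /* when i == 4 this is the undefined write */
    return bar(x);
}

int main(void)
{
    /* "abcde" has 5 lowercase letters, so the naive reading says the
       loop writes one byte past the end of x. */
    return foo("abcde");
}

Whether that out-of-bounds write exists at all in the compiled binary
depends entirely on what the optimizer decided to do, which is exactly
the point made above.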