On Fri, Jan 20, 2023 at 02:12:07PM +0100, Alejandro Colomar wrote:
> > Is the undefined behavior here a real world issue
> > anywhere, or is this just a theoretical issue based on interpretation of the C
> > standard?
>
> All implementations of sscanf(3) produce Undefined Behavior (UB), AFAIK.
> How much you consider UB to be a real-world issue differs for each
> programmer, but I tend to consider all UB to be as bad as nasal demons. I'm
> not saying UB shouldn't exist, just that you shouldn't invoke it. And a
> function that is used for scanning user input is one of those places where
> you really want to avoid invoking UB.

Well, according to the C standard, the behavior is undefined when the value
scanned by an integer conversion specifier does not fit in the type. That's
clearly a bug in the standard. Obviously, implementations need not implement
that bug, because it is stupid -- much stupider than other cases of UB in the
standard.

It's not unreasonable to focus on what implementations actually do. In
general, compilers optimize code assuming that undefined behavior never
occurs. However, this specific type of undefined behavior is pretty obscure,
and it is hard to think of any compiler optimization that would apply to it.
I'm doubtful that any has been implemented.

UBSAN does not detect this as undefined behavior either. (Tested with
'sscanf("99999999999999999999", "%d", &x)' with gcc 12.2.0 and clang 14.0.6.)

The remaining question, then, would be whether any actual sscanf()
implementation would actually do something "bad" given one of these
"undefined" inputs.

I think the main pitfall would be that a naive implementation of %d and other
*signed* integer conversion specifiers would execute a signed integer
multiplication that overflows. That's a better-established case of undefined
behavior, though some implementations make that behavior defined too.
Unsigned specifiers such as %u should fare better, as unsigned integer
overflow has defined behavior.
I.e., it would be much harder to write an implementation of %u that invokes
undefined behavior due to the value being too large.

It might be fair to say that behavior here is de facto implementation-defined,
despite the standard saying undefined...

Anyway, if you do go the hard-line route of "undefined is undefined, so let's
deprecate", you need to make it clear (a) who is doing the deprecation, (b)
what the actual issue is, and (c) what the replacement is.

- Eric
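
For illustration of point (c) only, and not something proposed in the message
above: strtol(3) is one standard function whose behavior on out-of-range
input is fully defined (it returns LONG_MIN or LONG_MAX and sets errno to
ERANGE). A minimal sketch of an int parser built on it follows. The name
parse_int() and its return convention are hypothetical, chosen just for this
example.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * parse_int() is a hypothetical helper, not part of any library.
 *
 * Parses s as a base-10 integer (strtol() also accepts leading whitespace
 * and an optional sign).  On success, stores the value in *out and
 * returns 0.  On failure, returns -1 and sets errno to:
 *   EINVAL  if s has no digits or has trailing characters,
 *   ERANGE  if the value does not fit in an int.
 */
int
parse_int(const char *s, int *out)
{
	char *end;
	long v;

	errno = 0;
	/* On overflow strtol() returns LONG_MIN or LONG_MAX, errno = ERANGE. */
	v = strtol(s, &end, 10);

	if (end == s || *end != '\0') {
		errno = EINVAL;
		return -1;
	}
	/* The second and third tests matter where long is wider than int. */
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX) {
		errno = ERANGE;
		return -1;
	}

	*out = (int) v;
	return 0;
}
```

With this sketch, parse_int("99999999999999999999", &x) fails with ERANGE
and leaves x untouched, instead of depending on what a given sscanf()
implementation happens to do with "%d".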