On 10/16/2017 11:45 AM, Martin Sebor wrote: > On 10/16/2017 06:28 AM, Mason wrote: >> Hello, >> >> I do note that the documentation for -Wformat-truncation states: >> "While enabling optimization will in most cases improve the accuracy >> of the warning, it may also result in false positives." >> >> >> However, consider the following test cases: > > The implementation of the warning uses the get_range_info() function > to determine the range of sprintf arguments and its ability to avoid > false positives and negatives is subject to the accuracy of the range > information made available by this API. (See the output of > -fdump-tree-printf-return-value for the ranges the warning pass > sees, and the bugs in Bugzilla tracking these issues.) This is > a general problem that affects all clients of the get_range_info() > function. > > It is possible to do better, for instance by avoiding the API and > determining the range of each argument by traversing the control > flow graph. Each affected warning (and optimization) pass could > do this itself but since Andrew MacLeod and Aldy Hernandez are > working on a general API to do just that it seems like making use > of their feature when it's available and avoiding the duplication > of effort and code is a better way to go. > > As an aside, another option to improve the accuracy would be to > integrate all these passes into VRP where the range info is fully > accurate. But form a design standpoint this doesn't seem like as > good an approach to me because it would increase coupling and > bloat the VRP pass with even more logic. In addition, warnings > like -Wformat-truncation would also benefit from better string > length information and so the same arguments could be made both > for and against integrating them into the tree-ssa-strlen pass > (though I suspect the string length integration will at some > point be necessary, unless some general purpose > get_string_length_range()-like API is developed first). > >> $ cat testcase.c >> #include <stdio.h> >> typedef unsigned long long u64; >> void foo(u64 arg, char *buf) >> { >> u64 u = arg / 1000000000; >> if (u >= 1000) return; >> snprintf(buf, 16, "%llu.XXXZZZ", u); >> } >> >> $ gcc-7 -O3 -Wall -S testcase.c >> testcase.c: In function 'foo': >> testcase.c:7:20: warning: '.XXXZZZ' directive output may be truncated >> writing 7 bytes into a region of size between 5 and 15 >> [-Wformat-truncation=] >> snprintf(buf, 16, "%llu.XXXZZZ", u); >> ^~~~~~~~~~~~~ >> In file included from /usr/include/stdio.h:937:0, >> from testcase.c:1: >> /usr/include/x86_64-linux-gnu/bits/stdio2.h:64:10: note: >> '__builtin___snprintf_chk' output between 9 and 19 bytes into a >> destination of size 16 >> return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, >> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> __bos (__s), __fmt, __va_arg_pack ()); >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> >> >> Because we only call snprintf when u < 1000, it is obvious to a human >> reader that the output cannot overflow the buffer. > > Right. Unfortunately, it's not what the warning pass sees. > The same limitation, incidentally, also triggers false positives > in other warnings. For instance, with -Walloca-larger-than=999 > in this test case: > > void bar (void*); > > void foo(u64 arg) > { > u64 u = arg / 1000000000; > if (u >= 1000) return; > bar (__builtin_alloca (u)); > } > > we get: > > warning: argument to ‘alloca’ may be too large [-Walloca-larger-than=] > bar (__builtin_alloca (u)); > ^~~~~~~~~~~~~~~~~~~~~~~~~~ > note: limit is 1000 bytes, but argument may be as large as 18446744073 > >> Strangely, the following test case does not warn. >> >> $ cat testcase2.c >> #include <stdio.h> >> typedef unsigned long long u64; >> void foo(u64 arg, char *buf) >> { >> snprintf(buf, 16, "%llu.XXXZZZ", arg); >> } >> >> Here, a warning would be quite useful... But none is given. > > The difference between these two cases is that in the first one > the argument is known to be bounded by a range (unfortunately, > the range exposed by get_range_info() here appears to be less > than accurate), while in the second one the argument has no > range associated with it. The heuristic the warning uses is > that a variable constrained to any range triggers the checking > (on the basis that the programmer tries to reduce its range but > didn't do good enough of a job), while one that's not does not > (the constraint may be in a different function in another > translation unit). So following-up. If I convert the sprintf bits to a domwalk and embed an evrp style class to compute the range data. I get this at the key point: (gdb) p info.callstmt $18 = (gimple *) 0x7ffff01ca498 (gdb) p debug_gimple_stmt ($18) # .MEM_6 = VDEF <.MEM_4(D)> snprintf (buf_5(D), 16, "%llu.XXXZZZ", u_3); $19 = void (gdb) p dump_all_value_ranges (stderr) arg_2(D): [0, 999999999999] EQUIVALENCES: { arg_2(D) } (1 elements) u_3: [0, 999] Note that this range data is the form used within tree-vrp and is not attached to the SSA_NAME. So it's not visible to get_range_info. But we can clearly see that it's possible to get good tight range info for u at the call to sprintf. Jeff