On 10/16/2017 06:28 AM, Mason wrote:
Hello, I do note that the documentation for -Wformat-truncation states: "While enabling optimization will in most cases improve the accuracy of the warning, it may also result in false positives." However, consider the following test cases:
The implementation of the warning uses the get_range_info() function to determine the range of sprintf arguments and its ability to avoid false positives and negatives is subject to the accuracy of the range information made available by this API. (See the output of -fdump-tree-printf-return-value for the ranges the warning pass sees, and the bugs in Bugzilla tracking these issues.) This is a general problem that affects all clients of the get_range_info() function. It is possible to do better, for instance by avoiding the API and determining the range of each argument by traversing the control flow graph. Each affected warning (and optimization) pass could do this itself but since Andrew MacLeod and Aldy Hernandez are working on a general API to do just that it seems like making use of their feature when it's available and avoiding the duplication of effort and code is a better way to go. As an aside, another option to improve the accuracy would be to integrate all these passes into VRP where the range info is fully accurate. But form a design standpoint this doesn't seem like as good an approach to me because it would increase coupling and bloat the VRP pass with even more logic. In addition, warnings like -Wformat-truncation would also benefit from better string length information and so the same arguments could be made both for and against integrating them into the tree-ssa-strlen pass (though I suspect the string length integration will at some point be necessary, unless some general purpose get_string_length_range()-like API is developed first).
$ cat testcase.c #include <stdio.h> typedef unsigned long long u64; void foo(u64 arg, char *buf) { u64 u = arg / 1000000000; if (u >= 1000) return; snprintf(buf, 16, "%llu.XXXZZZ", u); } $ gcc-7 -O3 -Wall -S testcase.c testcase.c: In function 'foo': testcase.c:7:20: warning: '.XXXZZZ' directive output may be truncated writing 7 bytes into a region of size between 5 and 15 [-Wformat-truncation=] snprintf(buf, 16, "%llu.XXXZZZ", u); ^~~~~~~~~~~~~ In file included from /usr/include/stdio.h:937:0, from testcase.c:1: /usr/include/x86_64-linux-gnu/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output between 9 and 19 bytes into a destination of size 16 return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ __bos (__s), __fmt, __va_arg_pack ()); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Because we only call snprintf when u < 1000, it is obvious to a human reader that the output cannot overflow the buffer.
Right. Unfortunately, it's not what the warning pass sees. The same limitation, incidentally, also triggers false positives in other warnings. For instance, with -Walloca-larger-than=999 in this test case: void bar (void*); void foo(u64 arg) { u64 u = arg / 1000000000; if (u >= 1000) return; bar (__builtin_alloca (u)); } we get: warning: argument to ‘alloca’ may be too large [-Walloca-larger-than=] bar (__builtin_alloca (u)); ^~~~~~~~~~~~~~~~~~~~~~~~~~ note: limit is 1000 bytes, but argument may be as large as 18446744073
Strangely, the following test case does not warn. $ cat testcase2.c #include <stdio.h> typedef unsigned long long u64; void foo(u64 arg, char *buf) { snprintf(buf, 16, "%llu.XXXZZZ", arg); } Here, a warning would be quite useful... But none is given.
The difference between these two cases is that in the first one the argument is known to be bounded by a range (unfortunately, the range exposed by get_range_info() here appears to be less than accurate), while in the second one the argument has no range associated with it. The heuristic the warning uses is that a variable constrained to any range triggers the checking (on the basis that the programmer tries to reduce its range but didn't do good enough of a job), while one that's not does not (the constraint may be in a different function in another translation unit).
And this tweak of testcase.c makes the warning go away: $ cat testcase3.c #include <stdio.h> typedef unsigned long long u64; void foo(u64 arg, char *buf) { if (arg >= 1000000000000ULL) return; u64 u = arg / 1000000000; snprintf(buf, 16, "%llu.XXXZZZ", u); }
Here the range information is accurate.
Maybe the bounds-tracking code has a few blind spots? NB: Computing 'u' *before* testing arg retriggers the warning.
The blind spots I know about are all either due to the poor range or other information (e.g., string lengths are only available for constants). Martin