On Mon, Feb 07, 2022 at 06:44:55PM +0200, Ari Sundholm wrote:
> Hello, Al,
>
> On 2/7/22 16:58, Al Viro wrote:
> > On Mon, Feb 07, 2022 at 02:07:11PM +0200, Ari Sundholm wrote:
> > > The function generic_copy_file_checks() checks that the ends of the
> > > input and output file ranges do not overflow. Unfortunately, there is
> > > an issue with the check itself.
> > >
> > > Due to the integer promotion rules in C, the expressions
> > > (pos_in + count) and (pos_out + count) have an unsigned type because
> > > the count variable has the type uint64_t. Thus, in many cases where we
> > > should detect signed integer overflow to have occurred (and thus one or
> > > more of the ranges being invalid), the expressions will instead be
> > > interpreted as large unsigned integers. This means the check is broken.
> >
> > I must be slow this morning, but... which values of pos_in and count are
> > caught by your check, but not by the original?
> >
>
> Thank you for your response and questions.
>
> Assuming an x86-64 target platform, please consider:
>
> loff_t pos_out = 0x7FFFFFFFFFFEFFFFLL;
> and
> uint64_t count = 65537;
>
> The type of the expression (pos_out + count) is a 64-bit unsigned type, by
> C's integer promotion rules. Its value is 0x8000000000000000ULL, that is,
> bit 63 is set.
>
> The comparison (pos_out + count) < pos_out, again due to C's integer
> promotion rules, is unsigned. Thus, the comparison, in this case, is
> equivalent to:
>
> 0x8000000000000000ULL < 0x7FFFFFFFFFFEFFFFULL,
>
> which is false. Please note that the LHS is not expressible as a positive
> integer of type loff_t. With larger values for count, the problem should
> become quite obvious, as some of the offsets within the file would not be
> expressible as positive integers of type loff_t. But I digress. As we can
> see above, the overflow is missed.
>
> With the LHS explicitly cast to loff_t, the comparison is equivalent to:
>
> 0x8000000000000000LL < 0x7FFFFFFFFFFEFFFFLL,
>
> which is true, as the LHS is negative.
>
> This has also been verified in practice, and was detected when running tests
> on special cases of the copy_file_range syscall on different filesystems.

Er... I still don't see the problem here. If the destination filesystem
explicitly allows offsets in excess of 2^63, what's the point in that
-EOVERFLOW? And if it doesn't, you'll get count truncated by
generic_write_check_limits(), down to the amount remaining until the
fs limit...

Same on the input side - if your source file is at least 2^63, what's the
problem? And if not, you'll get count capped by file size - pos_in, right
under that check...

Which filesystems had been involved and what was the test?
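
For reference, the two comparisons discussed above, and the capping that the
reply refers to, can be modelled in user space. This is only a sketch and
not the kernel code: pos_t stands in for loff_t, and wraps_unsigned(),
wraps_signed() and cap_count() are hypothetical names made up for this
illustration. cap_count() is a simplified model of "count truncated down to
the amount remaining until the limit" and assumes a non-negative pos.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's loff_t (signed 64-bit on x86-64). */
typedef int64_t pos_t;

/*
 * The check as described in the thread. Because count is uint64_t, the
 * compiler converts pos to uint64_t for both the addition and the
 * comparison; the casts below only make that conversion explicit.
 */
bool wraps_unsigned(pos_t pos, uint64_t count)
{
	return (uint64_t)pos + count < (uint64_t)pos;
}

/*
 * The proposed variant: the sum is cast back to the signed type before
 * comparing. Converting an out-of-range value to a signed type is
 * implementation-defined in C; GCC and Clang wrap modulo 2^64.
 */
bool wraps_signed(pos_t pos, uint64_t count)
{
	return (pos_t)((uint64_t)pos + count) < pos;
}

/*
 * Hypothetical, simplified model of the capping mentioned in the reply:
 * shrink *count to the room left before max_bytes. Returns false when
 * pos is already at or beyond the limit. Assumes pos >= 0.
 */
bool cap_count(pos_t pos, uint64_t *count, pos_t max_bytes)
{
	uint64_t room;

	if (pos >= max_bytes)
		return false;

	room = (uint64_t)(max_bytes - pos);
	if (*count > room)
		*count = room;
	return true;
}
```

With pos = 0x7FFFFFFFFFFEFFFFLL and count = 65537, wraps_unsigned() returns
false and wraps_signed() returns true, matching the arithmetic in the quoted
mail. If count is first passed through cap_count() with a limit of INT64_MAX,
it is reduced to 65536, and neither comparison reports a wrap for the capped
range.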