Re: [RFC][PATCH] fix csum_and_copy_..._user() idiocy. Re: AW: [PATCH] amd64: Fix csum_partial_copy_generic()

Al Viro <viro@xxxxxxxxxxxxxxxxxx> · Tue, 24 Oct 2023 04:53:50 +0100

On Mon, Oct 23, 2023 at 02:44:13PM +0000, David Laight wrote:
> From: Al Viro
> > Sent: 22 October 2023 20:40
> ....
> > We need a way for csum_and_copy_{from,to}_user() to report faults.
> > The approach taken back in 2020 (avoid 0 as return value by starting
> > summing from ~0U, use 0 to report faults) had been broken; it does
> > yield the right value modulo 2^16-1, but the case when data is
> > entirely zero-filled is not handled right.  It almost works, since
> > for most of the codepaths we have a non-zero value added in
> > and there 0 is not different from anything divisible by 0xffff.
> > However, there are cases (ICMPv4 replies, for example) where we
> > are not guaranteed that.
> > 
> > In other words, we really need to have those primitives return 0
> > on filled-with-zeroes input.  So let's make them return a 64bit
> > value instead; we can do that cheaply (all supported architectures
> > do that via a couple of registers) and we can use that to report
> > faults without disturbing the 32bit csum.
> 
> Does the ICMPv4 sum need to be zero if all zeros but 0xffff
> if there are non-zero bytes in there?

No.  RTFRFC, please.  Or the discussion of the bug upthread, for that
matter.

> IIRC the original buggy case was fixed by returning 0xffff
> for the all-zero buffer.

YRIC.  For the benefit of those who can pass the Turing test better than
ChatGPT would:

Define a binary operation on [0..0xffff] by

	X PLUS Y = X + Y - ((X + Y > 0xffff) ? 0xffff : 0)

Properties:
	X PLUS Y \in [0..0xffff]
	X PLUS Y = 0 iff X = Y = 0
	X PLUS Y is congruent to X + Y modulo 0xffff
	X PLUS Y = Y PLUS X
	(X PLUS Y) PLUS Z = X PLUS (Y PLUS Z)
	X PLUS 0 = X
	For any non-zero X, X PLUS 0xffff = X
	X PLUS (0xffff ^ X) = 0xffff
	byteswap(X) PLUS byteswap(Y) = byteswap(X PLUS Y)
(hint: if X \in [0..0xffff], byteswap(X) is congruent to 256*X modulo 0xffff)

	If A0,...,An are all in range 0..0xffff, \sum Ak * 0x1000^k is
congruent to A0 PLUS A1 PLUS ... PLUS An modulo 0xffff.  That's pretty
much the same thing as the usual rule for checking if decimal number
is divisible by 9.

	That's the math behind the checksum calculations.  You look at the
data, append 0 if the length had been odd and sum the 16bit values up
using PLUS as addition.  Result will be a 16bit value that will be
	* congruent to that data taken as long integer modulo 0xffff
	* 0 if and only if the data consists entirely of zeroes.
Endianness does not matter - byteswap the entire thing and result will
be byteswapped.

	Note that since 0xffffffff is a multiple of 0xffff, we can
calculate the remainder modulo 0xffffffff (by doing similar addition
of 4-byte groups), then calculate the remainder of that modulo 0xffff;
that's almost always cheaper - N/4 32bit operations vs N/2 16bit ones.
For 64bit architecture we can do the same with 64bit operations, reducing
to 32bit value in the end.  That's what csum_and_copy_...() stuff
is doing - it's memcpy() (or copy_..._user()) combined with calculation
of some 32bit value that would have the right reminder modulo 0xffff.

	Requirement for ICMP is that this function of the payload should
be 0xffff.  So we zero the "checksum" field, calculate the sum and then
adjust that field so that the sum would become 0xffff.  I.e. if the
value of the function with that field zeroed is N, we want to store
0xffff ^ N into the damn field.

	That's where it had hit the fan - getting 0xffff instead of 0
on the "all zeroes" data ended up with "OK, store the 0xffff ^ 0xffff
in there, everything will be fine".  Which yields the payload with
actual sum being 0.

	Special treatment of icmp_reply() is doable, but nobody is
suggesting to check for "all zeroes" case - that would be insane and
pointless.  The minimal workaround papering over that particular case
would be to chech WTF have we stored in that field and always
replacing 0 with 0xffff.  Note that if our data is e.g.
	<40 zeroes> 0x7f 0xff 0x80 0x00
the old rule would (correctly) suggest storing two zeroes into the
checksum field, while that modification would yield two 0xff in
there.  Which is still fine - 0xffff PLUS 0x7fff PLUS 0x8000 =
0 PLUS 0x7fff PLUS 0x8000 = 0xffff, so both variants are acceptable.

	However, I'm not at all sure that icmp_reply() is really
the only place where we can get all-zero data possible to encounter.
Most of the places where want to calculate checksums are not going
to see all-zeroes data, but it would be a lot better not to have to
rely upon that.