RE: [PATCH v4 05/11] arm64: csum: Disable KASAN for do_csum()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Robin Murphy
> Sent: 22 April 2020 12:02
..
> Sure - I have a nagging feeling that it could still do better WRT
> pipelining the loads anyway, so I'm happy to come back and reconsider
> the local codegen later. It certainly doesn't deserve to stand in the
> way of cross-arch rework.

How fast does that loop actually run?
To my mind it seems to do a lot of operations on each 64bit value.
I'd have thought that a loop based on:
	sum64 = *ptr;
	sum64_high = *ptr++ >> 32;
and then fixing up the result would be faster.

The x86-64 code is also bad!
On intel cpu prior to haswell a simple:
	sum_64 += *ptr32++;
is faster than the current code.
(Although you can do a lot better even on ivy bridge.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)




[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux