Re: [PATCH v10] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Le 27/02/2024 à 00:48, Guenter Roeck a écrit :
> On 2/26/24 15:17, Charlie Jenkins wrote:
>> On Mon, Feb 26, 2024 at 10:33:56PM +0000, David Laight wrote:
>>> ...
>>>> I think you misunderstand. "NET_IP_ALIGN offset is what the kernel
>>>> defines to be supported" is a gross misinterpretation. It is not
>>>> "defined to be supported" at all. It is the _preferred_ alignment
>>>> nothing more, nothing less.
>>
>> This distinction is arbitrary in practice, but I am open to being proven
>> wrong if you have data to back up this statement. If the driver chooses
>> to not follow this, then the driver might not work. ARM defines the
>> NET_IP_ALIGN to be 2 to pad out the header to be on the supported
>> alignment. If the driver chooses to pad with one byte instead of 2
>> bytes, the driver may fail to work as the CPU may stall after the
>> misaligned access.
>>
>>>
>>> I'm sure I've seen code that would realign IP headers to a 4 byte
>>> boundary before processing them - but that might not have been in
>>> Linux.
>>>
>>> I'm also sure there are cpu which will fault double length misaligned
>>> memory transfers - which might be used to marginally speed up code.
>>> Assuming more than 4 byte alignment for the IP header is likely
>>> 'wishful thinking'.
>>>
>>> There is plenty of ethernet hardware that can only write frames
>>> to even boundaries and plenty of cpu that fault misaligned accesses.
>>> There are even cases of both on the same silicon die.
>>>
>>> You also pretty much never want a fault handler to fixup misaligned
>>> ethernet frames (or really anything else for that matter).
>>> It is always going to be better to check in the code itself.
>>>
>>> x86 has just made people 'sloppy' :-)
>>>
>>>     David
>>>
>>> -
>>> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, 
>>> MK1 1PT, UK
>>> Registration No: 1397386 (Wales)
>>>
>>
>> If somebody has a solution they deem to be better, I am happy to change
>> this test case. Otherwise, I would appreciate a maintainer resolving
>> this discussion and apply this fix.
>>
> Agreed.
> 
> I do have a couple of patches which add explicit unaligned tests as well as
> corner case tests (which are intended to trigger as many carry overflows
> as possible). Once I get those working reliably, I'll be happy to submit
> them as additional tests.
> 

The functions definitely have to work at least with and without VLAN, 
which means the alignment cannot be greater than 4 bytes. That's also 
the outcome of the discussion.

Therefore, we can easily fix the tests with for instance the following 
changes. For the IPv6 test I switched proto and csum to keep csum 
aligned. (In addition expected values need to be recalculated for the 
IPv6 case).

diff --git a/lib/checksum_kunit.c b/lib/checksum_kunit.c
index bf70850035c7..26b0dbc5b8fd 100644
--- a/lib/checksum_kunit.c
+++ b/lib/checksum_kunit.c
@@ -581,7 +581,7 @@ static void test_ip_fast_csum(struct kunit *test)
  	u16 expected;

  	for (int len = IPv4_MIN_WORDS; len < IPv4_MAX_WORDS; len++) {
-		for (int index = 0; index < NUM_IP_FAST_CSUM_TESTS; index++) {
+		for (int index = 0; index < NUM_IP_FAST_CSUM_TESTS; index += 4) {
  			csum_result = ip_fast_csum(random_buf + index, len);
  			expected =
  				expected_fast_csum[(len - IPv4_MIN_WORDS) *
@@ -603,12 +603,10 @@ static void test_csum_ipv6_magic(struct kunit *test)

  	const int daddr_offset = sizeof(struct in6_addr);
  	const int len_offset = sizeof(struct in6_addr) + sizeof(struct in6_addr);
-	const int proto_offset = sizeof(struct in6_addr) + sizeof(struct 
in6_addr) +
-			     sizeof(int);
-	const int csum_offset = sizeof(struct in6_addr) + sizeof(struct 
in6_addr) +
-			    sizeof(int) + sizeof(char);
+	const int csum_offset = len_offset + sizeof(int);
+	const int proto_offset = csum_offset + sizeof(int);

-	for (int i = 0; i < NUM_IPv6_TESTS; i++) {
+	for (int i = 0; i < NUM_IPv6_TESTS; i += 4) {
  		saddr = (const struct in6_addr *)(random_buf + i);
  		daddr = (const struct in6_addr *)(random_buf + i +
  						  daddr_offset);
---
We could do even better by taking into account 
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS and do +1 when it is selected and 
+4 when it is not selected.

Christophe




[Index of Archives]     [Linux SoC]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux