The checksum_32 code was originally written to only handle 2-byte aligned buffers, but was later extended to support arbitrary alignment. However, the non-PPro variant doesn't apply the carry before jumping to the 2- or 4-byte aligned versions, which clear CF. This causes the new checksum_kunit test to fail, as it runs with a large number of different possible alignments and both with and without carries. For example: ./tools/testing/kunit/kunit.py run --arch i386 --kconfig_add CONFIG_M486=y checksum Gives: KTAP version 1 # Subtest: checksum 1..3 ok 1 test_csum_fixed_random_inputs # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:267 Expected result == expec, but result == 65281 (0xff01) expec == 65280 (0xff00) not ok 2 test_csum_all_carry_inputs # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:314 Expected result == expec, but result == 65535 (0xffff) expec == 65534 (0xfffe) not ok 3 test_csum_no_carry_inputs With this patch, it passes. KTAP version 1 # Subtest: checksum 1..3 ok 1 test_csum_fixed_random_inputs ok 2 test_csum_all_carry_inputs ok 3 test_csum_no_carry_inputs I also tested it on a real 486DX2, with the same results. Signed-off-by: David Gow <davidgow@xxxxxxxxxx> --- This is a follow-up to the UML patch to use the common 32-bit x86 checksum implementations: https://lore.kernel.org/linux-um/20230704083022.692368-2-davidgow@xxxxxxxxxx/ --- arch/x86/lib/checksum_32.S | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S index 23318c338db0..128287cea42d 100644 --- a/arch/x86/lib/checksum_32.S +++ b/arch/x86/lib/checksum_32.S @@ -62,6 +62,7 @@ SYM_FUNC_START(csum_partial) jl 8f movzbl (%esi), %ebx adcl %ebx, %eax + adcl $0, %eax roll $8, %eax inc %esi testl $2, %esi -- 2.41.0.255.g8b1d071c50-goog