On 09/18/2018 11:32 AM, Vakul Garg wrote: >> -----Original Message----- >> From: Daniel Borkmann <daniel@xxxxxxxxxxxxx> >> Sent: Tuesday, September 18, 2018 2:57 PM >> To: Vakul Garg <vakul.garg@xxxxxxx>; Stephen Rothwell >> <sfr@xxxxxxxxxxxxxxxx>; David Miller <davem@xxxxxxxxxxxxx>; >> Networking <netdev@xxxxxxxxxxxxxxx> >> Cc: Linux-Next Mailing List <linux-next@xxxxxxxxxxxxxxx>; Linux Kernel >> Mailing List <linux-kernel@xxxxxxxxxxxxxxx> >> Subject: Re: linux-next: manual merge of the net-next tree with the net tree >> >> On 09/18/2018 11:10 AM, Vakul Garg wrote: >>>> -----Original Message----- >>>> From: Daniel Borkmann <daniel@xxxxxxxxxxxxx> >>>> Sent: Tuesday, September 18, 2018 2:14 PM >>>> To: Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx>; David Miller >>>> <davem@xxxxxxxxxxxxx>; Networking <netdev@xxxxxxxxxxxxxxx> >>>> Cc: Linux-Next Mailing List <linux-next@xxxxxxxxxxxxxxx>; Linux >>>> Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>; Vakul Garg >>>> <vakul.garg@xxxxxxx> >>>> Subject: Re: linux-next: manual merge of the net-next tree with the >>>> net tree >>>> >>>> On 09/18/2018 02:11 AM, Stephen Rothwell wrote: >>>>> Hi all, >>>>> >>>>> Today's linux-next merge of the net-next tree got a conflict in: >>>>> >>>>> tools/testing/selftests/net/tls.c >>>>> >>>>> between commit: >>>>> >>>>> 50c6b58a814d ("tls: fix currently broken MSG_PEEK behavior") >>>>> >>>>> from the net tree and commit: >>>>> >>>>> c2ad647c6442 ("selftests/tls: Add test for recv(PEEK) spanning >>>>> across multiple records") >>>>> >>>>> from the net-next tree. >>>>> >>>>> I fixed it up (see below) and can carry the fix as necessary. This >>>>> is now fixed as far as linux-next is concerned, but any non trivial >>>>> conflicts should be mentioned to your upstream maintainer when your >>>>> tree is submitted for merging. You may also want to consider >>>>> cooperating with the maintainer of the conflicting tree to minimise >>>>> any particularly complex conflicts. >>>> >>>> The test from 50c6b58a814d supersedes the one from c2ad647c6442 so >>>> the recv_peek_large_buf_mult_recs could be removed; latter was also >>>> not working correctly due to this bug. >>> >>> Why remove recv_peek_large_buf_mult_recs if its correct? >>> Why not the newly added one which achieves the same thing? >> >> Hmm, not quite, on net-next kernel, the recv_peek_large_buf_mult_recs fails >> every time I invoke the tls test suite: >> >> # ./tls >> [==========] Running 28 tests from 2 test cases. >> [ RUN ] tls.sendfile >> [ OK ] tls.sendfile >> [ RUN ] tls.send_then_sendfile >> [ OK ] tls.send_then_sendfile >> [ RUN ] tls.recv_max >> [ OK ] tls.recv_max >> [ RUN ] tls.recv_small >> [ OK ] tls.recv_small >> [ RUN ] tls.msg_more >> [ OK ] tls.msg_more >> [ RUN ] tls.sendmsg_single >> [ OK ] tls.sendmsg_single >> [ RUN ] tls.sendmsg_large >> [ OK ] tls.sendmsg_large >> [ RUN ] tls.sendmsg_multiple >> [ OK ] tls.sendmsg_multiple >> [ RUN ] tls.sendmsg_multiple_stress >> [ OK ] tls.sendmsg_multiple_stress >> [ RUN ] tls.splice_from_pipe >> [ OK ] tls.splice_from_pipe >> [ RUN ] tls.splice_from_pipe2 >> [ OK ] tls.splice_from_pipe2 >> [ RUN ] tls.send_and_splice >> [ OK ] tls.send_and_splice >> [ RUN ] tls.splice_to_pipe >> [ OK ] tls.splice_to_pipe >> [ RUN ] tls.recvmsg_single >> [ OK ] tls.recvmsg_single >> [ RUN ] tls.recvmsg_single_max >> [ OK ] tls.recvmsg_single_max >> [ RUN ] tls.recvmsg_multiple >> [ OK ] tls.recvmsg_multiple >> [ RUN ] tls.single_send_multiple_recv >> [ OK ] tls.single_send_multiple_recv >> [ RUN ] tls.multiple_send_single_recv >> [ OK ] tls.multiple_send_single_recv >> [ RUN ] tls.recv_partial >> [ OK ] tls.recv_partial >> [ RUN ] tls.recv_nonblock >> [ OK ] tls.recv_nonblock >> [ RUN ] tls.recv_peek >> [ OK ] tls.recv_peek >> [ RUN ] tls.recv_peek_multiple >> [ OK ] tls.recv_peek_multiple >> [ RUN ] tls.recv_peek_large_buf_mult_recs >> tls.c:524:tls.recv_peek_large_buf_mult_recs:Expected memcmp(test_str, >> buf, len) (18446744073709551595) == 0 (0) >> tls.recv_peek_large_buf_mult_recs: Test failed at step #8 >> [ FAIL ] tls.recv_peek_large_buf_mult_recs >> [ RUN ] tls.pollin >> [ OK ] tls.pollin >> [ RUN ] tls.poll_wait >> [ OK ] tls.poll_wait >> [ RUN ] tls.blocking >> [ OK ] tls.blocking >> [ RUN ] tls.nonblocking >> [ OK ] tls.nonblocking >> [ RUN ] tls.control_msg >> [ OK ] tls.control_msg >> [==========] 27 / 28 tests passed. >> [ FAILED ] >> >> Here's what the recvfrom() with MSG_PEEK sees: >> >> [pid 2602] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3 [pid 2602] >> socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4 [pid 2602] bind(4, >> {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = >> 0 >> [pid 2602] listen(4, 10) = 0 >> [pid 2602] getsockname(4, {sa_family=AF_INET, sin_port=htons(41483), >> sin_addr=inet_addr("0.0.0.0")}, [16]) = 0 [pid 2602] connect(3, >> {sa_family=AF_INET, sin_port=htons(41483), sin_addr=inet_addr("0.0.0.0")}, >> 16) = 0 [pid 2602] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) >> = 0 [pid 2602] setsockopt(3, 0x11a /* SOL_?? */, 1, >> "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >> 40) = 0 [pid 2602] accept(4, {sa_family=AF_INET, sin_port=htons(46290), >> sin_addr=inet_addr("127.0.0.1")}, [16]) = 5 [pid 2602] setsockopt(5, >> SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0 [pid 2602] setsockopt(5, >> 0x11a /* SOL_?? */, 2, >> "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., >> 40) = 0 >> [pid 2602] close(4) = 0 >> [pid 2602] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14 [pid 2602] >> sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11 [pid 2602] recvfrom(5, >> "test_read_peektest_read_peektest"..., 64, MSG_PEEK, NULL, NULL) = 64 >> [pid 2602] write(2, "tls.c:526:tls.recv_peek_large_bu"..., >> 112tls.c:526:tls.recv_peek_large_buf_mult_recs:Expected memcmp(test_str, >> buf, len) (18446744073709551595) == 0 (0) >> ) = 112 >> [pid 2602] close(3) = 0 >> [pid 2602] close(5) = 0 >> [pid 2602] exit_group(8) = ? >> >> Reason for the "test_read_peektest_read_peektest[...]" is because >> MSG_PEEK cannot call tls_sw_advance_skb(), since the skb is sitting there >> that needs to be consumed for non-MSG_PEEK case, and only then we can >> advance it. > > I general, my plan was to modify the tls_sw_recvmsg() to trigger as many > decryption as possible as required by requested user space PEEK size. > This would have required creating a pending list of decrypted records in tls_tx context. Right, had been thinking the same though for a fix in -net it would have been way too intrusive, hence the 50c6b58a814d ("tls: fix currently broken MSG_PEEK behavior") to avoid looping the same record which is clearly a bug. Wondering if DaveW's original rationale was to avoid accumulating too many records in the kernel since we would need to unpause strparser and keep processing the deeper we peek. >> Could you elaborate on where you ever had this test succeeding? With nxp >> accelerator? > > I never had this test succeeding. I pointed the problem to Dave Watson sometime > back (found during code reading). > > To make sure that this bug does not slip out, I simply submitted a test case to keep > reminding ourselves that we need to fix it sometime. Ok, I think usually tests assert current kernel behavior to make sure any changes coming in don't accidentally break expectations from applications as opposed to future tests that still need fixing, but I guess I'm fine either way how to resolve the conflict; leaving it up to DaveM. Thanks for clarifying! Cheers, Daniel