Re: [Bug report] NFS patch breaks TLS device-offloaded TX zerocopy

On 11/08/2024 21:33, Sagi Grimberg wrote:



On 11/08/2024 14:21, Tariq Toukan wrote:


On 06/08/2024 13:07, Tariq Toukan wrote:


On 06/08/2024 11:09, Sagi Grimberg wrote:



On 06/08/2024 7:43, Tariq Toukan wrote:


On 05/08/2024 14:43, Sagi Grimberg wrote:



On 05/08/2024 13:40, Tariq Toukan wrote:
Hi,

A recent patch [1] to 'fs' broke the TX TLS device-offloaded flow starting from v6.11-rc1.

The kernel crashes. Different runs result in different kernel traces.
See below [2].
All of them disappear once patch [1] is reverted.

The issue appears only with "sendfile on and zerocopy on".
We couldn't repro with "sendfile off", or with "sendfile on and zerocopy off".

The repro test is as simple as a repeated client/server communication (wrk/nginx), with sendfile on and zc on, and with "tls-hw-tx-offload: on".

$ for i in `seq 10`; do wrk -b::2:2:2:3 -t10 -c100 -d15 --timeout 5s https://[::2:2:2:2]:20448/16000b.img; done

We can provide more details if needed, to help with the analysis and debug.

Does tls sw (i.e. no offload) also break?


No it doesn't.
Only the "sendfile with ZC" flow of the TX device-offloaded TLS.


Adding Maxim Mikityanskiy, he might have some insights.

Not familiar with the TLS offload code, are there any assumptions on PAGE_SIZE contig buffers? Or assumptions on individual page references/lifetime?

The sporadic panics you reported look like a result of memory corruption or use-after-free conditions.

You can find the original patch that implements it here:
c1318b39c7d3 tls: Add opt-in zerocopy mode of sendfile()

In this flow (sendfile + ZC), the page is shared between the kernel and userspace, and the extra copy is skipped.
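
For readers unfamiliar with the opt-in, here is a minimal sketch of how an application might request this mode on a kTLS socket. The helper name tls_enable_tx_zerocopy() is hypothetical. The sketch assumes the socket is already connected, has the "tls" ULP attached, and has TX crypto state installed. The fallback constants are assumed to match the kernel UAPI values in linux/tls.h, which is not included directly so that the snippet builds against older userspace headers.

```c
#include <errno.h>
#include <sys/socket.h>

/* Fallbacks for older userspace headers (assumed to match kernel UAPI). */
#ifndef SOL_TLS
#define SOL_TLS 282
#endif
#ifndef TLS_TX_ZEROCOPY_RO
#define TLS_TX_ZEROCOPY_RO 3
#endif

/*
 * Hypothetical helper: opt a kTLS socket in to zerocopy sendfile().
 * Assumes fd is a connected TCP socket with the "tls" ULP and TLS_TX
 * state already configured. Returns 0 on success or a negative errno.
 */
int tls_enable_tx_zerocopy(int fd)
{
    unsigned int on = 1;

    if (setsockopt(fd, SOL_TLS, TLS_TX_ZEROCOPY_RO, &on, sizeof(on)) < 0)
        return -errno;
    return 0;
}
```

After a successful call, sendfile() on that socket is expected to hand page-cache pages to the device without the intermediate copy, so the file contents should not be modified while a transfer is in flight.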

There were a few code changes in this area since the feature was introduced. Adding relevant people, including David Howells <dhowells@xxxxxxxxxx>, who removed the sendpage() routine and added MSG_SPLICE_PAGES support to tls_device.

Tariq,

Can you explain where in your test NFS is used? Does the nginx server run on an NFS mount?

I checked with the team.
The requested file, as well as the wrk and nginx apps, all reside on an NFS mount.



