On 06/08/2024 11:09, Sagi Grimberg wrote:
On 06/08/2024 7:43, Tariq Toukan wrote:
On 05/08/2024 14:43, Sagi Grimberg wrote:
On 05/08/2024 13:40, Tariq Toukan wrote:
Hi,
A recent patch [1] to 'fs' broke the TX TLS device-offloaded flow
starting from v6.11-rc1.
The kernel crashes. Different runs result in different kernel traces.
See below [2].
All of them disappear once patch [1] is reverted.
The issue appears only with "sendfile on and zerocopy on".
We couldn't repro with "sendfile off", or with "sendfile on and
zerocopy off".
The repro test is as simple as a repeated client/server
communication (wrk/nginx), with sendfile on and zc on, and with
"tls-hw-tx-offload: on".
$ for i in `seq 10`; do wrk -b::2:2:2:3 -t10 -c100 -d15 --timeout 5s \
    https://[::2:2:2:2]:20448/16000b.img; done
We can provide more details if needed, to help with the analysis and
debugging.
Does tls sw (i.e. no offload) also break?
No, it doesn't.
Only the "sendfile with ZC" flow of the TX device-offloaded TLS.
Adding Maxim Mikityanskiy, he might have some insights.
I'm not familiar with the TLS offload code. Does it make any assumptions
about PAGE_SIZE contiguous buffers, or about individual page
references/lifetime?
The sporadic panics you reported look like a result of memory corruption
or use-after-free conditions.