Re: Kernel oops with 6.14 when enabling TLS

On 3/4/25 11:26, Vlastimil Babka wrote:
On 3/4/25 11:20, Hannes Reinecke wrote:

[ .. ]
So I'd be happy with an 'easy' fix for now. Obviously :-)


With this patch:

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 65f550cb5081..b035a9928cdd 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1190,8 +1190,14 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i,
                if (!n)
                        return -ENOMEM;
                p = *pages;
-               for (int k = 0; k < n; k++)
-                       get_page(p[k] = page + k);
+               for (int k = 0; k < n; k++) {
+                       if (!get_page_unless_zero(p[k] = page + k)) {
+                               pr_warn("%s: frozen page %d of %d\n",
+                                       __func__, k, n);
+                               return -ENOMEM;
+                       }
+               }
+
                maxsize = min_t(size_t, maxsize, n * PAGE_SIZE - *start);
                i->count -= maxsize;
                i->iov_offset += maxsize;


the system doesn't crash anymore:
[   51.520949] __iov_iter_get_pages_alloc: frozen page 0 of 1
[   51.536393] nvme nvme0: creating 4 I/O queues.
[   51.968897] nvme nvme0: mapped 4/0/0 default/read/poll queues.
[   51.972207] __iov_iter_get_pages_alloc: frozen page 0 of 1
[   51.974528] __iov_iter_get_pages_alloc: frozen page 0 of 1
[   51.976928] __iov_iter_get_pages_alloc: frozen page 0 of 1
[   51.978980] __iov_iter_get_pages_alloc: frozen page 0 of 1
[   51.981236] nvme nvme0: new ctrl: NQN "nqn.blktests-subsystem-1", addr 10.161.9.19:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:027a49dc-b554-40e5-b0f9-0a9ea03ec30c

and the allocation in question is coming from
drivers/nvme/host/fabrics.c:nvmf_connect_data_prep(), which
coincidentally _is_ a kmalloc()ed buffer.
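
To make the failure mode concrete, here is a minimal userspace sketch of the check the patch adds. It assumes (as the "frozen page" warning suggests) that the page backing the kmalloc()ed buffer has a reference count of zero. `struct model_page`, `model_get_page_unless_zero()` and `model_get_pages()` are hypothetical stand-ins written for illustration with C11 atomics; they are not the kernel's implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the refcount is modeled. */
struct model_page {
	atomic_int refcount;
};

/* Take a reference only if the count is currently non-zero. */
static bool model_get_page_unless_zero(struct model_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure, 'old' is updated to the current value. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;	/* frozen: refcount was zero */
}

/*
 * Mirrors the loop in the patch: returns the index of the first
 * frozen page, or -1 if a reference was taken on all n pages.
 */
static int model_get_pages(struct model_page *pages, int n)
{
	for (int k = 0; k < n; k++) {
		if (!model_get_page_unless_zero(&pages[k]))
			return k;
	}
	return -1;
}
```

With one page at refcount 1 and the next at refcount 0, `model_get_pages()` stops at index 1 and leaves the zero count untouched, which corresponds to the "frozen page 0 of 1" warnings for a single-page buffer. Note that, like the patch, the sketch does not drop references already taken before the failing page.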

But TLS doesn't work either:

[   58.886754] nvme nvme0: I/O tag 1 (3001) type 4 opcode 0x18 (Keep Alive) QID 0 timeout
[   58.889112] nvme nvme0: starting error recovery
[   58.892176] nvme nvme0: failed nvme_keep_alive_end_io error=10
[   58.892282] nvme nvme0: reading non-mdts-limits failed: -4
[   58.902490] nvme nvme0: Reconnecting in 10 seconds...

(probably not surprising, seeing that the patch returns -ENOMEM instead
of taking the page reference)

So yeah, looks like TLS has issues with kmalloc()ed data.

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@xxxxxxx                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich



