Re: [RFC net] Fixes: b63c5478e9cb ("ipv6: ioam: Support for Queue depth data field")

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/5/22 17:50, Eric Dumazet wrote:
On Mon, Dec 5, 2022 at 5:30 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:

Patch title seems

On Mon, Dec 5, 2022 at 4:36 PM Justin Iurman <justin.iurman@xxxxxxxxx> wrote:

This patch fixes a NULL qdisc pointer when retrieving the TX queue depth
for IOAM.

IMPORTANT: I suspect this fix is local only and the bug goes deeper (see
reasoning below).

Kernel panic:
[...]
RIP: 0010:ioam6_fill_trace_data+0x54f/0x5b0
[...]

...which basically points to the call to qdisc_qstats_qlen_backlog
inside net/ipv6/ioam6.c.

 From there, I directly thought of a NULL pointer (queue->qdisc). To make
sure, I added some printk's to know exactly *why* and *when* it happens.
Here is the (summarized by queue) output:

skb for TX queue 1, qdisc is ffff8b375eee9800, qdisc_sleeping is ffff8b375eee9800
skb for TX queue 2, qdisc is ffff8b375eeefc00, qdisc_sleeping is ffff8b375eeefc00
skb for TX queue 3, qdisc is ffff8b375eeef800, qdisc_sleeping is ffff8b375eeef800
skb for TX queue 4, qdisc is ffff8b375eeec800, qdisc_sleeping is ffff8b375eeec800
skb for TX queue 5, qdisc is ffff8b375eeea400, qdisc_sleeping is ffff8b375eeea400
skb for TX queue 6, qdisc is ffff8b375eeee000, qdisc_sleeping is ffff8b375eeee000
skb for TX queue 7, qdisc is ffff8b375eee8800, qdisc_sleeping is ffff8b375eee8800
skb for TX queue 8, qdisc is ffff8b375eeedc00, qdisc_sleeping is ffff8b375eeedc00
skb for TX queue 9, qdisc is ffff8b375eee9400, qdisc_sleeping is ffff8b375eee9400
skb for TX queue 10, qdisc is ffff8b375eee8000, qdisc_sleeping is ffff8b375eee8000
skb for TX queue 11, qdisc is ffff8b375eeed400, qdisc_sleeping is ffff8b375eeed400
skb for TX queue 12, qdisc is ffff8b375eeea800, qdisc_sleeping is ffff8b375eeea800
skb for TX queue 13, qdisc is ffff8b375eee8c00, qdisc_sleeping is ffff8b375eee8c00
skb for TX queue 14, qdisc is ffff8b375eeea000, qdisc_sleeping is ffff8b375eeea000
skb for TX queue 15, qdisc is ffff8b375eeeb800, qdisc_sleeping is ffff8b375eeeb800
skb for TX queue 16, qdisc is NULL, qdisc_sleeping is NULL

What the hell? So, not sure why queue #16 would *never* have a qdisc
attached. Is it something expected I'm not aware of? As an FYI, here is
the output of "tc qdisc list dev xxx":

qdisc mq 0: root
qdisc fq_codel 0: parent :10 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :f limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :e limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :d limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :c limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :b limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :a limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :9 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :8 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :7 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :6 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :5 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64

By the way, the NIC is an Intel XL710 40GbE QSFP+ (i40e driver, firmware
version 8.50 0x8000b6c7 1.3082.0) and it was tested on latest "net"
version (6.1.0-rc7+). Is this a bug in the i40e driver?


Cc: stable@xxxxxxxxxxxxxxx

Patch title is mangled. The Fixes: tag should appear here, not in the title.


Fixes: b63c5478e9cb ("ipv6: ioam: Support for Queue depth data field")

Signed-off-by: Justin Iurman <justin.iurman@xxxxxxxxx>
---
  net/ipv6/ioam6.c | 11 +++++++----
  1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/ioam6.c b/net/ipv6/ioam6.c
index 571f0e4d9cf3..2472a8a043c4 100644
--- a/net/ipv6/ioam6.c
+++ b/net/ipv6/ioam6.c
@@ -727,10 +727,13 @@ static void __ioam6_fill_trace_data(struct sk_buff *skb,
                         *(__be32 *)data = cpu_to_be32(IOAM6_U32_UNAVAILABLE);
                 } else {
                         queue = skb_get_tx_queue(skb_dst(skb)->dev, skb);

Are you sure skb_dst(skb)->dev is correct at this stage, what about
stacked devices ?

-                       qdisc = rcu_dereference(queue->qdisc);
-                       qdisc_qstats_qlen_backlog(qdisc, &qlen, &backlog);
-
-                       *(__be32 *)data = cpu_to_be32(backlog);
+                       if (!queue->qdisc) {

This is racy.

+                               *(__be32 *)data = cpu_to_be32(IOAM6_U32_UNAVAILABLE);
+                       } else {
+                               qdisc = rcu_dereference(queue->qdisc);
+                               qdisc_qstats_qlen_backlog(qdisc, &qlen, &backlog);
+                               *(__be32 *)data = cpu_to_be32(backlog);
+                       }
                 }
                 data += sizeof(__be32);
         }
--
2.25.1


Quite frankly I suggest to revert b63c5478e9cb completely.

The notion of Queue depth can not be properly gathered in Linux with a
multi queue model,
so why trying to get a wrong value ?

Additional reason for a revert is that qdisc_qstats_qlen_backlog() is
reserved for net/sched

If by "reserved" you mean "only used by at the moment", then yes (with the only exception being IOAM). But some other functions are defined as well, and some are used elsewhere than the "net/sched" context. So I don't think it's really an issue to use this function "from somewhere else".

code, I think it needs the qdisc lock to be held.

Good point. But is it really needed when called with rcu_read_lock?



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux