On Mon, Dec 5, 2022 at 5:30 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote: > > Patch title seems > > On Mon, Dec 5, 2022 at 4:36 PM Justin Iurman <justin.iurman@xxxxxxxxx> wrote: > > > > This patch fixes a NULL qdisc pointer when retrieving the TX queue depth > > for IOAM. > > > > IMPORTANT: I suspect this fix is local only and the bug goes deeper (see > > reasoning below). > > > > Kernel panic: > > [...] > > RIP: 0010:ioam6_fill_trace_data+0x54f/0x5b0 > > [...] > > > > ...which basically points to the call to qdisc_qstats_qlen_backlog > > inside net/ipv6/ioam6.c. > > > > From there, I directly thought of a NULL pointer (queue->qdisc). To make > > sure, I added some printk's to know exactly *why* and *when* it happens. > > Here is the (summarized by queue) output: > > > > skb for TX queue 1, qdisc is ffff8b375eee9800, qdisc_sleeping is ffff8b375eee9800 > > skb for TX queue 2, qdisc is ffff8b375eeefc00, qdisc_sleeping is ffff8b375eeefc00 > > skb for TX queue 3, qdisc is ffff8b375eeef800, qdisc_sleeping is ffff8b375eeef800 > > skb for TX queue 4, qdisc is ffff8b375eeec800, qdisc_sleeping is ffff8b375eeec800 > > skb for TX queue 5, qdisc is ffff8b375eeea400, qdisc_sleeping is ffff8b375eeea400 > > skb for TX queue 6, qdisc is ffff8b375eeee000, qdisc_sleeping is ffff8b375eeee000 > > skb for TX queue 7, qdisc is ffff8b375eee8800, qdisc_sleeping is ffff8b375eee8800 > > skb for TX queue 8, qdisc is ffff8b375eeedc00, qdisc_sleeping is ffff8b375eeedc00 > > skb for TX queue 9, qdisc is ffff8b375eee9400, qdisc_sleeping is ffff8b375eee9400 > > skb for TX queue 10, qdisc is ffff8b375eee8000, qdisc_sleeping is ffff8b375eee8000 > > skb for TX queue 11, qdisc is ffff8b375eeed400, qdisc_sleeping is ffff8b375eeed400 > > skb for TX queue 12, qdisc is ffff8b375eeea800, qdisc_sleeping is ffff8b375eeea800 > > skb for TX queue 13, qdisc is ffff8b375eee8c00, qdisc_sleeping is ffff8b375eee8c00 > > skb for TX queue 14, qdisc is ffff8b375eeea000, qdisc_sleeping is ffff8b375eeea000 > > skb for TX queue 15, qdisc is ffff8b375eeeb800, qdisc_sleeping is ffff8b375eeeb800 > > skb for TX queue 16, qdisc is NULL, qdisc_sleeping is NULL > > > > What the hell? So, not sure why queue #16 would *never* have a qdisc > > attached. Is it something expected I'm not aware of? As an FYI, here is > > the output of "tc qdisc list dev xxx": > > > > qdisc mq 0: root > > qdisc fq_codel 0: parent :10 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :f limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :e limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :d limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :c limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :b limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :a limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :9 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :8 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :7 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :6 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :5 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 > > > > By the way, the NIC is an Intel XL710 40GbE QSFP+ (i40e driver, firmware > > version 8.50 0x8000b6c7 1.3082.0) and it was tested on latest "net" > > version (6.1.0-rc7+). Is this a bug in the i40e driver? > > > > > Cc: stable@xxxxxxxxxxxxxxx > > Patch title is mangled. The Fixes: tag should appear here, not in the title. > > > Fixes: b63c5478e9cb ("ipv6: ioam: Support for Queue depth data field") > > > Signed-off-by: Justin Iurman <justin.iurman@xxxxxxxxx> > > --- > > net/ipv6/ioam6.c | 11 +++++++---- > > 1 file changed, 7 insertions(+), 4 deletions(-) > > > > diff --git a/net/ipv6/ioam6.c b/net/ipv6/ioam6.c > > index 571f0e4d9cf3..2472a8a043c4 100644 > > --- a/net/ipv6/ioam6.c > > +++ b/net/ipv6/ioam6.c > > @@ -727,10 +727,13 @@ static void __ioam6_fill_trace_data(struct sk_buff *skb, > > *(__be32 *)data = cpu_to_be32(IOAM6_U32_UNAVAILABLE); > > } else { > > queue = skb_get_tx_queue(skb_dst(skb)->dev, skb); > > Are you sure skb_dst(skb)->dev is correct at this stage, what about > stacked devices ? > > > - qdisc = rcu_dereference(queue->qdisc); > > - qdisc_qstats_qlen_backlog(qdisc, &qlen, &backlog); > > - > > - *(__be32 *)data = cpu_to_be32(backlog); > > + if (!queue->qdisc) { > > This is racy. > > > + *(__be32 *)data = cpu_to_be32(IOAM6_U32_UNAVAILABLE); > > + } else { > > + qdisc = rcu_dereference(queue->qdisc); > > + qdisc_qstats_qlen_backlog(qdisc, &qlen, &backlog); > > + *(__be32 *)data = cpu_to_be32(backlog); > > + } > > } > > data += sizeof(__be32); > > } > > -- > > 2.25.1 > > > > Quite frankly I suggest to revert b63c5478e9cb completely. > > The notion of Queue depth can not be properly gathered in Linux with a > multi queue model, > so why trying to get a wrong value ? Additional reason for a revert is that qdisc_qstats_qlen_backlog() is reserved for net/sched code, I think it needs the qdisc lock to be held.