Re: [net-next v3 2/2] igc: Link queues to NAPI instances

Joe Damato <jdamato@xxxxxxxxxx> · Tue, 22 Oct 2024 13:28:22 -0700

On Fri, Oct 18, 2024 at 05:13:43PM +0000, Joe Damato wrote:
> Link queues to NAPI instances via netdev-genl API so that users can
> query this information with netlink. Handle a few cases in the driver:
>   1. Link/unlink the NAPIs when XDP is enabled/disabled
>   2. Handle IGC_FLAG_QUEUE_PAIRS enabled and disabled
> 
> Example output when IGC_FLAG_QUEUE_PAIRS is enabled:
> 
> $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>                          --dump queue-get --json='{"ifindex": 2}'
> 
> [{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
>  {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'},
>  {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'},
>  {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'tx'},
>  {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
>  {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]
> 
> Since IGC_FLAG_QUEUE_PAIRS is enabled, you'll note that the same NAPI ID
> is present for both rx and tx queues at the same index, for example
> index 0:
> 
> {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
> {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
> 
> To test IGC_FLAG_QUEUE_PAIRS disabled, a test system was booted using
> the grub command line option "maxcpus=2" to force
> igc_set_interrupt_capability to disable IGC_FLAG_QUEUE_PAIRS.
> 
> Example output when IGC_FLAG_QUEUE_PAIRS is disabled:
> 
> $ lscpu | grep "On-line CPU"
> On-line CPU(s) list:      0,2
> 
> $ ethtool -l enp86s0  | tail -5
> Current hardware settings:
> RX:		n/a
> TX:		n/a
> Other:		1
> Combined:	2
> 
> $ cat /proc/interrupts  | grep enp
>  144: [...] enp86s0
>  145: [...] enp86s0-rx-0
>  146: [...] enp86s0-rx-1
>  147: [...] enp86s0-tx-0
>  148: [...] enp86s0-tx-1
> 
> 1 "other" IRQ, and 2 IRQs for each of RX and Tx, so we expect netlink to
> report 4 IRQs with unique NAPI IDs:
> 
> $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>                          --dump napi-get --json='{"ifindex": 2}'
> [{'id': 8196, 'ifindex': 2, 'irq': 148},
>  {'id': 8195, 'ifindex': 2, 'irq': 147},
>  {'id': 8194, 'ifindex': 2, 'irq': 146},
>  {'id': 8193, 'ifindex': 2, 'irq': 145}]
> 
> Now we examine which queues these NAPIs are associated with, expecting
> that since IGC_FLAG_QUEUE_PAIRS is disabled each RX and TX queue will
> have its own NAPI instance:
> 
> $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>                          --dump queue-get --json='{"ifindex": 2}'
> [{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
>  {'id': 0, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]
> 
> Signed-off-by: Joe Damato <jdamato@xxxxxxxxxx>
> ---
>  v3:
>    - Replace igc_unset_queue_napi with igc_set_queue_napi(adapater, i,
>      NULL), as suggested by Vinicius Costa Gomes
>    - Simplify implemention of igc_set_queue_napi as suggested by Kurt
>      Kanzenbach, with a tweak to use ring->queue_index
> 
>  v2:
>    - Update commit message to include tests for IGC_FLAG_QUEUE_PAIRS
>      disabled
>    - Refactored code to move napi queue mapping and unmapping to helper
>      functions igc_set_queue_napi and igc_unset_queue_napi
>    - Adjust the code to handle IGC_FLAG_QUEUE_PAIRS disabled
>    - Call helpers to map/unmap queues to NAPIs in igc_up, __igc_open,
>      igc_xdp_enable_pool, and igc_xdp_disable_pool
> 
>  drivers/net/ethernet/intel/igc/igc.h      |  2 ++
>  drivers/net/ethernet/intel/igc/igc_main.c | 33 ++++++++++++++++++++---
>  drivers/net/ethernet/intel/igc/igc_xdp.c  |  2 ++
>  3 files changed, 33 insertions(+), 4 deletions(-)

I took another look at this to make sure that RTNL is held when
igc_set_queue_napi is called after the e1000 bug report came in [1],
and there may be two locations I've missed:

1. igc_resume, which calls __igc_open
2. igc_io_error_detected, which calls igc_down

In both cases, I think the code can be modified to hold rtnl around
calls to __igc_open and igc_down.

Let me know what you think ?

If you agree that I should hold rtnl in both of those cases, what is
the best way to proceed:
  - send a v4, or
  - wait for this to get merged (since I got the notification it was
    pulled into intel-next) and send a fixes ?

Here's the full analysis I came up with; I tried to be thorough, but
it is certainly possible I missed a call site:

For the up case:

- igc_up:
  - called from igc_reinit_locked, which is called via:
    - igc_reset_task (rtnl is held)
    - igc_set_features (ndo_set_features, which itself has an ASSERT_RTNL)
    - various places in igc_ethtool (set_priv_flags, nway_reset,
      ethtool_set_eee) all of which have RTNL held
  - igc_change_mtu which also has RTNL held
- __igc_open
  - called from igc_resume, which may need an rtnl_lock ?
  - igc_open
    - called from igc_io_resume, rtnl is held
    - called from igc_reinit_queues, only via ethool set_channels,
      where rtnl is held
    - ndo_open where rtnl is held

For the down case:

- igc_down:
  - called from various ethtool locations (set_ringparam,
    set_pauseparam, set_link_ksettings) all of which hold rtnl
  - called from igc_io_error_detected, which may need an rtnl_lock
  - igc_reinit_locked which is fine, as described above
  - igc_change_mtu which is fine, as described above
  - called from __igc_close
    - called from __igc_shutdown which holds rtnl
    - called from igc_reinit_queues which is fine as described above
    - called from igc_close which is ndo_close