Re: [PATCH 1/2] virtio-net: fix possible dim status unrecoverable

On 2024/3/25 2:29 PM, Jason Wang wrote:
On Mon, Mar 25, 2024 at 10:11 AM Heng Qi <hengqi@xxxxxxxxxxxxxxxxx> wrote:


On 2024/3/22 1:17 PM, Jason Wang wrote:
On Thu, Mar 21, 2024 at 7:46 PM Heng Qi <hengqi@xxxxxxxxxxxxxxxxx> wrote:
When the dim worker is scheduled, if it fails to acquire the lock,
dim may not be able to return to the working state later.

For example, the following single queue scenario:
    1. The dim worker of rxq0 is scheduled, and the dim status is
       changed to DIM_APPLY_NEW_PROFILE;
    2. The ethtool command is holding rtnl lock;
    3. Since the rtnl lock is already held, virtnet_rx_dim_work fails
       to acquire the lock and exits;

Then, even if net_dim is invoked again, it cannot work because the
state is not restored to DIM_START_MEASURE.

Fixes: 6208799553a8 ("virtio-net: support rx netdim")
Signed-off-by: Heng Qi <hengqi@xxxxxxxxxxxxxxxxx>
---
   drivers/net/virtio_net.c | 4 +++-
   1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index c22d111..0ebe322 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3563,8 +3563,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
        struct dim_cq_moder update_moder;
        int i, qnum, err;

-       if (!rtnl_trylock())
+       if (!rtnl_trylock()) {
+               schedule_work(&dim->work);
                return;
+       }
Patch looks fine but I wonder if a delayed schedule is better.
The work in the net_dim() core layer uses non-delayed work, and the two
cannot be mixed.
Well, I think we first need to figure out whether delayed work is better here.

I tested a VM with 16 NICs and 128 queues per NIC (about 2k queues in total). With dim enabled on all queues, there are many opportunities for contention on the rtnl lock, yet this patch introduces no visible hotspots, and dim performance remains stable. So I don't see a strong motivation for delayed work right now.

Thanks,
Heng


Switching dim to delayed work does not seem hard anyhow.

Thanks

Thanks,
Heng

Thanks

        /* Each rxq's work is queued by "net_dim()->schedule_work()"
         * in response to NAPI traffic changes. Note that dim->profile_ix
--
1.8.3.1
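
The single-queue scenario from the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver or the net_dim() core code: the three state names follow the kernel's dim states, but every model_* name, the work_pending flag, and run_scenario() are hypothetical, and the model assumes that every completed measurement selects a new profile.

```c
/* User-space model of the dim state handling; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

/* State names mirror the kernel's dim states; all else is hypothetical. */
enum dim_state {
    DIM_START_MEASURE,
    DIM_MEASURE_IN_PROGRESS,
    DIM_APPLY_NEW_PROFILE,
};

struct model_dim {
    enum dim_state state;
    bool work_pending;          /* stands in for the queued work item */
};

static bool rtnl_held;          /* true while "ethtool" holds the lock */

static bool model_rtnl_trylock(void)
{
    if (rtnl_held)
        return false;
    rtnl_held = true;
    return true;
}

static void model_rtnl_unlock(void)
{
    rtnl_held = false;
}

/* Simplified net_dim(): samples are ignored in DIM_APPLY_NEW_PROFILE. */
static void model_net_dim(struct model_dim *dim)
{
    switch (dim->state) {
    case DIM_START_MEASURE:
        dim->state = DIM_MEASURE_IN_PROGRESS;
        break;
    case DIM_MEASURE_IN_PROGRESS:
        /* Assumption: each finished measurement picks a new profile. */
        dim->state = DIM_APPLY_NEW_PROFILE;
        dim->work_pending = true;       /* models schedule_work() */
        break;
    case DIM_APPLY_NEW_PROFILE:
        break;                          /* waiting for the worker */
    }
}

/* Model of the worker, with and without the requeue from the patch. */
static void model_rx_dim_work(struct model_dim *dim, bool requeue_on_failure)
{
    assert(dim->state == DIM_APPLY_NEW_PROFILE);
    dim->work_pending = false;

    if (!model_rtnl_trylock()) {
        if (requeue_on_failure)
            dim->work_pending = true;   /* the patched behaviour */
        return;
    }

    /* The new moderation profile would be applied here. */
    dim->state = DIM_START_MEASURE;
    model_rtnl_unlock();
}

/* Replays steps 1-3 and returns the state after further traffic. */
enum dim_state run_scenario(bool requeue_on_failure)
{
    struct model_dim dim = { .state = DIM_START_MEASURE, .work_pending = false };

    rtnl_held = false;
    model_net_dim(&dim);        /* START_MEASURE -> MEASURE_IN_PROGRESS */
    model_net_dim(&dim);        /* -> APPLY_NEW_PROFILE, work queued */

    rtnl_held = true;           /* ethtool holds the rtnl lock */
    if (dim.work_pending)
        model_rx_dim_work(&dim, requeue_on_failure);
    rtnl_held = false;          /* ethtool releases the lock */

    if (dim.work_pending)       /* only set if the work was requeued */
        model_rx_dim_work(&dim, requeue_on_failure);

    model_net_dim(&dim);        /* more traffic arrives */
    return dim.state;
}
```

In this model, run_scenario(false) ends in DIM_APPLY_NEW_PROFILE, so later samples are ignored, while run_scenario(true) ends in DIM_MEASURE_IN_PROGRESS because the requeued work eventually takes the lock and resets the state. The model does not say anything about the cost of repeated requeueing under contention, which is the point of the delayed-work question above.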





