Patch "idpf: convert workqueues to unbound" has been added to the 6.13-stable tree

This is a note to let you know that I've just added the patch titled

    idpf: convert workqueues to unbound

to the 6.13-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     idpf-convert-workqueues-to-unbound.patch
and it can be found in the queue-6.13 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 5d3ecd433a87185189d38cb3c2848e8069ffe1f2
Author: Marco Leogrande <leogrande@xxxxxxxxxx>
Date:   Mon Dec 16 16:27:34 2024 +0000

    idpf: convert workqueues to unbound
    
    [ Upstream commit 9a5b021cb8186f1854bac2812bd4f396bb1e881c ]
    
    When a workqueue is created with `WQ_UNBOUND`, its work items are
    served by special worker-pools, whose host workers are not bound to
    any specific CPU. In the default configuration (i.e. when
    `queue_delayed_work` and friends do not specify which CPU to run the
    work item on), `WQ_UNBOUND` allows the work item to be executed on any
    CPU in the same node as the CPU it was enqueued on. While this
    solution potentially sacrifices locality, it avoids contention with
    other processes that might dominate the CPU time of the processor the
    work item was scheduled on.
    
    This is not just a theoretical problem: in one particular scenario, a
    misconfigured process was hogging most of the time on CPU0, leaving
    less than 0.5% of its CPU time to the kworker. The IDPF workqueues
    that were using the kworker on CPU0 suffered large completion delays
    as a result, causing performance degradation, timeouts and an
    eventual system crash.
    
    Tested:
    
    * I ran a manual test to gauge the performance improvement. The
      test consists of an antagonist process (`./stress --cpu 2`)
      consuming as much of CPU0 as possible. This process is run under
      `taskset 01` to bind it to CPU0, and its priority is changed with
      `chrt -pQ 9900 10000 ${pid}` and `renice -n -20 ${pid}` after
      start.
    
      Then, the IDPF driver is forced to prefer CPU0 by editing all calls
      to `queue_delayed_work`, `mod_delayed_work`, etc. to use CPU0.
    
      Finally, `ktraces` for the workqueue events are collected.
    
      Without the current patch, the antagonist process can force
      arbitrary delays between `workqueue_queue_work` and
      `workqueue_execute_start`, which in my tests were as high as
      `30ms`. With the current patch applied, the workqueue can be
      migrated to another, unloaded CPU in the same node and, keeping
      everything else equal, the maximum delay I could see was `6us`.
    
    Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration")
    Signed-off-by: Marco Leogrande <leogrande@xxxxxxxxxx>
    Signed-off-by: Manoj Vishwanathan <manojvishy@xxxxxxxxxx>
    Signed-off-by: Brian Vazquez <brianvv@xxxxxxxxxx>
    Reviewed-by: Jacob Keller <jacob.e.keller@xxxxxxxxx>
    Reviewed-by: Pavan Kumar Linga <pavan.kumar.linga@xxxxxxxxx>
    Tested-by: Krishneil Singh <krishneil.k.singh@xxxxxxxxx>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@xxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/net/ethernet/intel/idpf/idpf_main.c b/drivers/net/ethernet/intel/idpf/idpf_main.c
index f71d3182580b6..b6c515d14cbf0 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_main.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_main.c
@@ -174,7 +174,8 @@ static int idpf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	pci_set_master(pdev);
 	pci_set_drvdata(pdev, adapter);
 
-	adapter->init_wq = alloc_workqueue("%s-%s-init", 0, 0,
+	adapter->init_wq = alloc_workqueue("%s-%s-init",
+					   WQ_UNBOUND | WQ_MEM_RECLAIM, 0,
 					   dev_driver_string(dev),
 					   dev_name(dev));
 	if (!adapter->init_wq) {
@@ -183,7 +184,8 @@ static int idpf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_free;
 	}
 
-	adapter->serv_wq = alloc_workqueue("%s-%s-service", 0, 0,
+	adapter->serv_wq = alloc_workqueue("%s-%s-service",
+					   WQ_UNBOUND | WQ_MEM_RECLAIM, 0,
 					   dev_driver_string(dev),
 					   dev_name(dev));
 	if (!adapter->serv_wq) {
@@ -192,7 +194,8 @@ static int idpf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_serv_wq_alloc;
 	}
 
-	adapter->mbx_wq = alloc_workqueue("%s-%s-mbx", 0, 0,
+	adapter->mbx_wq = alloc_workqueue("%s-%s-mbx",
+					  WQ_UNBOUND | WQ_MEM_RECLAIM, 0,
 					  dev_driver_string(dev),
 					  dev_name(dev));
 	if (!adapter->mbx_wq) {
@@ -201,7 +204,8 @@ static int idpf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_mbx_wq_alloc;
 	}
 
-	adapter->stats_wq = alloc_workqueue("%s-%s-stats", 0, 0,
+	adapter->stats_wq = alloc_workqueue("%s-%s-stats",
+					    WQ_UNBOUND | WQ_MEM_RECLAIM, 0,
 					    dev_driver_string(dev),
 					    dev_name(dev));
 	if (!adapter->stats_wq) {
@@ -210,7 +214,8 @@ static int idpf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_stats_wq_alloc;
 	}
 
-	adapter->vc_event_wq = alloc_workqueue("%s-%s-vc_event", 0, 0,
+	adapter->vc_event_wq = alloc_workqueue("%s-%s-vc_event",
+					       WQ_UNBOUND | WQ_MEM_RECLAIM, 0,
 					       dev_driver_string(dev),
 					       dev_name(dev));
 	if (!adapter->vc_event_wq) {
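
The "Tested" section of the commit message measures the delay between
`workqueue_queue_work` and `workqueue_execute_start` events. The sketch
below shows one way such a maximum delay could be computed in userspace.
It is not part of the patch and is not the tooling the author used. It
assumes a simplified, hypothetical input format of one event per line,
`<seconds> <event-name> <work-pointer-hex>`; real trace output is laid
out differently and would need to be converted first. The function name
`max_queue_to_start_us` and the `MAX_PENDING` limit are illustrative.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical limit on work items that may be queued but not yet started. */
#define MAX_PENDING 64

struct pending {
	unsigned long long work;	/* work item address from the trace line */
	double queued_at;		/* timestamp of the queue event, in seconds */
	int used;			/* non-zero while the slot holds a queued item */
};

/*
 * Return the largest queue-to-start delay, in microseconds, found in
 * lines of the assumed form "<seconds> <event-name> <work-pointer-hex>".
 * Lines that do not match are skipped. If more than MAX_PENDING items
 * are pending at once, the extra queue events are silently ignored.
 */
double max_queue_to_start_us(const char *const *lines, size_t n)
{
	struct pending table[MAX_PENDING] = { 0 };
	double max_us = 0.0;

	for (size_t i = 0; i < n; i++) {
		double ts;
		char event[64];
		unsigned long long work;

		if (sscanf(lines[i], "%lf %63s %llx", &ts, event, &work) != 3)
			continue;

		if (strcmp(event, "workqueue_queue_work") == 0) {
			/* Remember when this work item was queued. */
			for (size_t j = 0; j < MAX_PENDING; j++) {
				if (!table[j].used) {
					table[j].work = work;
					table[j].queued_at = ts;
					table[j].used = 1;
					break;
				}
			}
		} else if (strcmp(event, "workqueue_execute_start") == 0) {
			/* Pair the start event with its queue event. */
			for (size_t j = 0; j < MAX_PENDING; j++) {
				if (table[j].used && table[j].work == work) {
					double us = (ts - table[j].queued_at) * 1e6;

					if (us > max_us)
						max_us = us;
					table[j].used = 0;
					break;
				}
			}
		}
	}
	return max_us;
}
```

Fed with events collected before and after the change, a function like
this would report the kind of figures quoted in the commit message
(`30ms` without the patch, `6us` with it), provided the input has been
reduced to the assumed line format.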



