On 8/19/24 23:42, Mohamed Khalfella wrote:
Collecting crdump involves dumping vsc registers from the PCI config
space of the mlx device. The code can run for a long time, starving
other threads that want to run on the CPU. Add a conditional reschedule
between register reads and while waiting for a register value, to
release the CPU more often.
Reviewed-by: Yuanyuan Zhong <yzhong@xxxxxxxxxxxxxxx>
Signed-off-by: Mohamed Khalfella <mkhalfella@xxxxxxxxxxxxxxx>
---
drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
index d0b595ba6110..377cc39643b4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
@@ -191,6 +191,7 @@ static int mlx5_vsc_wait_on_flag(struct mlx5_core_dev *dev, u8 expected_val)
if ((retries & 0xf) == 0)
usleep_range(1000, 2000);
+ cond_resched();
The sleeping logic above (including what is outside the git diff
context) is a bit weird: a tight loop with a sleep after every 16
attempts, with an upper bound of 2k attempts!
My understanding of usleep_range() is that it puts the process to sleep
(and even leads to a schedule() call), so cond_resched() looks
redundant here.
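To put rough numbers on the loop shape described above, here is a
minimal user-space sketch. It is not the driver code: count_sleeps() is
a hypothetical helper, the loop body is reduced to the retry counter,
and passing 2048 as the limit is an assumption standing in for the "2k"
upper bound. It only counts how often the (retries & 0xf) == 0 branch
would be taken if the flag never matched.

```c
#include <assert.h>

/*
 * Hypothetical model of the retry loop: count how many times the
 * "sleep every `interval` attempts" branch fires within `max_retries`
 * attempts. `interval` must be a power of two because the check uses
 * a bit mask, as in the quoted hunk (interval 16 gives mask 0xf).
 */
unsigned int count_sleeps(unsigned int max_retries, unsigned int interval)
{
	unsigned int retries = 0;
	unsigned int sleeps = 0;

	assert(interval != 0 && (interval & (interval - 1)) == 0);

	while (retries < max_retries) {
		retries++;
		/* Same test as (retries & 0xf) == 0 when interval is 16. */
		if ((retries & (interval - 1)) == 0)
			sleeps++;	/* stands in for usleep_range(1000, 2000) */
	}
	return sleeps;
}
```

Under those assumptions, count_sleeps(2048, 16) returns 128, so a wait
that runs to the limit would already sleep 128 times, each for at least
1000 us.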
} while (flag != expected_val);
return 0;
@@ -280,6 +281,7 @@ int mlx5_vsc_gw_read_block_fast(struct mlx5_core_dev *dev, u32 *data,
return read_addr;
read_addr = next_read_addr;
+ cond_resched();
It would be great to see in the commit message how many registers there
are and how long it takes to dump them.
My guess is that a single mlx5_vsc_gw_read_fast() call is very short
and that there are many of them. In that case cond_resched() should
rather go under some if (iterator % XXX == 0) condition.
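A minimal user-space sketch of that "yield every N iterations" shape,
for illustration only. Everything here is an assumption: the function
name, the YIELD_INTERVAL value (the XXX above is left open), and the
placeholder per-word read. sched_yield() stands in for cond_resched(),
which is not available outside the kernel.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical interval; the right value would need measuring. */
#define YIELD_INTERVAL 128

/*
 * Fill `data` with `nwords` placeholder values, yielding the CPU once
 * every YIELD_INTERVAL words instead of after every word.
 * Returns the number of times the loop yielded.
 */
unsigned int read_words_with_periodic_yield(unsigned int *data, size_t nwords)
{
	unsigned int yields = 0;
	size_t i;

	assert(data != NULL || nwords == 0);

	for (i = 0; i < nwords; i++) {
		data[i] = (unsigned int)i;	/* placeholder for one register read */

		/* Only every YIELD_INTERVAL-th iteration reaches the yield. */
		if ((i + 1) % YIELD_INTERVAL == 0) {
			sched_yield();
			yields++;
		}
	}
	return yields;
}
```

With these made-up numbers a 1000-word read yields 7 times rather than
1000 times. Whether that trade-off is worth the extra branch depends on
how long one real read takes, which is why the numbers asked for above
matter.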
}
return length;
}