On 9/27/2021 2:34 PM, Leon Romanovsky wrote:
> On Sun, Sep 26, 2021 at 05:55:18PM +0300, Max Gurtovoy wrote:
>> To optimize performance, set the affinity of the block device tagset
>> according to the virtio device affinity.
>>
>> Signed-off-by: Max Gurtovoy <mgurtovoy@xxxxxxxxxx>
>> ---
>>  drivers/block/virtio_blk.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>> index 9b3bd083b411..1c68c3e0ebf9 100644
>> --- a/drivers/block/virtio_blk.c
>> +++ b/drivers/block/virtio_blk.c
>> @@ -774,7 +774,7 @@ static int virtblk_probe(struct virtio_device *vdev)
>>  	memset(&vblk->tag_set, 0, sizeof(vblk->tag_set));
>>  	vblk->tag_set.ops = &virtio_mq_ops;
>>  	vblk->tag_set.queue_depth = queue_depth;
>> -	vblk->tag_set.numa_node = NUMA_NO_NODE;
>> +	vblk->tag_set.numa_node = virtio_dev_to_node(vdev);
>
> I'm afraid that by doing this you will increase the chances of hitting
> OOM: with NUMA_NO_NODE, the MM will try to allocate memory across the
> whole system, while in the latter mode it allocates only on the specific
> NUMA node, which can be depleted.

This is a common methodology we use in the block layer and in the NVMe
subsystem, and we are not afraid of the OOM issue you raised.

This is not new, and I guess the kernel MM will (or at least should)
handle the fallback you raised.
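
To be precise about that fallback: a node passed to the kernel slab
allocators is a preference, not a hard constraint, unless __GFP_THISNODE
is also set. Rough sketch (not from this patch, names made up):

	/* Without __GFP_THISNODE the page allocator merely prefers @node
	 * and falls back to the other nodes once it is depleted; with the
	 * flag set it fails instead of falling back.
	 */
	buf = kzalloc_node(size, GFP_KERNEL, node);
	buf = kzalloc_node(size, GFP_KERNEL | __GFP_THISNODE, node);

blk-mq passes plain GFP_KERNEL-style flags for the tagset allocations,
so a depleted node should fall back rather than OOM.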
Anyway, if we're doing this in NVMe, I don't see a reason to be afraid
of doing it in virtio-blk.
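
For reference, as far as I remember the analogous nvme-pci code looks
roughly like this (simplified, quoted from memory):

	/* drivers/nvme/host/pci.c: tie the tagset to the NUMA node of
	 * the underlying PCI device.
	 */
	dev->tagset.numa_node = dev_to_node(dev->dev);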
Also, I sent a patch a few weeks ago that decreases virtio-blk's memory
consumption, so I guess we'll be just fine.
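
For anyone reading along: virtio_dev_to_node() is the helper introduced
by this series; a minimal sketch of what it boils down to (simplified,
since a virtio device has no NUMA affinity of its own, it takes the node
of the parent bus device, e.g. the PCI function):

	static inline int virtio_dev_to_node(struct virtio_device *vdev)
	{
		return dev_to_node(vdev->dev.parent);
	}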
Thanks

>>  	vblk->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
>>  	vblk->tag_set.cmd_size =
>>  		sizeof(struct virtblk_req) +
>> --
>> 2.18.1