Re: [Xen-devel] [PATCH v10 2/4] xen/blkback: Squeeze page pools if a memory pressure is detected

Jürgen Groß <jgross@xxxxxxxx> · Mon, 16 Dec 2019 17:23:44 +0100

On 16.12.19 17:15, SeongJae Park wrote:
On Mon, 16 Dec 2019 15:37:20 +0100 SeongJae Park <sjpark@xxxxxxxxxx> wrote:

On Mon, 16 Dec 2019 13:45:25 +0100 SeongJae Park <sjpark@xxxxxxxxxx> wrote:

From: SeongJae Park <sjpark@xxxxxxxxx>

[...]

--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -824,6 +824,24 @@ static void frontend_changed(struct xenbus_device *dev,
  }
  
  
+/* Once a memory pressure is detected, squeeze free page pools for a while. */
+static unsigned int buffer_squeeze_duration_ms = 10;
+module_param_named(buffer_squeeze_duration_ms,
+		buffer_squeeze_duration_ms, int, 0644);
+MODULE_PARM_DESC(buffer_squeeze_duration_ms,
+"Duration in ms to squeeze pages buffer when a memory pressure is detected");
+
+/*
+ * Callback received when the memory pressure is detected.
+ */
+static void reclaim_memory(struct xenbus_device *dev)
+{
+	struct backend_info *be = dev_get_drvdata(&dev->dev);
+
+	be->blkif->buffer_squeeze_end = jiffies +
+		msecs_to_jiffies(buffer_squeeze_duration_ms);

This callback might race with 'xen_blkbk_probe()'.  The race could result in
__NULL dereferencing__, as 'xen_blkbk_probe()' sets '->blkif' after it links
'be' to the 'dev'.  Please _don't merge_ this patch now!

I will do more test and share results.  Meanwhile, if you have any opinion,
please let me know.

Not only '->blkif', but 'be' itself also coule be a NULL.  As similar
concurrency issues could be in other drivers in their way, I suggest to change
the reclaim callback ('->reclaim_memory') to be called for each driver instead
of each device.  Then, each driver could be able to deal with its concurrency
issues by itself.

Hmm, I don't like that. This would need to be changed back in case we
add per-guest quota.

Wouldn't a get_device() before calling the callback and a put_device()
afterwards avoid that problem?


Juergen