On Tue, 17 Dec 2019 18:10:19 +0100 "Jürgen Groß" <jgross@xxxxxxxx> wrote: > On 17.12.19 17:24, SeongJae Park wrote: > > On Tue, 17 Dec 2019 17:13:42 +0100 "Jürgen Groß" <jgross@xxxxxxxx> wrote: > > > >> On 17.12.19 17:07, SeongJae Park wrote: > >>> From: SeongJae Park <sjpark@xxxxxxxxx> > >>> > >>> 'reclaim_memory' callback can race with a driver code as this callback > >>> will be called from any memory pressure detected context. To deal with > >>> the case, this commit adds a spinlock in the 'xenbus_device'. Whenever > >>> 'reclaim_memory' callback is called, the lock of the device which passed > >>> to the callback as its argument is locked. Thus, drivers registering > >>> their 'reclaim_memory' callback should protect the data that might race > >>> with the callback with the lock by themselves. > >>> > >>> Signed-off-by: SeongJae Park <sjpark@xxxxxxxxx> > >>> --- > >>> drivers/xen/xenbus/xenbus_probe.c | 1 + > >>> drivers/xen/xenbus/xenbus_probe_backend.c | 10 ++++++++-- > >>> include/xen/xenbus.h | 2 ++ > >>> 3 files changed, 11 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c > >>> index 5b471889d723..b86393f172e6 100644 > >>> --- a/drivers/xen/xenbus/xenbus_probe.c > >>> +++ b/drivers/xen/xenbus/xenbus_probe.c > >>> @@ -472,6 +472,7 @@ int xenbus_probe_node(struct xen_bus_type *bus, > >>> goto fail; > >>> > >>> dev_set_name(&xendev->dev, "%s", devname); > >>> + spin_lock_init(&xendev->reclaim_lock); > >>> > >>> /* Register with generic device framework. */ > >>> err = device_register(&xendev->dev); > >>> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c > >>> index 7e78ebef7c54..516aa64b9967 100644 > >>> --- a/drivers/xen/xenbus/xenbus_probe_backend.c > >>> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c > >>> @@ -251,12 +251,18 @@ static int backend_probe_and_watch(struct notifier_block *notifier, > >>> static int backend_reclaim_memory(struct device *dev, void *data) > >>> { > >>> const struct xenbus_driver *drv; > >>> + struct xenbus_device *xdev; > >>> + unsigned long flags; > >>> > >>> if (!dev->driver) > >>> return 0; > >>> drv = to_xenbus_driver(dev->driver); > >>> - if (drv && drv->reclaim_memory) > >>> - drv->reclaim_memory(to_xenbus_device(dev)); > >>> + if (drv && drv->reclaim_memory) { > >>> + xdev = to_xenbus_device(dev); > >>> + spin_trylock_irqsave(&xdev->reclaim_lock, flags); > >> > >> You need spin_lock_irqsave() here. Or maybe spin_lock() would be fine, > >> too? I can't see a reason why you'd want to disable irqs here. > > > > I needed to diable irq here as this is called from the memory shrinker context. > > Okay. > > > > > Also, used 'trylock' because the 'probe()' and 'remove()' code of the driver > > might include memory allocation. And the xen-blkback actually does. If the > > allocation shows a memory pressure during the allocation, it will trigger this > > shrinker callback again and then deadlock. > > In that case you need to either return when you didn't get the lock or Yes, it should. Cannot believe how I posted this code. Seems I made some terrible mistake while formatting patches. Anyway, will return if fail to acquire the lock, in the next version. Thanks, SeongJae Park > > - when obtaining the lock during probe() and remove() set a variable > containing the current cpu number > - and reset that to e.g NR_CPUS before releasing the lock again > - in the shrinker callback do trylock, and if you didn't get the lock > test whether the cpu-variable above is set to your current cpu and > continue only if yes; if not, redo the the trylock > > > Juergen