On Wed, Mar 1, 2017 at 4:51 AM, Heiko Carstens <heiko.carstens@xxxxxxxxxx> wrote: > On Tue, Feb 28, 2017 at 12:57:29PM +0100, Heiko Carstens wrote: >> On Mon, Feb 27, 2017 at 05:20:31PM +0100, Michal Hocko wrote: >> > [CC Rafael] >> > >> > I've got lost in the acpi indirection (again). I can see >> > acpi_device_hotplug calling lock_device_hotplug() but i cannot find a >> > path down to add_memory() which might call add_memory_resource. But the >> > patch below sounds suspicious to me. Is it possible that this could lead >> > to a deadlock. I would suspect that it is the s390 code which needs to >> > do the locking. But I would have to double check - it is really easy to >> > get lost there. >> >> To me it rather looks like bfc8c90139eb ("mem-hotplug: implement >> get/put_online_mems") introduced quite subtle and probably wrong locking >> rules. >> >> The patch introduced mem_hotplug_begin() in order to have something like >> cpu_hotplug_begin() for memory. Note that for cpu hotplug all >> cpu_hotplug_begin() calls are serialized by cpu_maps_update_begin(). >> >> Especially this makes sure that active_writer can only be changed by one >> process. (See also Dan's commit which introduced the lock_device_hotplug() >> calls: https://marc.info/?l=linux-kernel&m=148693912419972&w=2 ) >> >> If you look at the above commit bfc8c90139eb: there is nothing like >> cpu_maps_update_begin() for memory. And therefore it's possible to have >> concurrent writers to active_writer. >> >> It looks like now lock_device_hotplug() is supposed to be the new >> cpu_maps_update_begin() for memory. But.. this looks like a mess, unless I >> read the code completely wrong ;) > > [Full quote since I now hopefully use a non-bouncing email address from > Vladimir] > > Since it is anything but obvious why Dan wrote in changelog of b5d24fda9c3d > ("mm, devm_memremap_pages: hold device_hotplug lock over > mem_hotplug_{begin, done}") that write accesses to > mem_hotplug.active_writer are coordinated via lock_device_hotplug() I'd > rather propose a new private memory_add_remove_lock which has similar > semantics like the cpu_add_remove_lock for cpu hotplug (see patch below). > > However instead of sprinkling locking/unlocking of that new lock around all > calls of mem_hotplug_begin() and mem_hotplug_end() simply include locking > and unlocking into these two functions. > > This still allows get_online_mems() and put_online_mems() to work, while at > the same time preventing mem_hotplug.active_writer corruption. > > Any opinions? Sorry, yes, I didn't make it clear that I derived that locking requirement from store_mem_state() and its usage of lock_device_hotplug_sysfs(). That routine is trying very hard not trip the soft-lockup detector. It seems like that wants to be an interruptible wait. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>