On Mon, 22 May 2017, Michal Hocko wrote:

> On Mon 22-05-17 08:00:11, Mikulas Patocka wrote:
> >
> > On Mon, 22 May 2017, Michal Hocko wrote:
> >
> > > > Sometimes, I/O to a device mapper device is blocked until the userspace
> > > > daemon dmeventd does some action (for example, when dm-mirror leg fails,
> > > > dmeventd needs to mark the leg as failed in the lvm metadata and then
> > > > reload the device).
> > > >
> > > > The dmeventd daemon mlocks itself in memory so that it doesn't generate
> > > > any I/O. But it must be able to call ioctls. __GFP_HIGH is there so that
> > > > the ioctls issued by dmeventd have higher chance of succeeding if some I/O
> > > > is blocked, waiting for dmeventd action. It reduces the possibility of
> > > > low-memory-deadlock, though it doesn't eliminate it entirely.
> > >
> > > So what happens if the memory reserves are depleted? Do we deadlock?
> >
> > Yes, it will deadlock.
>
> That would be more than unfortunate and begs for a different solution.
> The thing is that __GFP_HIGH is not propagated to all allocations in the
> vmalloc proper. E.g. page table allocations are hardcoded GFP_KERNEL.

For a typical device mapper use, the ioctl area is smaller than 4k, so the
vmalloc won't happen.

> > > Why is OOM killer insufficient to allow the further progress?
> >
> > I don't know if the OOM killer will or won't be triggered in this
> > situation, it depends on the people who wrote the OOM killer.
>
> I am not sure I understand. OOM killer is invoked for _all_ allocations
> <= PAGE_ALLOC_COSTLY_ORDER that do not have __GFP_NORETRY as long as the
> OOM killer is not disabled (oom_killer_disable) and that only happens
> from the PM suspend path which makes sure that no userspace is active at
> the time. AFAIU this is a userspace triggered path and so the latter
> shouldn't apply to it and GFP_KERNEL should be therefore sufficient.
> Relying on a portion of memory reserves to prevent deadlock seems
> fundamentally broken to me.
>
> --
> Michal Hocko
> SUSE Labs

The lvm2 was designed this way - it is broken, but there is not much that
can be done about it - fixing this would mean a major rewrite. The only thing
we can do about it is to lower the deadlock probability with __GFP_HIGH (or
PF_MEMALLOC, which was used some time ago).

Mikulas
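
The remark above that an ioctl area smaller than 4k never reaches vmalloc can be illustrated with a small userspace sketch. It assumes the allocation follows the usual kmalloc-first pattern, where the vmalloc fallback is only considered for requests larger than one page. The names sketch_alloc_path, SKETCH_PAGE_SIZE and the enum values are hypothetical and exist only for this illustration; they are not kernel symbols, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this illustration; the real PAGE_SIZE is
 * architecture dependent. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical labels for the two paths an allocation could take. */
enum sketch_path {
    SKETCH_KMALLOC_ONLY,          /* never falls back to vmalloc */
    SKETCH_KMALLOC_THEN_VMALLOC   /* may fall back if kmalloc fails */
};

/* Model of the size check: requests that fit in one page are served
 * by kmalloc alone, so the GFP_KERNEL page table allocations inside
 * vmalloc are never reached for them. */
enum sketch_path sketch_alloc_path(size_t size)
{
    if (size <= SKETCH_PAGE_SIZE)
        return SKETCH_KMALLOC_ONLY;
    return SKETCH_KMALLOC_THEN_VMALLOC;
}

/* Small self-check covering a typical small ioctl buffer and the
 * boundary around one page. */
void sketch_self_check(void)
{
    assert(sketch_alloc_path(512) == SKETCH_KMALLOC_ONLY);
    assert(sketch_alloc_path(SKETCH_PAGE_SIZE) == SKETCH_KMALLOC_ONLY);
    assert(sketch_alloc_path(SKETCH_PAGE_SIZE + 1) ==
           SKETCH_KMALLOC_THEN_VMALLOC);
}
```

Under these assumptions only ioctl buffers larger than one page could hit the vmalloc path where __GFP_HIGH is not propagated. The sketch says nothing about whether the small kmalloc request itself succeeds once the reserves are depleted.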