On 09.10.19 13:24, Michal Hocko wrote:
> On Wed 09-10-19 12:19:59, David Hildenbrand wrote:
> [...]
>>> pfn_to_online_page makes sense because offline pages are not really in a
>>> defined state. This would be worth a patch of its own. I remember there
>>
>> The issue is, once I check for pfn_to_online_page(), these functions
>> can't handle ZONE_DEVICE at all anymore. Especially in regards to
>> memory_failure() I don't think this is acceptable.
>
> Could you be more specific please? I am not sure I am following.

I wasn't quite clear, let me try to be more precise:

if (pfn_to_online_page(pfn)) {
	/* memmap initialized */
} else if (pfn_valid(pfn)) {
	/* ???
	 * a) offline memory. memmap garbage.
	 * b) memremap memory: memmap initialized to ZONE_DEVICE.
	 * c) memremap memory: reserved for driver. memmap garbage.
	 * (d) memremap memory: memmap currently initializing - garbage)
	 */
}

To distinguish between a) and b/c), we can currently only use
get_dev_pagemap(pfn, NULL).

To distinguish between b) and c), we can currently only use
pfn_zone_device_reserved().

That implies, that - right now - if we want to fix what is described in
the patch without introducing more users of get_dev_pagemap(pfn, NULL),
we will not be able to support ZONE_DEVICE in:
- /proc/kpagecount
- /proc/kpageflags
- /proc/kpagecgroup

if (pfn_to_online_page(pfn)) {
	/* memmap initialized */
} else {
	/* skip */
}

Now, memory_failure() already contains a get_dev_pagemap(pfn, NULL) check
and adding pfn_to_online_page(pfn) would also work.

I would be fine with this, but it means that - for now - the three
/proc/ files won't be able to deal with ZONE_DEVICE memory.

>
>> So while I
>> (personally) only care about adding pfn_to_online_page() checks, the
>> in-this-sense-fragile-subsection ZONE_DEVICE implementation requires me
>> to introduce a temporary check for initialized memmaps.
>>
>>> was a discussion about the uninitialized zone device memmaps. It would
>>> be really good to summarize this discussion in the changelog and
>>> conclude why the explicit check is really good and what were other
>>> alternatives considered.
>>
>> Yeah, I also expressed my feelings and the issues to be solved by
>> ZONE_DEVICE people in https://lkml.org/lkml/2019/9/6/114. However, the
>> discussion stalled and nobody really proposed a solution or followed up.
>
> I will try to get back to that discussion but is there any technical
> reason that prevents any conclusion or it is just stuck on a lack of
> time of the participants?

I think it was both. People not responding to questions and not having
decided on a solution.

-- 

Thanks,

David / dhildenb
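
For illustration only, the case analysis from this mail (online, offline,
ZONE_DEVICE, driver-reserved) can be modelled in plain userspace C. This is
a sketch under assumptions, not kernel code: struct pfn_model, enum
pfn_state, classify_pfn() and check_examples() are hypothetical names, and
each boolean merely stands in for what the named kernel helper would report
for a single pfn.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model: one flag per kernel helper result. */
struct pfn_model {
	bool online;          /* stands in for pfn_to_online_page(pfn) != NULL */
	bool valid;           /* stands in for pfn_valid(pfn) */
	bool has_pgmap;       /* stands in for get_dev_pagemap(pfn, NULL) != NULL */
	bool device_reserved; /* stands in for pfn_zone_device_reserved(pfn) */
};

enum pfn_state {
	PFN_ONLINE,          /* memmap initialized */
	PFN_INVALID,         /* no memmap at all */
	PFN_OFFLINE,         /* case a): memmap garbage */
	PFN_ZONE_DEVICE,     /* case b): memmap initialized to ZONE_DEVICE */
	PFN_DEVICE_RESERVED, /* case c): reserved for driver, memmap garbage */
};

/*
 * Case d) (memmap currently initializing) is deliberately not modelled:
 * none of the checks named in the mail can tell it apart from b).
 */
enum pfn_state classify_pfn(const struct pfn_model *m)
{
	if (m->online)
		return PFN_ONLINE;
	if (!m->valid)
		return PFN_INVALID;
	if (!m->has_pgmap)
		return PFN_OFFLINE;         /* a) versus b/c) */
	if (m->device_reserved)
		return PFN_DEVICE_RESERVED; /* c) versus b) */
	return PFN_ZONE_DEVICE;
}

/* Small self-check covering the offline and ZONE_DEVICE cases. */
void check_examples(void)
{
	struct pfn_model offline = { false, true, false, false };
	struct pfn_model zdev = { false, true, true, false };

	assert(classify_pfn(&offline) == PFN_OFFLINE);
	assert(classify_pfn(&zdev) == PFN_ZONE_DEVICE);
}
```

The simpler variant from the mail, which only checks pfn_to_online_page(),
corresponds to treating every state other than PFN_ONLINE as "skip".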