On Mon 04-01-21 21:17:43, Dan Williams wrote:
> On Mon, Jan 4, 2021 at 2:45 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
[...]
> > I believe Dan mentioned somewhere that he wants to see a real instance
> > of this producing a BUG before actually moving forward with a fix. I
> > might be wrong.
>
> I think I'm missing an argument for the user-visible effects of the
> "Bad." statements above. I think soft_offline_page() is a candidate
> for a local fix because mm/memory-failure.c already has a significant
> amount of page-type specific knowledge. So teaching it "yes" for
> MEMORY_DEVICE_PRIVATE-ZONE_DEVICE and "no" for other ZONE_DEVICE seems
> ok to me.

I believe we do not want to teach _every_ pfn walker about zone device
pages. This would be quite error prone, especially when a missing check
could lead to silently broken data or a BUG_ON with debugging enabled
(which is not the case for many production users). Or are we talking
about different bugs here?
-- 
Michal Hocko
SUSE Labs