On Tue, Jun 23, 2020 at 1:18 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> Hardware actually tells us the blast radius of the error, but we ignore
> it and take out the entire page.  We've had a customer request to know
> exactly how much of the page is damaged so they can avoid reconstructing
> an entire 2MB page if only a single cacheline is damaged.
>
> This is only a strawman that I did in an hour or two; I'd appreciate
> architectural-level feedback.  Should I just convert memory_failure() to
> always take an address & granularity?  Should I create a struct to pass
> around (page, phys, granularity) instead of reconstructing the missing
> pieces in half a dozen functions?  Is this functionality welcome at all,
> or is the risk of upsetting applications which expect at least a page
> of granularity too high?
>
> I can see places where I've specified a plain PAGE_SHIFT instead of
> interrogating a compound page for its size.  I'd probably split this
> patch up into two or three pieces for applying.
>
> I've also blindly taken out the call to unmap_mapping_range().  Again,
> the customer requested that we not do this.  That deserves to be in its
> own patch and properly justified.

I had been thinking that we could not do much with the legacy
memory-failure reporting model, and that applications that want a new
model would need to opt into it.

This topic also dovetails with what Dave and I had been discussing in
terms of coordinating memory error handling with the filesystem, which
may have more information about multiple mappings of a DAX page
(reflink) [1].

[1]: http://lore.kernel.org/r/20200311063942.GE10776@xxxxxxxxxxxxxxxxxxx
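
For concreteness, here is a minimal sketch of the kind of container the
strawman asks about.  This is hypothetical, not anything from the posted
patch; the struct name and field layout are invented for illustration:

#include <linux/mm_types.h>	/* struct page */
#include <linux/types.h>	/* phys_addr_t */

/*
 * Hypothetical sketch only -- one possible shape for passing
 * (page, phys, granularity) through memory_failure() instead of
 * reconstructing the missing pieces in half a dozen functions.
 */
struct memory_failure_extent {
	struct page *page;	/* (head) page containing the error */
	phys_addr_t phys;	/* physical address reported by hardware */
	unsigned int shift;	/* log2 of the blast radius in bytes,
				 * e.g. 6 for a 64-byte cacheline rather
				 * than a blanket PAGE_SHIFT */
};

On the compound-page point, page_shift() (PAGE_SHIFT +
compound_order()) already exists to interrogate a page for its actual
size, so the plain PAGE_SHIFT call sites could presumably switch to
that when the patch is split up.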