On 3/22/2022 10:45 PM, Christoph Hellwig wrote: > On Tue, Mar 22, 2022 at 11:05:09PM +0000, Jane Chu wrote: >>> This DAX_RECOVERY doesn't actually seem to be used anywhere here or >>> in the subsequent patches. Did I miss something? >> >> dax_iomap_iter() uses the flag in the same patch, >> + if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) { >> + flags |= DAX_RECOVERY; >> + map_len = dax_direct_access(dax_dev, pgoff, nrpg, >> + flags, &kaddr, NULL); > > Yes, it passes it on to dax_direct_access, and dax_direct_access passes > it onto ->direct_access. But nothing in this series actually checks > for it as far as I can tell. > >>>> Also introduce a new dev_pagemap_ops .recovery_write function. >>>> The function is applicable to FSDAX device only. The device >>>> page backend driver provides .recovery_write function if the >>>> device has underlying mechanism to clear the uncorrectable >>>> errors on the fly. >>> >>> Why is this not in struct dax_operations? >> >> Per Dan's comments to the v5 series, adding .recovery_write to >> dax_operations causes a number of trivial dm targets changes. >> Dan suggested that adding .recovery_write to pagemap_ops could >> cut short the logistics of figuring out whether the driver backing >> up a page is indeed capable of clearing poison. Please see >> https://lkml.org/lkml/2022/2/4/31 > > But at least in this series there is 1:1 association between the > pgmap and the dax_device so that scheme won't work. It would > have to lookup the pgmap based on the return physical address from > dax_direct_access. Which sounds more complicated than just adding > the (annoying) boilerplate code to DM. > Getting dax_direct_access to return pfn seems straight forward, what do you think of below change? --- a/drivers/dax/super.c +++ b/drivers/dax/super.c @@ -195,10 +195,10 @@ int dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff, } EXPORT_SYMBOL_GPL(dax_zero_page_range); -size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff, +size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff, pfn_t pfn, void *addr, size_t bytes, struct iov_iter *iter) { - struct dev_pagemap *pgmap = dax_dev->pgmap; + struct dev_pagemap *pgmap = get_dev_pagemap(pfn_t_to_pfn(pfn), NULL); if (!pgmap || !pgmap->ops->recovery_write) return -EIO; --- a/fs/dax.c +++ b/fs/dax.c @@ -1243,6 +1243,7 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi, int flags, recov; void *kaddr; long nrpg; + pfn_t pfn; if (fatal_signal_pending(current)) { ret = -EINTR; @@ -1257,7 +1258,7 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi, if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) { flags |= DAX_RECOVERY; map_len = dax_direct_access(dax_dev, pgoff, nrpg, - flags, &kaddr, NULL); + flags, &kaddr, &pfn); if (map_len > 0) recov++; } @@ -1273,7 +1274,7 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi, map_len = end - pos; if (recov) - xfer = dax_recovery_write(dax_dev, pgoff, kaddr, + xfer = dax_recovery_write(dax_dev, pgoff, pfn, kaddr, map_len, iter); else if (iov_iter_rw(iter) == WRITE) xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr, thanks! -jane