On Sat, Mar 19, 2022 at 12:28:31AM -0600, Jane Chu wrote: > Introduce DAX_RECOVERY flag to dax_direct_access(). The flag is > not set by default in dax_direct_access() such that the helper > does not translate a pmem range to kernel virtual address if the > range contains uncorrectable errors. When the flag is set, > the helper ignores the UEs and return kernel virtual adderss so > that the caller may get on with data recovery via write. This DAX_RECOVERY doesn't actually seem to be used anywhere here or in the subsequent patches. Did I miss something? > Also introduce a new dev_pagemap_ops .recovery_write function. > The function is applicable to FSDAX device only. The device > page backend driver provides .recovery_write function if the > device has underlying mechanism to clear the uncorrectable > errors on the fly. Why is this not in struct dax_operations? > > +size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff, > + void *addr, size_t bytes, struct iov_iter *iter) > +{ > + struct dev_pagemap *pgmap = dax_dev->pgmap; > + > + if (!pgmap || !pgmap->ops->recovery_write) > + return -EIO; > + return pgmap->ops->recovery_write(pgmap, pgoff, addr, bytes, > + (void *)iter); No need to cast a type pointer to a void pointer. But more importantly losing the type information here and passing it as void seems very wrong. > +static size_t pmem_recovery_write(struct dev_pagemap *pgmap, pgoff_t pgoff, > + void *addr, size_t bytes, void *iter) > +{ > + struct pmem_device *pmem = pgmap->owner; > + > + dev_warn(pmem->bb.dev, "%s: not yet implemented\n", __func__); > + > + /* XXX more later */ > + return 0; > +} This shuld not be added here - the core code can cope with a NULL method just fine. > + recov = 0; > + flags = 0; > + nrpg = PHYS_PFN(size); Please spell out the words. The recovery flag can also be a bool to make the code more readable. > + map_len = dax_direct_access(dax_dev, pgoff, nrpg, flags, > + &kaddr, NULL); > + if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) { No need for the inner braces. > + flags |= DAX_RECOVERY; > + map_len = dax_direct_access(dax_dev, pgoff, nrpg, > + flags, &kaddr, NULL); And noneed for the flags variable at all really. > xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr, > map_len, iter); > else > @@ -1271,6 +1286,11 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi, > length -= xfer; > done += xfer; > > + if (recov && (xfer == (ssize_t) -EIO)) { > + pr_warn("dax_recovery_write failed\n"); > + ret = -EIO; > + break; And no, we can't just use an unsigned variable to communicate a negative error code.