> -----Original Message-----
> From: Dan Williams <dan.j.williams@xxxxxxxxx>
> Subject: Re: [PATCH v4 03/10] fs: Introduce ->corrupted_range() for superblock
>
> [ drop old linux-nvdimm@xxxxxxxxxxxx, add nvdimm@xxxxxxxxxxxxxxx ]
>
> On Thu, Jun 3, 2021 at 6:19 PM Shiyang Ruan <ruansy.fnst@xxxxxxxxxxx> wrote:
> >
> > Memory failure occurs in fsdax mode will finally be handled in
> > filesystem. We introduce this interface to find out files or metadata
> > affected by the corrupted range, and try to recover the corrupted data
> > if possiable.
> >
> > Signed-off-by: Shiyang Ruan <ruansy.fnst@xxxxxxxxxxx>
> > ---
> >  include/linux/fs.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index c3c88fdb9b2a..92af36c4225f 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -2176,6 +2176,8 @@ struct super_operations {
> >                                   struct shrink_control *);
> >         long (*free_cached_objects)(struct super_block *,
> >                                     struct shrink_control *);
> > +       int (*corrupted_range)(struct super_block *sb, struct block_device *bdev,
> > +                              loff_t offset, size_t len, void *data);
>
> Why does the superblock need a new operation? Wouldn't whatever function is
> specified here just be specified to the dax_dev as the
> ->notify_failure() holder callback?

Because we need to find out which file is affected by the given poisoned
page, so that the memory-failure code can do its collect_procs() and
kill_procs() jobs. That requires the filesystem to use its rmap feature
to look up the file from a given offset. So this operation has to be
implemented by the specific filesystem and called by the dax_device's
holder.

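To make the delegation concrete, here is a minimal userspace sketch of
the idea, not kernel code. Everything in it is a simplified stand-in I
made up for illustration: toy_fs, rmap_rec, mf_result and
holder_corrupted_range() are hypothetical names, the bdev argument is
dropped, and the linear scan only stands in for a real rmap query. It
shows the holder-level callback forwarding to the superblock op, and
the filesystem resolving the corrupted range to metadata or file owners.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins; these are NOT the kernel definitions. */
struct super_block;

/* Same shape as the op proposed in the patch, minus the bdev argument. */
struct super_operations {
	int (*corrupted_range)(struct super_block *sb, int64_t offset,
			       size_t len, void *data);
};

struct super_block {
	const struct super_operations *s_op;
	void *s_fs_info;		/* filesystem-private data */
};

/* Hypothetical reverse-mapping record: who owns a device extent. */
enum owner_kind { OWNER_METADATA, OWNER_FILE };

struct rmap_rec {
	int64_t start;
	size_t len;
	enum owner_kind kind;
	unsigned long ino;		/* meaningful only for OWNER_FILE */
};

struct toy_fs {
	const struct rmap_rec *rmap;
	size_t nr_recs;
};

/* Hypothetical result that the memory-failure side would act on. */
struct mf_result {
	int shutdown;			/* set when metadata was hit */
	unsigned long hit_inos[8];	/* files whose data was hit */
	size_t nr_hit;
};

/* Filesystem side: only the fs can map a device range back to its owners. */
static int toy_fs_corrupted_range(struct super_block *sb, int64_t offset,
				  size_t len, void *data)
{
	struct toy_fs *fs = sb->s_fs_info;
	struct mf_result *res = data;
	int64_t end = offset + (int64_t)len;
	size_t i;

	assert(len > 0);
	for (i = 0; i < fs->nr_recs; i++) {
		const struct rmap_rec *r = &fs->rmap[i];

		/* Skip extents that do not overlap [offset, end). */
		if (r->start >= end || r->start + (int64_t)r->len <= offset)
			continue;
		if (r->kind == OWNER_METADATA)
			res->shutdown = 1;
		else if (res->nr_hit < 8)
			res->hit_inos[res->nr_hit++] = r->ino;
	}
	return 0;
}

static const struct super_operations toy_fs_sops = {
	.corrupted_range = toy_fs_corrupted_range,
};

/* Holder side: it knows the superblock but not the files, so it forwards. */
static int holder_corrupted_range(struct super_block *sb, int64_t offset,
				  size_t len, void *data)
{
	if (!sb->s_op || !sb->s_op->corrupted_range)
		return -EOPNOTSUPP;
	return sb->s_op->corrupted_range(sb, offset, len, data);
}
```

In this sketch a caller fills a struct toy_fs with rmap records, points
sb.s_fs_info at it, and calls holder_corrupted_range(); the result then
says whether to shut the filesystem down or which inodes to kill
processes for. The holder itself has no way to produce that answer,
which is the reason for the extra superblock operation.
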
This is the call trace I described in the cover letter:

memory_failure()
 * fsdax case
 pgmap->ops->memory_failure()      => pmem_pgmap_memory_failure()
  dax_device->holder_ops->corrupted_range() =>
                                      - fs_dax_corrupted_range()
                                      - md_dax_corrupted_range()
   sb->s_ops->corrupted_range()    => xfs_fs_corrupted_range()  <== **HERE**
    xfs_rmap_query_range()
     xfs_corrupt_helper()
      * corrupted on metadata
          try to recover data, call xfs_force_shutdown()
      * corrupted on file data
          try to recover data, call mf_dax_kill_procs()
 * normal case
 mf_generic_kill_procs()

As you can see, this newly added operation is an important part of the
whole process.

--
Thanks,
Ruan.