On 2022/3/30 13:41, Christoph Hellwig wrote:
On Wed, Mar 16, 2022 at 09:46:07PM +0800, Shiyang Ruan wrote:
Forgive me if this has been discussed before, but since dax_operations
are in terms of pgoff and nr pages and memory_failure() is in terms of
pfns what was the rationale for making the function signature byte
based?
Maybe I didn't describe it clearly... The @offset and @len here are
byte-based. And so is ->memory_failure().
Yes, but is there a good reason for that when the rest of the DAX code
tends to work in page chunks?
Because I am not sure if the offset between each layer is page aligned.
For example, when the pmem driver handles ->memory_failure(), it should
subtract its ->data_offset when it calls dax_holder_notify_failure().
The pmem driver's implementation of ->memory_failure():
+static int pmem_pagemap_memory_failure(struct dev_pagemap *pgmap,
+ phys_addr_t addr, u64 len, int mf_flags)
+{
+ struct pmem_device *pmem =
+ container_of(pgmap, struct pmem_device, pgmap);
+ u64 offset = addr - pmem->phys_addr - pmem->data_offset;
+
+ return dax_holder_notify_failure(pmem->dax_dev, offset, len, mf_flags);
+}
So I chose u64 as the type of @len, and for consistency @addr uses a
byte-based type as well.
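To illustrate the concern, here is a minimal userspace sketch (not kernel code) comparing the byte-based subtraction from the handler above with a page-granular variant. The struct sketch_pmem type, the sketch_* helpers and the 4 KiB page size are assumptions made for this example only. Whether an unaligned ->data_offset really occurs is exactly the open question here.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the two pmem_device fields used above. */
struct sketch_pmem {
	uint64_t phys_addr;
	uint64_t data_offset;
};

/* Byte-based translation: the same subtraction as in the handler above. */
static inline uint64_t sketch_byte_offset(const struct sketch_pmem *p,
					  uint64_t addr)
{
	return addr - p->phys_addr - p->data_offset;
}

/*
 * Page-based translation: round both the failing address and the start of
 * the data area down to page numbers, then subtract. Any sub-page part of
 * data_offset is dropped by the shift.
 */
static inline uint64_t sketch_page_offset(const struct sketch_pmem *p,
					  uint64_t addr)
{
	uint64_t pfn = addr >> SKETCH_PAGE_SHIFT;
	uint64_t base = (p->phys_addr + p->data_offset) >> SKETCH_PAGE_SHIFT;

	return pfn - base;
}
```

With phys_addr = 0x100000000, data_offset = 0x1200 and addr = 0x100002000, the byte-based helper yields 0xE00, while the page-based helper yields page 1, i.e. byte 0x1000. When data_offset is page aligned, both agree.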
> memory_failure()
> |* fsdax case
> |------------
> |pgmap->ops->memory_failure() => pmem_pagemap_memory_failure()
> | dax_holder_notify_failure() =>
the offset from 'pmem driver' to 'dax holder'
> | dax_device->holder_ops->notify_failure() =>
> | - xfs_dax_notify_failure()
> | |* xfs_dax_notify_failure()
> | |--------------------------
> | | xfs_rmap_query_range()
> | | xfs_dax_failure_fn()
> | | * corrupted on metadata
> | | try to recover data, call xfs_force_shutdown()
> | | * corrupted on file data
> | | try to recover data, call mf_dax_kill_procs()
> |* normal case
> |-------------
> |mf_generic_kill_procs()
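
The top-level branch in the diagram above can be sketched in plain C as follows. This is only a simplified userspace model: every sketch_* name is hypothetical, and the real code also has to deal with locking, error codes and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Records which branch of the diagram was taken (for illustration only). */
enum sketch_path { SKETCH_PATH_NONE, SKETCH_PATH_HOLDER, SKETCH_PATH_GENERIC };
static enum sketch_path sketch_last_path = SKETCH_PATH_NONE;

/* Hypothetical, reduced model of the pagemap ops table. */
struct sketch_pagemap_ops {
	int (*memory_failure)(uint64_t addr, uint64_t len, int mf_flags);
};

struct sketch_pagemap {
	const struct sketch_pagemap_ops *ops;
};

/* "fsdax case": the driver would forward the range to the dax holder. */
static int sketch_fsdax_memory_failure(uint64_t addr, uint64_t len,
				       int mf_flags)
{
	(void)addr; (void)len; (void)mf_flags;
	sketch_last_path = SKETCH_PATH_HOLDER;
	return 0;
}

/* "normal case": stand-in for the generic kill-procs path. */
static int sketch_generic_kill_procs(uint64_t addr, int mf_flags)
{
	(void)addr; (void)mf_flags;
	sketch_last_path = SKETCH_PATH_GENERIC;
	return 0;
}

/* Use the driver callback when one is registered, else the generic path. */
static int sketch_memory_failure(const struct sketch_pagemap *pgmap,
				 uint64_t addr, uint64_t len, int mf_flags)
{
	if (pgmap && pgmap->ops && pgmap->ops->memory_failure)
		return pgmap->ops->memory_failure(addr, len, mf_flags);
	return sketch_generic_kill_procs(addr, mf_flags);
}
```
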
--
Thanks,
Ruan.