On 10.01.25 07:00, Alistair Popple wrote:
Zone device pages are used to represent various type of device memory
managed by device drivers. Currently compound zone device pages are
not supported. This is because MEMORY_DEVICE_FS_DAX pages are the only
user of higher order zone device pages and have their own page
reference counting.
A future change will unify FS DAX reference counting with normal page
reference counting rules and remove the special FS DAX reference
counting. Supporting that requires compound zone device pages.
Supporting compound zone device pages requires compound_head() to
distinguish between head and tail pages whilst still preserving the
special struct page fields that are specific to zone device pages.
A tail page is distinguished by having bit zero being set in
page->compound_head, with the remaining bits pointing to the head
page. For zone device pages page->compound_head is shared with
page->pgmap.
The page->pgmap field is common to all pages within a memory section.
Therefore pgmap is the same for both head and tail pages and can be
moved into the folio and we can use the standard scheme to find
compound_head from a tail page.
The more relevant thing is that the pgmap field must be common to all
pages in a folio, even if a folio exceeds memory sections (e.g., 128 MiB
on x86_64 where we have 1 GiB folios).
> > Signed-off-by: Alistair Popple <apopple@xxxxxxxxxx>
Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx>
---
Changes for v4:
- Fix build breakages reported by kernel test robot
Changes since v2:
- Indentation fix
- Rename page_dev_pagemap() to page_pgmap()
- Rename folio _unused field to _unused_pgmap_compound_head
- s/WARN_ON/VM_WARN_ON_ONCE_PAGE/
Changes since v1:
- Move pgmap to the folio as suggested by Matthew Wilcox
---
[...]
static inline bool folio_is_device_coherent(const struct folio *folio)
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 29919fa..61899ec 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -205,8 +205,8 @@ struct migrate_vma {
unsigned long end;
/*
- * Set to the owner value also stored in page->pgmap->owner for
- * migrating out of device private memory. The flags also need to
+ * Set to the owner value also stored in page_pgmap(page)->owner
+ * for migrating out of device private memory. The flags also need to
* be set to MIGRATE_VMA_SELECT_DEVICE_PRIVATE.
* The caller should always set this field when using mmu notifier
* callbacks to avoid device MMU invalidations for device private
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index df8f515..54b59b8 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -129,8 +129,11 @@ struct page {
unsigned long compound_head; /* Bit zero is set */
};
struct { /* ZONE_DEVICE pages */
- /** @pgmap: Points to the hosting device page map. */
- struct dev_pagemap *pgmap;
+ /*
+ * The first word is used for compound_head or folio
+ * pgmap
+ */
+ void *_unused_pgmap_compound_head;
void *zone_device_data;
/*
* ZONE_DEVICE private pages are counted as being
@@ -299,6 +302,7 @@ typedef struct {
* @_refcount: Do not access this member directly. Use folio_ref_count()
* to find how many references there are to this folio.
* @memcg_data: Memory Control Group data.
+ * @pgmap: Metadata for ZONE_DEVICE mappings
* @virtual: Virtual address in the kernel direct map.
* @_last_cpupid: IDs of last CPU and last process that accessed the folio.
* @_entire_mapcount: Do not use directly, call folio_entire_mapcount().
@@ -337,6 +341,7 @@ struct folio {
/* private: */
};
/* public: */
+ struct dev_pagemap *pgmap;
Agreed, that should work.
Acked-by: David Hildenbrand <david@xxxxxxxxxx>
--
Cheers,
David / dhildenb