Re: [PATCH 1/2] shmem: Support for registration of driver/file owner specific ops

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 11/10/2016 11:06 AM, Hugh Dickins wrote:
On Fri, 4 Nov 2016, akash.goel@xxxxxxxxx wrote:
From: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>

This provides support for the drivers or shmem file owners to register
a set of callbacks, which can be invoked from the address space
operations methods implemented by shmem.  This allow the file owners to
hook into the shmem address space operations to do some extra/custom
operations in addition to the default ones.

The private_data field of address_space struct is used to store the
pointer to driver specific ops.  Currently only one ops field is defined,
which is migratepage, but can be extended on an as-needed basis.

The need for driver specific operations arises since some of the
operations (like migratepage) may not be handled completely within shmem,
so as to be effective, and would need some driver specific handling also.
Specifically, i915.ko would like to participate in migratepage().
i915.ko uses shmemfs to provide swappable backing storage for its user
objects, but when those objects are in use by the GPU it must pin the
entire object until the GPU is idle.  As a result, large chunks of memory
can be arbitrarily withdrawn from page migration, resulting in premature
out-of-memory due to fragmentation.  However, if i915.ko can receive the
migratepage() request, it can then flush the object from the GPU, remove
its pin and thus enable the migration.

Since gfx allocations are one of the major consumer of system memory, its
imperative to have such a mechanism to effectively deal with
fragmentation.  And therefore the need for such a provision for initiating
driver specific actions during address space operations.

Thank you for persisting with this, and sorry for all my delay.


v2:
- Drop dev_ prefix from the members of shmem_dev_info structure. (Joonas)
- Change the return type of shmem_set_device_op() to void and remove the
  check for pre-existing data. (Joonas)
- Rename shmem_set_device_op() to shmem_set_dev_info() to be consistent
  with shmem_dev_info structure. (Joonas)

Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: linux-mm@xxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxx
Signed-off-by: Sourab Gupta <sourab.gupta@xxxxxxxxx>
Signed-off-by: Akash Goel <akash.goel@xxxxxxxxx>
Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>

That doesn't seem quite right: the From line above implies that Chris
wrote it, and should be first Signer; but perhaps the From line is wrong.

Chris only wrote this patch initially, will do the required correction.

---
 include/linux/shmem_fs.h | 13 +++++++++++++
 mm/shmem.c               | 17 ++++++++++++++++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index ff078e7..454c3ba 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -39,11 +39,24 @@ struct shmem_sb_info {
 	unsigned long shrinklist_len; /* Length of shrinklist */
 };

+struct shmem_dev_info {
+	void *private_data;
+	int (*migratepage)(struct address_space *mapping,
+			   struct page *newpage, struct page *page,
+			   enum migrate_mode mode, void *dev_priv_data);

Aren't the private_data field and dev_priv_data arg a little bit
confusing and redundant?  Can't the migratepage() deduce dev_priv
for itself from mapping->private_data (perhaps wrapped by a
shmem_get_dev_info()), by using container_of()?

Yes looks like migratepage() can deduce dev_priv from mapping->private_data.
Can we keep the private_data as a placeholder ?. Will s/dev_priv_data/private_data/.

As per your suggestion, in the other patch, object pointer can be derived from mapping->private_data (container_of) and dev_priv in turn can be derived from object pointer.

+};
+
 static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
 {
 	return container_of(inode, struct shmem_inode_info, vfs_inode);
 }

+static inline void shmem_set_dev_info(struct address_space *mapping,
+				      struct shmem_dev_info *info)
+{
+	mapping->private_data = info;

Nit: if this stays as is, I'd prefer dev_info there and above,
since shmem.c uses info all over for its shmem_inode_info pointer.
But in second patch I suggest obj_info may be better than dev_info.

Fine will s/info/dev_info.

Best regards
Akash

+}
+
 /*
  * Functions in mm/shmem.c called directly from elsewhere:
  */
diff --git a/mm/shmem.c b/mm/shmem.c
index ad7813d..fce8de3 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1290,6 +1290,21 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 	return 0;
 }

+#ifdef CONFIG_MIGRATION
+static int shmem_migratepage(struct address_space *mapping,
+			     struct page *newpage, struct page *page,
+			     enum migrate_mode mode)
+{
+	struct shmem_dev_info *dev_info = mapping->private_data;
+
+	if (dev_info && dev_info->migratepage)
+		return dev_info->migratepage(mapping, newpage, page,
+					     mode, dev_info->private_data);
+
+	return migrate_page(mapping, newpage, page, mode);
+}
+#endif
+
 #if defined(CONFIG_NUMA) && defined(CONFIG_TMPFS)
 static void shmem_show_mpol(struct seq_file *seq, struct mempolicy *mpol)
 {
@@ -3654,7 +3669,7 @@ static void shmem_destroy_inodecache(void)
 	.write_end	= shmem_write_end,
 #endif
 #ifdef CONFIG_MIGRATION
-	.migratepage	= migrate_page,
+	.migratepage	= shmem_migratepage,
 #endif
 	.error_remove_page = generic_error_remove_page,
 };
--
1.9.2

I didn't like this very much; but every time I tried to "improve" it,
found good reasons why you chose the way you did (modularity of i915,
constness of a_ops, reluctance to copy and modify a_ops, reluctance
to export those shmem methods separately).

I think perhaps later we just add a gem_ops pointer to shmem_inode_info,
for i915 or other gems to fill in as they wish (and shmem divert off to
them if set, as you've done); but for now you're trying to avoid
enlarging the shmem inode, okay.

Hugh


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]