On Fri, 16 Oct 2015 15:08:29 -0700 Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:

> When performing a fallocate hole punch, set up a hugetlb_falloc struct
> and make i_private point to it.  i_private will point to this struct for
> the duration of the operation.  At the end of the operation, wake up
> anyone who faulted on the hole and is on the waitq.
>
> ...
>
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -507,7 +507,9 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
>  {
>  	struct hstate *h = hstate_inode(inode);
>  	loff_t hpage_size = huge_page_size(h);
> +	unsigned long hpage_shift = huge_page_shift(h);
>  	loff_t hole_start, hole_end;
> +	struct hugetlb_falloc hugetlb_falloc;
>
>  	/*
>  	 * For hole punch round up the beginning offset of the hole and
> @@ -518,8 +520,23 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
>
>  	if (hole_end > hole_start) {
>  		struct address_space *mapping = inode->i_mapping;
> +		DECLARE_WAIT_QUEUE_HEAD_ONSTACK(hugetlb_falloc_waitq);
> +
> +		/*
> +		 * Page faults on the area to be hole punched must be
> +		 * stopped during the operation.  Initialize struct and
> +		 * have inode->i_private point to it.
> +		 */
> +		hugetlb_falloc.waitq = &hugetlb_falloc_waitq;
> +		hugetlb_falloc.start = hole_start >> hpage_shift;
> +		hugetlb_falloc.end = hole_end >> hpage_shift;

This is a bit neater:

--- a/fs/hugetlbfs/inode.c~mm-hugetlb-setup-hugetlb_falloc-during-fallocate-hole-punch-fix
+++ a/fs/hugetlbfs/inode.c
@@ -509,7 +509,6 @@ static long hugetlbfs_punch_hole(struct
 	loff_t hpage_size = huge_page_size(h);
 	unsigned long hpage_shift = huge_page_shift(h);
 	loff_t hole_start, hole_end;
-	struct hugetlb_falloc hugetlb_falloc;

 	/*
 	 * For hole punch round up the beginning offset of the hole and
@@ -521,15 +520,16 @@ static long hugetlbfs_punch_hole(struct
 	if (hole_end > hole_start) {
 		struct address_space *mapping = inode->i_mapping;
 		DECLARE_WAIT_QUEUE_HEAD_ONSTACK(hugetlb_falloc_waitq);
-
 		/*
-		 * Page faults on the area to be hole punched must be
-		 * stopped during the operation.  Initialize struct and
-		 * have inode->i_private point to it.
+		 * Page faults on the area to be hole punched must be stopped
+		 * during the operation.  Initialize struct and have
+		 * inode->i_private point to it.
 		 */
-		hugetlb_falloc.waitq = &hugetlb_falloc_waitq;
-		hugetlb_falloc.start = hole_start >> hpage_shift;
-		hugetlb_falloc.end = hole_end >> hpage_shift;
+		struct hugetlb_falloc hugetlb_falloc = {
+			.waitq = &hugetlb_falloc_waitq,
+			.start = hole_start >> hpage_shift,
+			.end = hole_end >> hpage_shift
+		};

 		mutex_lock(&inode->i_mutex);

>  		mutex_lock(&inode->i_mutex);
> +
> +		spin_lock(&inode->i_lock);
> +		inode->i_private = &hugetlb_falloc;
> +		spin_unlock(&inode->i_lock);

Locking around a single atomic assignment is a bit peculiar.  I can kinda
see that it kinda protects the logic in hugetlb_fault(), but I would like
to hear (in comment form) your description of how this logic works?

>  		i_mmap_lock_write(mapping);
>  		if (!RB_EMPTY_ROOT(&mapping->i_mmap))
>  			hugetlb_vmdelete_list(&mapping->i_mmap,
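
To make the question concrete, here is a rough sketch of how I imagine the
hugetlb_fault() side consumes the structure published through i_private.
This is not quoted from the series: the helper name
hugetlb_fault_should_wait() is made up, and the pgoff_t field types are
guesses based on the hpage_shift shifts above; only the waitq/start/end
field names come from the quoted hunk.

	/*
	 * Assumed layout, inferred from the assignments in the quoted hunk;
	 * the real definition lives elsewhere in the series.
	 */
	struct hugetlb_falloc {
		wait_queue_head_t *waitq;	/* woken when the punch finishes */
		pgoff_t start;			/* first huge page index being punched */
		pgoff_t end;			/* one past the last punched index */
	};

	/*
	 * Hypothetical fault-side check, called with a caller-provided
	 * wait_queue_t.  i_private is re-read under i_lock so the check
	 * cannot race with hugetlbfs_punch_hole() publishing or clearing
	 * the pointer, and the faulting task queues itself on the on-stack
	 * waitq while still holding i_lock, so it cannot miss the final
	 * wakeup and sleep on a waitq that has already gone out of scope.
	 */
	static bool hugetlb_fault_should_wait(struct inode *inode, pgoff_t idx,
					      wait_queue_t *wait)
	{
		struct hugetlb_falloc *hf;
		bool need_wait = false;

		spin_lock(&inode->i_lock);
		hf = inode->i_private;
		if (hf && idx >= hf->start && idx < hf->end) {
			prepare_to_wait(hf->waitq, wait, TASK_UNINTERRUPTIBLE);
			need_wait = true;
		}
		spin_unlock(&inode->i_lock);

		/* caller drops its locks, schedule()s, then retries the fault */
		return need_wait;
	}

If that is roughly the shape of it, a comment along those lines next to the
spin_lock() above would answer my question.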