Re: FAILED: patch "[PATCH] hugetlbfs: fix races and page leaks during migration" failed to apply to 4.14-stable tree

On Mon, Mar 04, 2019 at 03:51:31PM -0800, Mike Kravetz wrote:
> On 3/2/19 12:12 AM, gregkh@xxxxxxxxxxxxxxxxxxx wrote:
> > 
> > The patch below does not apply to the 4.14-stable tree.
> > If someone wants it applied there, or to any other stable or longterm
> > tree, then please email the backport, including the original git commit
> > id to <stable@xxxxxxxxxxxxxxx>.
> 
> From: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> Date: Mon, 4 Mar 2019 15:36:59 -0800
> Subject: [PATCH] hugetlbfs: fix races and page leaks during migration
> 
> commit cb6acd01e2e43fd8bad11155752b7699c3d0fb76 upstream.
> 
> hugetlb pages should only be migrated if they are 'active'.  The routines
> set/clear_page_huge_active() modify the active state of hugetlb pages.
> When a new hugetlb page is allocated at fault time, set_page_huge_active
> is called before the page is locked.  Therefore, another thread could
> race and migrate the page while it is being added to the page table by
> fault code.  This race is somewhat hard to trigger, but can be seen by
> strategically adding udelay to simulate worst case scheduling behavior.
> Depending on 'how' the code races, various BUG()s could be triggered.
> 
> To address this issue, simply delay the set_page_huge_active call until
> after the page is successfully added to the page table.
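
To make the reordering concrete, here is a rough before/after sketch of the fault path (function names from the kernel, but hunk placement approximate; see the upstream commit for the exact diff):

```c
/* before: the page is marked active (and therefore migratable)
 * before it is locked and installed, so a concurrent migration
 * can grab it mid-fault */
page = alloc_huge_page(...);
set_page_huge_active(page);
...
/* lock page, install PTE */

/* after: the page only becomes migratable once it is fully
 * set up in the page table */
page = alloc_huge_page(...);
...
/* lock page, install PTE */
set_page_huge_active(page);
```
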
> 
> Hugetlb pages can also be leaked at migration time if the pages are
> associated with a file in an explicitly mounted hugetlbfs filesystem.
> For example, consider a two node system with 4GB worth of huge pages
> available.  A program mmaps a 2G file in a hugetlbfs filesystem.  It
> then migrates the pages associated with the file from one node to
> another.  When the program exits, huge page counts are as follows:
> 
> node0
> 1024    free_hugepages
> 1024    nr_hugepages
> 
> node1
> 0       free_hugepages
> 1024    nr_hugepages
> 
> Filesystem                         Size  Used Avail Use% Mounted on
> nodev                              4.0G  2.0G  2.0G  50% /var/opt/hugepool
> 
> That is as expected.  2G of huge pages are taken from the free_hugepages
> counts, and 2G is the size of the file in the explicitly mounted
> filesystem.  If the file is then removed, the counts become:
> 
> node0
> 1024    free_hugepages
> 1024    nr_hugepages
> 
> node1
> 1024    free_hugepages
> 1024    nr_hugepages
> 
> Filesystem                         Size  Used Avail Use% Mounted on
> nodev                              4.0G  2.0G  2.0G  50% /var/opt/hugepool
> 
> Note that the filesystem still shows 2G of pages used, while there
> actually are no huge pages in use.  The only way to 'fix' the
> filesystem accounting is to unmount the filesystem.
> 
> If a hugetlb page is associated with an explicitly mounted filesystem,
> this information is contained in the page_private field.  At migration
> time, this information is not preserved.  To fix, simply transfer
> page_private from old to new page at migration time if necessary.
> 
> There is a related race with removing a huge page from a file and
> migration.  When a huge page is removed from the pagecache, the
> page_mapping() field is cleared, yet page_private remains set until the
> page is actually freed by free_huge_page().  A page could be migrated
> while in this state.  However, since page_mapping() is not set, the
> hugetlbfs-specific routine to transfer page_private is not called, and
> we leak the page count in the filesystem.  To fix, check for this
> condition before migrating a huge page.  If the condition is detected,
> return -EBUSY for the page.
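
That check amounts to a small guard early in the huge-page migration path; a sketch of its shape in unmap_and_move_huge_page() in mm/migrate.c, reconstructed from the commit description (exact placement and label name approximate):

```c
/*
 * page_private still carries hugetlbfs subpool state until
 * free_huge_page() runs.  If the page has already been removed
 * from the page cache (page_mapping() == NULL) but still holds
 * that state, migrating it would leak the filesystem accounting,
 * so refuse the page.
 */
if (page_private(hpage) && !page_mapping(hpage)) {
	rc = -EBUSY;
	goto out;	/* label name approximate */
}
```
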
> 
> Cc: <stable@xxxxxxxxxxxxxxx>
> Fixes: bcc54222309c ("mm: hugetlb: introduce page_huge_active")
> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> ---
>  fs/hugetlbfs/inode.c | 12 ++++++++++++
>  mm/hugetlb.c         | 16 +++++++++++++---
>  mm/migrate.c         | 11 +++++++++++
>  3 files changed, 36 insertions(+), 3 deletions(-)

Thanks for all 4 of these, now queued up.

greg k-h


