Re: [LSF/MM/BPF TOPIC] HGM for hugetlbfs

On 05/25/23 20:00, David Rientjes wrote:
> On Wed, 24 May 2023, James Houghton wrote:
> 
> > Hi everyone,
> > 
> > If you came to the HGM session at LSF/MM/BPF, thank you!
> 
> Thank you, James, for putting together such a detailed discussion and 
> soliciting some great feedback.
> 
> > I want to
> > address some of the feedback I got and restate the importance of HGM,
> > especially as it relates to handling memory poison.
> > 
> 
> Thanks for bringing this up, I think it's a very important use case.  
> Adding in Naoya Horiguchi and Miaohe Lin as well.
> 
> > ## Memory poison is a problem
> > 
> > HGM allows us to unmap poison at 4K instead of unmapping the entire
> > hugetlb page. For applications that use HugeTLB, losing the entire
> > hugepage can be catastrophic. For example, if a hypervisor is using 1G
> > pages for guest memory, the VM will lose 1G of its physical address
> > space, which is catastrophic (even 2M will most likely kill the VM).
> > If we can limit the poisoning to only 4K, the VM will most likely be
> > able to recover. This improved recoverability applies to other HugeTLB
> > users as well, like databases.
> > 
> 
> Mike, do you have feedback on how useful this would be, especially for use 
> cases beyond what cloud providers would find helpful?
> 

Sorry for the delay, I was out on holiday.

The benefit of HGM in the case of memory errors is fairly obvious.  As
mentioned above, when a memory error is encountered on a hugetlb page,
that entire hugetlb page becomes inaccessible to the application.  Losing
1G or even 2M of data is often catastrophic for an application.  There
is often no way to recover.  It just makes sense that recovering from
the loss of 4K of data would generally be easier and more likely to be
possible.  Today, when Oracle DB encounters a hard memory error on a
hugetlb page, it will shut down.  Plans are currently in place to repair
and recover from such errors if possible.  Isolating the area of data loss
to a single 4K page significantly increases the likelihood of repair and
recovery.

Today, when a memory error is encountered on a hugetlb page, an
application is 'notified' of the error by a SIGBUS, which also reports
the virtual address of the hugetlb page and its size.  This makes sense as
hugetlb pages are accessed by a single page table entry, so you get all
or nothing.  As mentioned by James above, this is catastrophic for VMs
as the hypervisor has just been told that 2M or 1G is now inaccessible.
With HGM, we can isolate such errors to 4K.
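
To make the notification above concrete, here is a minimal sketch of how
an application might decode it.  It assumes the usual Linux siginfo
fields for memory-error SIGBUS (si_code of BUS_MCEERR_AR or
BUS_MCEERR_AO, si_addr, and si_addr_lsb as log2 of the affected
granule).  The names lost_range, lost_range_from, sigbus_handler and
install_sigbus_handler are made up for illustration and are not taken
from any existing code.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the virtual range reported as lost. */
struct lost_range {
	uintptr_t start;
	size_t len;
};

/*
 * lsb is log2 of the granule the kernel reports: 12 for 4K,
 * 21 for 2M, 30 for 1G.  Round addr down to that granule.
 */
struct lost_range lost_range_from(uintptr_t addr, unsigned int lsb)
{
	struct lost_range r;

	r.len = (size_t)1 << lsb;
	r.start = addr & ~((uintptr_t)r.len - 1);
	return r;
}

static volatile sig_atomic_t got_mce;
static struct lost_range last_lost;

/* Only record the range here; recovery belongs outside the handler. */
void sigbus_handler(int sig, siginfo_t *si, void *ctx)
{
	(void)sig;
	(void)ctx;
	if (si->si_code == BUS_MCEERR_AR || si->si_code == BUS_MCEERR_AO) {
		last_lost = lost_range_from((uintptr_t)si->si_addr,
					    (unsigned int)si->si_addr_lsb);
		got_mce = 1;
	}
}

/* Returns 0 on success, -1 on failure (as sigaction() does). */
int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}
```

With a 1G hugetlb page, the reported shift is expected to be 30, so the
range covers the whole 1G.  If errors could be isolated to 4K, the same
decode with a shift of 12 would give a single 4K page.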

Backing VMs with hugetlb pages is a real use case today.  We are seeing
memory errors on such hugetlb pages with the result being VM failures.
One of the advantages of backing VMs with THPs is that they are split in
the case of memory errors.  HGM would allow similar functionality.
-- 
Mike Kravetz



