Re: [Lsf-pc] [LSF/MM ATTEND] Huge Page Futures

On Wed, Jan 27, 2016 at 09:49:57AM -0800, Mike Kravetz wrote:
> On 01/25/2016 05:50 AM, Mike Kravetz wrote:
> >> Do you have any thoughts how it's going to be implemented? It would be
> >> nice to have some design overview or better proof-of-concept patch before
> >> the summit to be able analyze implications for the kernel.
> >>
> > 
> > Good to know the hugetlbfs implementation is considered a hack.  I just
> > started looking at this, and was going to use hugetlbfs as a starting
> > point.  I'll reconsider that decision.
> 
> Kirill, can you (or others) explain your reasons for saying the hugetlbfs
> implementation is an ugly hack?  I do not have enough history/experience
> with this to say what is most offensive.  I would be happy to start by
> cleaning up issues with the current implementation.
> 

Historically, it was considered a hack because it had special handling in
a number of paths in the VM. Of course THP also has similar handling now
so it's less of a concern but there are differences that cause base pages,
transparent hugepages and hugetlbfs pages to all be special cases. That
does not sit comfortably with everyone.

For a long time, it was considered ugly because a fault on private child
mappings was so unreliable and a fork could cause a parent to unexpectedly
fail a fault and die. These days it's different as only the child can die
so while it's less of a concern, hugetlbfs pages allow a child to be killed
if enough huge pages are not available.
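A minimal sketch of that fork scenario, for illustration only. It assumes
a Linux system with a 2 MiB default huge page size and a populated huge
page pool; the function name child_cow_outcome() is hypothetical. The
parent maps and touches one private huge page, forks, and the child then
writes to the page, which needs a second huge page for the copy. If none
is available the child is expected to be killed by a signal (typically
SIGBUS, though this depends on the kernel version).

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: the default huge page size is 2 MiB (common on x86-64). */
#define HUGE_LEN (2UL * 1024 * 1024)

/*
 * Returns -1 if the huge page mapping (or the fork) could not be set up,
 *          0 if the child completed its write to the private mapping,
 *          or the signal number if the child was killed on the fault.
 */
int child_cow_outcome(void)
{
    int status;
    pid_t pid;
    char *p = mmap(NULL, HUGE_LEN, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    if (p == MAP_FAILED)
        return -1;      /* no huge pages reserved, or size mismatch */

    p[0] = 1;           /* parent faults the page in, using its reservation */

    pid = fork();
    if (pid < 0) {
        munmap(p, HUGE_LEN);
        return -1;
    }
    if (pid == 0) {
        p[0] = 2;       /* copy-on-write fault: needs another huge page */
        _exit(0);
    }

    if (waitpid(pid, &status, 0) < 0) {
        munmap(p, HUGE_LEN);
        return -1;
    }
    munmap(p, HUGE_LEN);

    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}
```

With exactly one free huge page in the pool, the parent's write succeeds
and the child's write is the one at risk, which matches the "only the
child can die" behaviour described above.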

It was also considered ugly because application-awareness was required in
so many cases. Granted, libhugetlbfs can hide some of that ugliness but
even that was considered hacky.

That hugetlbfs pages cannot be swapped, even without mlock, is another
thing that makes them different to the rest of the VM. hugetlbfs also has
its own reservation scheme that is different to everything else.
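The reservation state is visible from userspace through the HugePages_*
counters in /proc/meminfo. The following is a small sketch of reading
them; the helper names meminfo_field() and read_meminfo() are made up for
this example and are not part of any library.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up a "Key:   value" line in /proc/meminfo-style text.
 * Returns the numeric value, or -1 if the key is absent or malformed.
 */
long meminfo_field(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long val;

            if (sscanf(line + klen + 1, "%ld", &val) == 1)
                return val;
            return -1;
        }
        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/* Read /proc/meminfo into buf; returns 0 on success, -1 on failure. */
int read_meminfo(char *buf, size_t len)
{
    FILE *f = fopen("/proc/meminfo", "r");
    size_t n;

    if (!f || len == 0) {
        if (f)
            fclose(f);
        return -1;
    }
    n = fread(buf, 1, len - 1, f);
    buf[n] = '\0';
    fclose(f);
    return 0;
}
```

HugePages_Rsvd counts pages that have been promised to a mapping but not
yet faulted in, so HugePages_Free minus HugePages_Rsvd is roughly what is
left for new mappings.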

One thing that saddled it with the label to some extent was the fact that
fixing swap on it was effectively impossible because of POWER. For a long
time, once huge pages had been installed on that architecture, it was
impossible to remap them at a different size. The limitation has been
relaxed to some extent but those around long enough remember it.

So it is a bit of a hack that behaves differently to other page types.
It's fairly complex and while the semantics used to be a lot uglier than
they are now, the "ugly hack" label has stuck.

> If we do shared page tables for DAX, it makes sense that it and hugetlbfs
> should be similar (or common) if possible.
> 

It's been a long time since I looked at shared page tables so I can't
remember why but it was a difficult area. A few years were spent on it so
if shared page tables are being considered, I would make damn sure first
that they actually help on modern hardware before jumping into that hole.

-- 
Mel Gorman
SUSE Labs