On Mon, Aug 16, 2021 at 04:10:28PM +0200, David Hildenbrand wrote:
> > > > Until recently, the CPUs only had 4 1GB TLB entries. I'm sure we
> > > > still have customers using that generation of CPUs. 2MB pages perform
> > > > better than 1GB pages on the previous generation of hardware, and I
> > > > haven't seen numbers for the next generation yet.
> > >
> > > I read that somewhere else before, yet we have heavy 1 GiB page users,
> > > especially in the context of VMs and DPDK.
> >
> > I wonder if those users actually benchmarked. Or whether the memory
> > savings worked out so well for them that the loss of TLB performance
> > didn't matter.
>
> These applications are extremely performance sensitive (i.e., RT
> workloads), and "real time does not mean real fast"; it means
> predictable latency.
>
> > > I will rephrase my previous statement "hugetlbfs just doesn't raise these
> > > problems because we are special casing it all over the place already". For
> > > example, not allowing such pages to be swapped. Disallowing MADV_DONTNEED.
> > > Special hugetlbfs locking.
> >
> > Sure, that's why I want to drag this feature out of "oh this is a
> > hugetlb special case" and into "this is something Linux supports".
>
> I would have understood the move to optimize SHMEM internally - similar to
> how we seem to optimize hugetlbfs SHMEM right now internally (although
> sharing page tables for shmem can still be quite tricky).
>
> I did not follow why we have to play games with MAP_PRIVATE, having
> private anonymous pages shared between processes that don't COW,
> introducing new syscalls, etc.

It's not about SHMEM; it's about file-backed pages on regular
filesystems. I don't want to have XFS, ext4 and btrfs all with their
own implementations of ARCH_WANT_HUGE_PMD_SHARE.
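
For context on the 1GB-vs-2MB discussion above, here is a minimal
userspace sketch of how applications like DPDK or a VMM explicitly
request 2MB and 1GB hugetlb pages. It is illustrative only, not from
this thread: the map_huge() helper is hypothetical, the MAP_HUGE_*
fallback definitions just mirror the uapi encoding
(log2(page size) << MAP_HUGE_SHIFT), and it assumes huge pages of both
sizes have already been reserved under /sys/kernel/mm/hugepages/.

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT	26
#endif
#ifndef MAP_HUGE_2MB
#define MAP_HUGE_2MB	(21 << MAP_HUGE_SHIFT)	/* log2(2MB) << shift */
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB	(30 << MAP_HUGE_SHIFT)	/* log2(1GB) << shift */
#endif

/* Map 'len' bytes of anonymous memory backed by hugetlb pages of the
 * size selected by 'size_flag' (MAP_HUGE_2MB or MAP_HUGE_1GB). */
static void *map_huge(size_t len, int size_flag)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | size_flag,
		       -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

int main(void)
{
	/* Needs pre-reserved pages, e.g. via
	 * /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages and
	 * /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages. */
	void *two_mb = map_huge(2UL << 20, MAP_HUGE_2MB);
	void *one_gb = map_huge(1UL << 30, MAP_HUGE_1GB);

	printf("2MB mapping: %p\n1GB mapping: %p\n", two_mb, one_gb);
	return 0;
}

Whether the 1GB mapping is actually a win is exactly the TLB question
raised above: fewer, larger entries only help if the hardware has
enough of them.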
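
Similarly, the "Disallowing MADV_DONTNEED" special case can be observed
from userspace with a small probe like the sketch below. The outcome is
kernel-version dependent: kernels of that era reject MADV_DONTNEED on
hugetlb mappings with EINVAL, while later kernels accept it.

#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 2UL << 20;	/* one 2MB hugetlb page */
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap(MAP_HUGETLB)");	/* likely no pages reserved */
		return 1;
	}
	p[0] = 1;	/* fault the huge page in */

	if (madvise(p, len, MADV_DONTNEED))
		printf("MADV_DONTNEED rejected: %s\n", strerror(errno));
	else
		printf("MADV_DONTNEED accepted\n");
	return 0;
}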