> On 23.12.2020 at 13:12, Liang Li <liliang324@xxxxxxxxx> wrote:
>
> On Wed, Dec 23, 2020 at 4:41 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>>
>> [...]
>>
>>>> I was rather saying that for security it's of little use IMHO.
>>>> Application/VM start up time might be improved by using huge pages (and
>>>> pre-zeroing these). Free page reporting might be improved by using
>>>> MADV_FREE instead of MADV_DONTNEED in the hypervisor.
>>>>
>>>>> this feature, above all of them, which one is likely to become the
>>>>> most strong one? From the implementation, you will find it is
>>>>> configurable, users don't want to use it can turn it off. This is not
>>>>> an option?
>>>>
>>>> Well, we have to maintain the feature and sacrifice a page flag. For
>>>> example, do we expect someone explicitly enabling the feature just to
>>>> speed up startup time of an app that consumes a lot of memory? I highly
>>>> doubt it.
>>>
>>> In our production environment, there are three main applications have such
>>> requirement, one is QEMU [creating a VM with SR-IOV passthrough device],
>>> the other two are DPDK related applications, DPDK OVS and SPDK vhost,
>>> for best performance, they populate memory when starting up. For SPDK vhost,
>>> we make use of the VHOST_USER_GET/SET_INFLIGHT_FD feature for
>>> vhost 'live' upgrade, which is done by killing the old process and
>>> starting a new one with the new binary. In this case, we want the new
>>> process started as quick as possible to shorten the service downtime.
>>> We really enable this feature to speed up startup time for them :)

Am I wrong or does using hugetlbfs/tmpfs ... i.e., a file not deleted between shutting down the old instance and firing up the new instance just solve this issue?

>>
>> Thanks for info on the use case!
>>
>> All of these use cases either already use, or could use, huge pages
>> IMHO. It's not your ordinary proprietary gaming app :) This is where
>> pre-zeroing of huge pages could already help.
>
> You are welcome. For some historical reason, some of our services are
> not using hugetlbfs, that is why I didn't start with hugetlbfs.
>
>> Just wondering, wouldn't it be possible to use tmpfs/hugetlbfs ...
>> creating a file and pre-zeroing it from another process, or am I missing
>> something important? At least for QEMU this should work AFAIK, where you
>> can just pass the file to be used using memory-backend-file.
>>
> If using another process to create a file, we can offload the overhead to
> another process, and there is no need to pre-zero its content, just
> populating the memory is enough.

Right, if non-zero memory can be tolerated (e.g., for VMs it usually has to be).

> If we do it that way, then how to determine the size of the file? it depends
> on the RAM size of the VM the customer buys.
> Maybe we can create a file
> large enough in advance and truncate it to the right size just before the
> VM is created. Then, how many large files should be created on a host?

That's mostly already existing scheduling logic, no? (How many VMs can I put onto a specific machine eventually)

> You will find there are a lot of things that have to be handled properly.
> I think it's possible to make it work well, but we will transfer the
> management complexity to up layer components. It's a bad practice to let
> upper layer components process such low level details which should be
> handled in the OS layer.

It's bad practice to squeeze things into the kernel that can just be handled on upper layers ;)
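
To make the tmpfs/hugetlbfs idea above a bit more concrete, here is a rough sketch of what a helper process might do: create (or reuse) a backing file, truncate it to the VM's RAM size, and have its pages allocated up front without rewriting the contents. The function name `prepare_backing_file` and the paths are made up for illustration; the sketch assumes Linux and that the path lives on a tmpfs or hugetlbfs mount (on hugetlbfs the size would have to be a multiple of the huge page size). It is not what any of the applications mentioned in this thread actually do.

```python
import os


def prepare_backing_file(path: str, size: int) -> bool:
    """Create or reuse a backing file of `size` bytes with its pages allocated.

    Hypothetical helper: `path` is assumed to be on tmpfs/hugetlbfs.
    Returns True if an existing file of the right size was reused.
    """
    reused = os.path.exists(path) and os.path.getsize(path) == size
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if not reused:
            # Grow or shrink to the requested size ("truncate it to the
            # right size just before the VM is created").
            os.ftruncate(fd, size)
        # Ask the filesystem to allocate backing storage for the whole
        # range. This does not overwrite data that is already there, so a
        # file kept around between two instances keeps its contents.
        os.posix_fallocate(fd, 0, size)
    finally:
        os.close(fd)
    return reused
```

A management layer could call something like this ahead of time and then hand the resulting path to QEMU as the `mem-path` of a memory-backend-file object, or let the new vhost process map the file that the old process left behind.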