Re: [LSF/MM/BPF TOPIC] BoF VM live migration over CXL memory

On 12.04.23 17:26, Gregory Price wrote:
On Wed, Apr 12, 2023 at 10:38:04AM +0200, David Hildenbrand wrote:
On 12.04.23 04:54, Huang, Ying wrote:
Gregory Price <gregory.price@xxxxxxxxxxxx> writes:

On Tue, Apr 11, 2023 at 02:37:50PM +0800, Huang, Ying wrote:
Gregory Price <gregory.price@xxxxxxxxxxxx> writes:

[snip]

2. During the migration process, the memory needs to be forced not to be
     migrated to another node by other means (tiering software, swap,
     etc).  The obvious way of doing this would be to migrate and
     temporarily pin the page... but going back to problem #1 we see that
     ZONE_MOVABLE and Pinning are mutually exclusive.  So that's
     troublesome.

Can we use memory policy (cpusets, mbind(), set_mempolicy(), etc.) to
avoid moving pages out of the CXL.mem node?  There are gaps in tiering
now, but I think they are fixable.

Best Regards,
Huang, Ying
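
For readers unfamiliar with the memory policy interface mentioned above, here is a minimal sketch of binding an address range to one NUMA node with mbind(). It assumes the CXL.mem region is exposed as an ordinary NUMA node whose id the caller already knows. The sketch_* names and macros are hypothetical helpers for illustration; the constant values are copied from the kernel UAPI header <linux/mempolicy.h> so that libnuma is not required. Note that this only expresses a placement policy. It is not a pin, and whether tiering honors it is exactly the gap referred to above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values copied from the kernel UAPI header <linux/mempolicy.h>, so the
 * sketch does not depend on libnuma's <numaif.h>. */
#define SKETCH_MPOL_BIND      2UL
#define SKETCH_MPOL_MF_STRICT (1UL << 0)
#define SKETCH_MPOL_MF_MOVE   (1UL << 1)

/* This sketch uses a single-word nodemask, so it only covers low node ids. */
#define SKETCH_MASK_BITS (8 * sizeof(unsigned long))

/* Build a single-word nodemask with only `node` set. */
unsigned long sketch_node_mask(int node)
{
    assert(node >= 0 && (size_t)node < SKETCH_MASK_BITS);
    return 1UL << node;
}

/*
 * Bind [addr, addr + len) to `node`.  MPOL_MF_MOVE asks the kernel to
 * migrate pages of this process that already sit on other nodes, and
 * MPOL_MF_STRICT turns any page that could not be moved into an error.
 * `addr` must be page aligned.  Returns 0 on success, -1 with errno set.
 */
long sketch_bind_range(void *addr, size_t len, int node)
{
    unsigned long mask = sketch_node_mask(node);

    /* maxnode is passed as "bits + 1", following the libnuma convention. */
    return syscall(SYS_mbind, addr, len, SKETCH_MPOL_BIND, &mask,
                   (unsigned long)(SKETCH_MASK_BITS + 1),
                   SKETCH_MPOL_MF_MOVE | SKETCH_MPOL_MF_STRICT);
}
```

A caller would mmap() the guest memory range and then call sketch_bind_range(addr, len, cxl_node), where cxl_node is whatever node id the platform assigned to the CXL.mem region.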

[snip]

That feels like a hack/bodge rather than a proper solution to me.

Maybe this is an affirmative argument for the creation of an EXMEM
zone.

Let's start with requirements.  What are the requirements for a new zone
type?

I'm still scratching my head regarding this. I keep hearing all kinds of
different statements that just add more confusion: "we want it to be
hotunpluggable", "we want to allow for long-term pinning memory", "but we
still want it to be movable", "we want to place some unmovable allocations on
it". Huh?

Just to clarify: ZONE_MOVABLE allows for pinning. It just doesn't allow for
long-term pinning of memory.


I apologize for the confusion, this is my fault.  I had assumed that
since dax regions can't be pinned, subsequent nodes backed by a dax
device could not be pinned.  In testing this, this is not the case.

Re: long-term pinning, can you be more explicit as to what is considered
long-term?  Minutes? Hours? Days?

long-term: possibly forever, controlled by user space. In practice, anything longer than ~10 seconds (best guess :) ). There can be long-term pinnings that are of very short duration; we just don't know what user space is up to and when it will decide to unpin.

Assume user space triggers a read/write of a user space page to a file: the page is pinned, DMA is started, and once DMA completes the page is unpinned. Short-term. User space does not control how long the page remains pinned.

In contrast:

Example #1: mapping VM guest memory into an IOMMU using vfio for PCI passthrough requires pinning the pages. Until user space decides to unmap the pages from the IOMMU, the pages will remain pinned. -> long-term

Example #2: mapping a user space address range into an IOMMU to repeatedly perform RDMA using that address range requires pinning the pages. Until user space decides to unregister that range, the pages remain pinned. -> long-term

Example #3: registering a user space address range with io_uring as a fixed buffer lets io_uring ops avoid page table walks by simply using the pinned pages that were looked up once. As long as the fixed buffer remains registered, the pages stay pinned. -> long-term
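
To make Example #3 concrete, here is a minimal sketch of the register/unregister window using the raw io_uring system calls, so liburing is not required. It assumes kernel headers that ship <linux/io_uring.h> and a kernel and sandbox that allow io_uring; where they don't, the function just returns a negative errno. The name sketch_fixed_buffer_roundtrip is hypothetical. The point is only that the pin lasts from the REGISTER call until user space chooses to UNREGISTER (or closes the ring), however long that is.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/io_uring.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Register one heap buffer of `len` bytes as an io_uring fixed buffer,
 * then unregister it again.  Returns 0 on success or a negative errno.
 */
int sketch_fixed_buffer_roundtrip(size_t len)
{
    struct io_uring_params params;
    struct iovec iov;
    int ring_fd, ret = 0;

    memset(&params, 0, sizeof(params));
    ring_fd = (int)syscall(SYS_io_uring_setup, 4U, &params);
    if (ring_fd < 0)
        return -errno;

    iov.iov_base = malloc(len);
    iov.iov_len = len;
    if (!iov.iov_base) {
        close(ring_fd);
        return -ENOMEM;
    }

    /* From a successful REGISTER until UNREGISTER, the pages backing
     * the buffer stay pinned; user space alone decides how long. */
    if (syscall(SYS_io_uring_register, ring_fd,
                IORING_REGISTER_BUFFERS, &iov, 1U) < 0)
        ret = -errno;
    else if (syscall(SYS_io_uring_register, ring_fd,
                     IORING_UNREGISTER_BUFFERS, NULL, 0U) < 0)
        ret = -errno;

    free(iov.iov_base);
    close(ring_fd);
    return ret;
}
```

A real application would keep the buffer registered across many I/O submissions rather than unregistering immediately; that open-ended lifetime is what makes the pin long-term.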

--
Thanks,

David / dhildenb




