On 3/4/2025 10:29 PM, Sean Christopherson wrote: > On Tue, Mar 04, 2025, David Hildenbrand wrote: >> On 04.03.25 16:30, Sean Christopherson wrote: >>> On Tue, Mar 04, 2025, Ackerley Tng wrote: >>>> Vlastimil Babka <vbabka@xxxxxxx> writes: >>>>>> struct shared_policy should be stored on the inode rather than the file, >>>>>> since the memory policy is a property of the memory (struct inode), >>>>>> rather than a property of how the memory is used for a given VM (struct >>>>>> file). >>>>> >>>>> That makes sense. AFAICS shmem also uses inodes to store policy. >>>>> >>>>>> When the shared_policy is stored on the inode, intra-host migration [1] >>>>>> will work correctly, since the while the inode will be transferred from >>>>>> one VM (struct kvm) to another, the file (a VM's view/bindings of the >>>>>> memory) will be recreated for the new VM. >>>>>> >>>>>> I'm thinking of having a patch like this [2] to introduce inodes. >>>>> >>>>> shmem has it easier by already having inodes >>>>> >>>>>> With this, we shouldn't need to pass file pointers instead of inode >>>>>> pointers. >>>>> >>>>> Any downsides, besides more work needed? Or is it feasible to do it using >>>>> files now and convert to inodes later? >>>>> >>>>> Feels like something that must have been discussed already, but I don't >>>>> recall specifics. >>>> >>>> Here's where Sean described file vs inode: "The inode is effectively the >>>> raw underlying physical storage, while the file is the VM's view of that >>>> storage." [1]. >>>> >>>> I guess you're right that for now there is little distinction between >>>> file and inode and using file should be feasible, but I feel that this >>>> dilutes the original intent. >>> >>> Hmm, and using the file would be actively problematic at some point. One could >>> argue that NUMA policy is property of the VM accessing the memory, i.e. that two >>> VMs mapping the same guest_memfd could want different policies. But in practice, >>> that would allow for conflicting requirements, e.g. different policies in each >>> VM for the same chunk of memory, and would likely lead to surprising behavior due >>> to having to manually do mbind() for every VM/file view. >> >> I think that's the same behavior with shmem? I mean, if you have two people >> asking for different things for the same MAP_SHARE file range, surprises are >> unavoidable. > > Yeah, I was specifically thinking of the case where a secondary mapping doesn't > do mbind() at all, e.g. could end up effectively polluting guest_memfd with "bad" > allocations. Thank you for the feedback. I agree that storing the policy in the inode is the correct approach, as it aligns with shmem's behavior. I now understand that keeping the policy in file-private data could lead to surprising behavior, especially with multiple VMs mapping the same guest_memfd. The inode-based approach also makes sense from a long-term perspective, especially with upcoming restricted mapping support. I'll pick the Ackerley's patch[1] to add support for gmem inodes. With this patch, it does not seem overly complex to implement to policy storage in inodes. I'll test this approach and submit a revised patch shortly. [1] https://lore.kernel.org/all/d1940d466fc69472c8b6dda95df2e0522b2d8744.1726009989.git.ackerleytng@xxxxxxxxxx/ Thanks, Shivank