On Fri, Jul 28, 2023 at 07:14:26PM -0300, Jason Gunthorpe wrote:
> On Fri, Jul 28, 2023 at 11:31:49PM +0200, David Hildenbrand wrote:
> > * vfio triggers FOLL_PIN|FOLL_LONGTERM from a random QEMU thread.
> >   Where should we migrate that page to? Would it actually be
> >   counter-productive to migrate it to the NUMA node of the setup
> >   thread? The longterm pin will turn the page unmovable, yes, but
> >   where to migrate it to?
>
> For VFIO & KVM you actively don't get any kind of numa balancing or
> awareness. In this case qemu should probably strive to put the memory
> on the numa node of the majority of CPUs early on because it doesn't
> get another shot at it.
>
> In other cases it depends quite a lot. Eg DPDK might want its VFIO
> buffers to be NUMA'd to the node that is close to the device, not the
> CPU. Or vice versa. There is a lot of micro sensitivity here at high
> data rates. I think people today manually tune this by deliberately
> allocating the memory to specific numas and then GUP should just leave
> it alone.

Right.

For the other O_DIRECT example, it seems to be a more generic question
of whether we should rely on the follow-up accessor to decide the
target node of NUMA balancing.

For KVM's use case, at least, I'd still expect the major path that
triggers it to be the guest accessing a page from vcpu threads, and
that still goes through the GUP paths.

Thanks,

-- 
Peter Xu