On Thu, Aug 15, 2019 at 04:16:30PM -0400, Jerome Glisse wrote: > On Thu, Aug 15, 2019 at 03:19:29PM -0400, Jerome Glisse wrote: > > On Tue, Aug 13, 2019 at 02:01:35PM +0300, Adalbert Lazăr wrote: > > > On Fri, 9 Aug 2019 09:24:44 -0700, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > > > > On Fri, Aug 09, 2019 at 07:00:26PM +0300, Adalbert Lazăr wrote: > > > > > +++ b/include/linux/page-flags.h > > > > > @@ -417,8 +417,10 @@ PAGEFLAG(Idle, idle, PF_ANY) > > > > > */ > > > > > #define PAGE_MAPPING_ANON 0x1 > > > > > #define PAGE_MAPPING_MOVABLE 0x2 > > > > > +#define PAGE_MAPPING_REMOTE 0x4 > > > > > > > > Uh. How do you know page->mapping would otherwise have bit 2 clear? > > > > Who's guaranteeing that? > > > > > > > > This is an awfully big patch to the memory management code, buried in > > > > the middle of a gigantic series which almost guarantees nobody would > > > > look at it. I call shenanigans. > > > > > > > > > @@ -1021,7 +1022,7 @@ void page_move_anon_rmap(struct page *page, struct vm_area_struct *vma) > > > > > * __page_set_anon_rmap - set up new anonymous rmap > > > > > * @page: Page or Hugepage to add to rmap > > > > > * @vma: VM area to add page to. > > > > > - * @address: User virtual address of the mapping > > > > > + * @address: User virtual address of the mapping > > > > > > > > And mixing in fluff changes like this is a real no-no. Try again. > > > > > > > > > > No bad intentions, just overzealous. > > > I didn't want to hide anything from our patches. > > > Once we advance with the introspection patches related to KVM we'll be > > > back with the remote mapping patch, split and cleaned. > > > > They are not bit left in struct page ! Looking at the patch it seems > > you want to have your own pin count just for KVM. This is bad, we are > > already trying to solve the GUP thing (see all various patchset about > > GUP posted recently). > > > > You need to rethink how you want to achieve this. Why not simply a > > remote read()/write() into the process memory ie KVMI would call > > an ioctl that allow to read or write into a remote process memory > > like ptrace() but on steroid ... > > > > Adding this whole big complex infrastructure without justification > > of why we need to avoid round trip is just too much really. > > Thinking a bit more about this, you can achieve the same thing without > adding a single line to any mm code. Instead of having mmap with > PROT_NONE | MAP_LOCKED you have userspace mmap some kvm device file > (i am assuming this is something you already have and can control > the mmap callback). > > So now kernel side you have a vma with a vm_operations_struct under > your control this means that everything you want to block mm wise > from within the inspector process can be block through those call- > backs (find_special_page() specificaly for which you have to return > NULL all the time). I'm actually aware of a couple of use cases that would like to mirror the VA of one process into another. One big one in the HPC world is the out of tree 'xpmem' still in wide use today. xpmem is basically what Jerome described above. If you do an approach like Jerome describes it would be nice if was a general facility and not buried in kvm. I know past xpmem adventures ran into trouble with locking/etc - ie getting the mm_struct of the victim seemed a bit hard for some reason, but maybe that could be done with a FD pass 'ioctl(I AM THE VICITM)' ? Jason