On Tue, Feb 27, 2024 at 9:59 AM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> > privately-managed pages into a sparse vm area with the following steps:
> >
> >   area = get_vm_area(area_size, VM_SPARSE);  // at bpf prog verification time
> >   vm_area_map_pages(area, kaddr, 1, page);   // on demand
> >                     // it will return an error if kaddr is out of range
> >   vm_area_unmap_pages(area, kaddr, 1);
> >   free_vm_area(area);                        // after bpf prog is unloaded
>
> I'm still wondering if this should just use an opaque cookie instead
> of exposing the vm_area.  But otherwise this mostly looks fine to me.

What would it look like with a cookie?
A static inline wrapper around get_vm_area() that returns area->addr?
And the start address of the vmap range will be such a cookie?

Then vm_area_map_pages() will be doing find_vm_area() for kaddr
to check that vm_area->flags & VM_SPARSE?
That's fine, but what would be the equivalent of
void free_vm_area(struct vm_struct *area)?
Another static inline wrapper similar to remove_vm_area()
that also does kfree(area)?
Fine by me, but the API isn't user friendly with such obfuscation.

I guess I don't understand the motivation to hide 'struct vm_struct *'.

> > +	if (addr < (unsigned long)area->addr || (void *)end > area->addr + area->size)
> > +		return -ERANGE;
>
> This check is duplicated so many times that it really begs for a helper.

ok. will do.

> > +int vm_area_unmap_pages(struct vm_struct *area, unsigned long addr, unsigned int count)
> > +{
> > +	unsigned long size = ((unsigned long)count) * PAGE_SIZE;
> > +	unsigned long end = addr + size;
> > +
> > +	if (WARN_ON_ONCE(!(area->flags & VM_SPARSE)))
> > +		return -EINVAL;
> > +	if (addr < (unsigned long)area->addr || (void *)end > area->addr + area->size)
> > +		return -ERANGE;
> > +
> > +	vunmap_range(addr, end);
> > +	return 0;
>
> Does it make much sense to have an error return here vs just debug
> checks?  It's not like the caller can do much if it violates these
> basic invariants.

Ok. Will switch to void return.

Will reduce commit log lines to 75 chars in all patches as suggested.

re: VM_GRANT_TABLE or VM_XEN_GRANT_TABLE suggestion for patch 2.

I'm not sure it fits, since only one of the get_vm_area() calls in the
xen code is grant table related. The other one is for xenbus, which
creates a shared memory ring between domains.
So I'm planning to keep it as VM_XEN in the next revision unless
folks come up with a better name.

Thanks for the reviews.
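
For reference, a rough sketch of what factoring the duplicated range
check into a helper could look like. This is a userspace model, not
kernel code: struct vm_struct_model, sparse_range_ok() and
sparse_pages_in_range() are hypothetical names, and PAGE_SIZE and
VM_SPARSE carry placeholder values. The helper in the next revision
may well look different.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */
#define VM_SPARSE 0x1UL		/* placeholder bit, not the kernel's value */

/* Reduced stand-in for the kernel's struct vm_struct (hypothetical). */
struct vm_struct_model {
	uintptr_t addr;		/* start address of the area */
	unsigned long size;	/* size of the area in bytes */
	unsigned long flags;	/* VM_* flags */
};

/*
 * Hypothetical helper: true if [start, end) lies inside a VM_SPARSE area.
 * The end >= start test also rejects a range that wrapped around.
 */
bool sparse_range_ok(const struct vm_struct_model *area,
		     uintptr_t start, uintptr_t end)
{
	if (!(area->flags & VM_SPARSE))
		return false;
	if (end < start)
		return false;
	return start >= area->addr && end <= area->addr + area->size;
}

/* Same check expressed in pages, as the map/unmap callers would use it. */
bool sparse_pages_in_range(const struct vm_struct_model *area,
			   uintptr_t addr, unsigned int count)
{
	uintptr_t end = addr + (uintptr_t)count * PAGE_SIZE;

	return sparse_range_ok(area, addr, end);
}
```

With one helper like this, both the map and unmap paths could call the
same check, and the unmap path could turn a failure into a debug
warning instead of returning an error code.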