On Tue, Jun 25, 2024 at 7:42 PM Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote: > > * Andrii Nakryiko <andrii@xxxxxxxxxx> [240618 18:45]: > ... > > > + > > +static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg) > > +{ > > + struct procmap_query karg; > > + struct vm_area_struct *vma; > > + struct mm_struct *mm; > > + const char *name = NULL; > > + char *name_buf = NULL; > > + __u64 usize; > > + int err; > > + > > + if (copy_from_user(&usize, (void __user *)uarg, sizeof(usize))) > > + return -EFAULT; > > + /* argument struct can never be that large, reject abuse */ > > + if (usize > PAGE_SIZE) > > + return -E2BIG; > > + /* argument struct should have at least query_flags and query_addr fields */ > > + if (usize < offsetofend(struct procmap_query, query_addr)) > > + return -EINVAL; > > + err = copy_struct_from_user(&karg, sizeof(karg), uarg, usize); > > + if (err) > > + return err; > > + > > + /* reject unknown flags */ > > + if (karg.query_flags & ~PROCMAP_QUERY_VALID_FLAGS_MASK) > > + return -EINVAL; > > + /* either both buffer address and size are set, or both should be zero */ > > + if (!!karg.vma_name_size != !!karg.vma_name_addr) > > + return -EINVAL; > > + > > + mm = priv->mm; > > + if (!mm || !mmget_not_zero(mm)) > > + return -ESRCH; > > + > > + err = query_vma_setup(mm); > > + if (err) { > > + mmput(mm); > > + return err; > > + } > > + > > + vma = query_matching_vma(mm, karg.query_addr, karg.query_flags); > > + if (IS_ERR(vma)) { > > + err = PTR_ERR(vma); > > + vma = NULL; > > + goto out; > > + } > > + > > + karg.vma_start = vma->vm_start; > > + karg.vma_end = vma->vm_end; > > + > > + karg.vma_flags = 0; > > + if (vma->vm_flags & VM_READ) > > + karg.vma_flags |= PROCMAP_QUERY_VMA_READABLE; > > + if (vma->vm_flags & VM_WRITE) > > + karg.vma_flags |= PROCMAP_QUERY_VMA_WRITABLE; > > + if (vma->vm_flags & VM_EXEC) > > + karg.vma_flags |= PROCMAP_QUERY_VMA_EXECUTABLE; > > + if (vma->vm_flags & VM_MAYSHARE) > > + karg.vma_flags |= PROCMAP_QUERY_VMA_SHARED; > > + > > + karg.vma_page_size = vma_kernel_pagesize(vma); > > + > ... > > > +/* > > + * Input/output argument structured passed into ioctl() call. It can be used > > + * to query a set of VMAs (Virtual Memory Areas) of a process. > > + * > > + * Each field can be one of three kinds, marked in a short comment to the > > + * right of the field: > > + * - "in", input argument, user has to provide this value, kernel doesn't modify it; > > + * - "out", output argument, kernel sets this field with VMA data; > > + * - "in/out", input and output argument; user provides initial value (used > > + * to specify maximum allowable buffer size), and kernel sets it to actual > > + * amount of data written (or zero, if there is no data). > > + * > > + * If matching VMA is found (according to criterias specified by > > + * query_addr/query_flags, all the out fields are filled out, and ioctl() > > + * returns 0. If there is no matching VMA, -ENOENT will be returned. > > + * In case of any other error, negative error code other than -ENOENT is > > + * returned. > > + * > > + * Most of the data is similar to the one returned as text in /proc/<pid>/maps > > + * file, but procmap_query provides more querying flexibility. There are no > > + * consistency guarantees between subsequent ioctl() calls, but data returned > > + * for matched VMA is self-consistent. > > + */ > > +struct procmap_query { > > + /* Query struct size, for backwards/forward compatibility */ > > + __u64 size; > > + /* > > + * Query flags, a combination of enum procmap_query_flags values. > > + * Defines query filtering and behavior, see enum procmap_query_flags. > > + * > > + * Input argument, provided by user. Kernel doesn't modify it. > > + */ > > + __u64 query_flags; /* in */ > > + /* > > + * Query address. By default, VMA that covers this address will > > + * be looked up. PROCMAP_QUERY_* flags above modify this default > > + * behavior further. > > + * > > + * Input argument, provided by user. Kernel doesn't modify it. > > + */ > > + __u64 query_addr; /* in */ > > + /* VMA starting (inclusive) and ending (exclusive) address, if VMA is found. */ > > + __u64 vma_start; /* out */ > > + __u64 vma_end; /* out */ > > + /* VMA permissions flags. A combination of PROCMAP_QUERY_VMA_* flags. */ > > + __u64 vma_flags; /* out */ > > + /* VMA backing page size granularity. */ > > + __u32 vma_page_size; /* out */ > > The vma_kernel_pagesize() returns an unsigned long. We could > potentially be truncating the returned value (although probably not > today?). This is from the vm_operations_struct pagesize, which also > returns an unsigned long. Could we switch this to __u64? > Yep, fair point. It seemed excessive to use u64, but I guess nothing prevents us (in the future) from having humongous pages with >4GB sizes. I'll update to __u64. I'll wait for a few more hours just in case I get some other feedback, and will rebase and post a new revision later today, thanks. > ... > > Thanks, > Liam