On Wed, Feb 19, 2025 at 4:18 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > > On Thu, 13 Feb 2025 07:21:23 -0800 Jordan Rome <linux@xxxxxxxxxxxxxx> wrote: > > > Similar to `access_process_vm` but specific to strings. > > Also chunks reads by page and utilizes `strscpy` > > for handling null termination. > > > > The primary motivation for this change is to copy > > strings from a non-current task/process in BPF. > > There is already a helper `bpf_copy_from_user_task`, > > which uses `access_process_vm` but one to handle > > strings would be very helpful. > > > > ... > > > > include/linux/mm.h | 3 ++ > > mm/memory.c | 122 +++++++++++++++++++++++++++++++++++++++++++++ > > mm/nommu.c | 76 ++++++++++++++++++++++++++++ > > Is there any way in which we can avoid adding all this to vmlinux if > it's unneeded? > > Any such ifdeffery would of course need removal or alteration if > callers other than BPF emerge. > yeah, it's a straightforward #ifdef CONFIG_BPF_SYSCALL guard, I'll add it while applying > > ... > > > > +/* > > + * Copy a string from another process's address space as given in mm. > > + * If there is any error return -EFAULT. > > + */ > > +static int __copy_remote_vm_str(struct mm_struct *mm, unsigned long addr, > > + void *buf, int len, unsigned int gup_flags) > > +{ > > + void *old_buf = buf; > > + int err = 0; > > + > > + *(char *)buf = '\0'; > > + > > + if (mmap_read_lock_killable(mm)) > > + return -EFAULT; > > + > > + /* Untag the address before looking up the VMA */ > > + addr = untagged_addr_remote(mm, addr); > > Well that's a crappy little comment which you copied-n-pasted. It > tells us "what" (which is utterly obvious) but not "why". whodidthat. :) dropped the comment, but yeah, it's coming from __access_remote_vm(), of course. Seems other users of untagged_addr_remote() don't bother leaving comment, so dropping the comment seems appropriate (and anyone can actually read more expanded comment in include/linux/uaccess.h, if curious) > > > +/** > > + * copy_remote_vm_str - copy a string from another process's address space. > > + * @tsk: the task of the target address space > > + * @addr: start address to read from > > + * @buf: destination buffer > > + * @len: number of bytes to copy > > + * @gup_flags: flags modifying lookup behaviour > > + * > > + * The caller must hold a reference on @mm. > > + * > > + * Return: number of bytes copied from @addr (source) to @buf (destination); > > + * not including the trailing NUL. Always guaranteed to leave NUL-terminated > > + * buffer. On any error, return -EFAULT. > > + */ > > +int copy_remote_vm_str(struct task_struct *tsk, unsigned long addr, > > + void *buf, int len, unsigned int gup_flags) > > +{ > > + struct mm_struct *mm; > > + int ret; > > + > > + if (unlikely(len < 1)) > > + return 0; > > I wonder if this can ever happen. And if it does, should it WARN? And > returen -Efoo? so this was not meant to catch negative len (that's assumed invalid API usage), it was more about handling len == 0 case, which is legal for access_remote_vm(). So I fixed it up to `if (unlikely(len == 0)) return 0;` explicitly to keep behavior consistent with access_remote_vm(). > > > + mm = get_task_mm(tsk); > > + if (!mm) { > > + *(char *)buf = '\0'; > > + return -EFAULT; > > + } > > + > > + ret = __copy_remote_vm_str(mm, addr, buf, len, gup_flags); > > + > > + mmput(mm); > > + > > + return ret; > > +} >