On Mon, Oct 20, 2014 at 6:48 AM, David Drysdale <drysdale@xxxxxxxxxx> wrote:
> On Sun, Oct 19, 2014 at 1:20 AM, Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
>> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>>
>>> [Added Eric Biederman, since I think your tree might be a reasonable
>>> route forward for these patches.]
>>>
>>> On Thu, Jun 5, 2014 at 6:40 AM, David Drysdale <drysdale@xxxxxxxxxx> wrote:
>>>> Resending, adding cc:linux-api.
>>>>
>>>> Also, it may help to add a little more background -- this patch is
>>>> needed as a (small) part of implementing Capsicum in the Linux kernel.
>>>>
>>>> Capsicum is a security framework that has been present in FreeBSD since
>>>> version 9.0 (Jan 2012), and is based on concepts from object-capability
>>>> security [1].
>>>>
>>>> One of the features of Capsicum is capability mode, which locks down
>>>> access to global namespaces such as the filesystem hierarchy.  In
>>>> capability mode, /proc is thus inaccessible and so fexecve(3) doesn't
>>>> work -- hence the need for a kernel-space
>>>
>>> I just found myself wanting this syscall for another reason: injecting
>>> programs into sandboxes or otherwise heavily locked-down namespaces.
>>>
>>> For example, I want to be able to reliably do something like nsenter
>>> --namespace-flags-here toybox sh.  Toybox's shell is unusual in that
>>> it is more or less fully functional, so this should Just Work (tm),
>>> except that the toybox binary might not exist in the namespace being
>>> entered.  If execveat were available, I could rig nsenter or a similar
>>> tool to open it with O_CLOEXEC, enter the namespace, and then call
>>> execveat.
>>>
>>> Is there any reason that these patches can't be merged more or less as
>>> is for 3.19?
>>
>> Yes.  There is a silliness in how it implements fexecve.  The fexecve
>> case should use the empty string "" not a NULL pointer to indicate
>> that.  That change will then harmonize execveat with the other ...at
>> system calls and simplify the code and remove a special case.  I believe
>> using the empty string "" requires implementing the AT_EMPTY_PATH flag.
>
> Good point -- I'll shift to "" + AT_EMPTY_PATH.

Pending a better idea, I would also see if the patches can be changed
to return an error if d_path ends up with an "(unreachable)" thing
rather than failing inexplicably later on.

--Andy
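
For reference, the "" + AT_EMPTY_PATH convention discussed above could be exercised from userspace roughly as follows. This is only a sketch, assuming a kernel and headers that provide SYS_execveat with the empty-path semantics described in this thread; exec_by_fd() and run_via_fd() are hypothetical helper names, not part of the patches.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Execute the program that fd refers to, using an empty pathname plus
 * AT_EMPTY_PATH instead of a NULL pointer. Returns only on failure
 * (-1 with errno set). Assumes SYS_execveat is defined by the headers.
 */
static int exec_by_fd(int fd, char *const argv[], char *const envp[])
{
    assert(argv != NULL && argv[0] != NULL);
    return (int)syscall(SYS_execveat, fd, "", argv, envp, AT_EMPTY_PATH);
}

/*
 * Hypothetical helper: open a binary with O_CLOEXEC, fork, and exec it
 * by descriptor in the child. Returns the child's exit status, or -1
 * if the open, fork, or wait fails or the child dies abnormally.
 */
static int run_via_fd(const char *path)
{
    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* A tool like nsenter would enter the target namespaces here,
         * after the open and before the exec. */
        char *const argv[] = { (char *)path, NULL };
        char *const envp[] = { NULL };
        exec_by_fd(fd, argv, envp);
        _exit(127); /* reached only if the exec failed */
    }

    close(fd);
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The binary only has to be reachable when it is opened, not when it is executed, which is the property the nsenter case relies on. Note that O_CLOEXEC suits a native binary; an interpreted script would need its descriptor to survive the exec so the interpreter can still read it.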