Kees Cook <keescook@xxxxxxxxxxxx> writes:

> On Tue, Oct 10, 2023 at 09:21:33AM +0000, Alyssa Ross wrote:
>> As far as I can tell, the S_ISREG() check is there to prevent
>> executing files where that would be nonsensical, like directories,
>> fifos, or sockets. But the semantics for executing a block device are
>> quite obvious: the block device acts just like a regular file.
>>
>> My use case is having a common VM image that takes a configurable
>> payload to run. The payload will always be a single ELF file.
>>
>> I could share the file with virtio-fs, or I could create a disk image
>> containing a filesystem containing the payload, but both of those add
>> unnecessary layers of indirection when all I need to do is share a
>> single executable blob with the VM. Sharing it as a block device is
>> the most natural thing to do, aside from the (arbitrary, as far as I
>> can tell) restriction on executing block devices. (The only slight
>> complexity is that I need to ensure that my payload size is rounded up
>> to a whole number of sectors, but that's trivial and fast in
>> comparison to e.g. generating a filesystem image.)
>>
>> Signed-off-by: Alyssa Ross <hi@xxxxxxxxx>
>
> Hi,
>
> Thanks for the suggestion! I would prefer to not change this rather core
> behavior in the kernel for a few reasons, but it mostly revolves around
> both user and developer expectations and the resulting fragility.
>
> For users, this hasn't been possible in the past, so if we make it
> possible, what situations are suddenly exposed on systems that are trying
> to very carefully control their execution environments?

I expect very few, considering it's still necessary to have root chmod
the block device to make it executable.

> For developers, this ends up exercising code areas that have never been
> tested, and could lead to unexpected conditions. For example,
> deny_write_access() is explicitly documented as "for regular files".
> Perhaps it accidentally works with block devices, but this would need
> much more careful examination, etc.
>
> And while looking at this from a design perspective, it looks like a
> layering violation: roughly speaking, the kernel executes files, from
> filesystems, from block devices. Bypassing layers tends to lead to
> troublesome bugs and other weird problems.
>
> I wonder, though, if you can already get what you need through other
> existing mechanisms that aren't too much more hassle? For example,
> what about having a tool that creates a memfd from a block device and
> executes that? The memfd code has been used in a lot of odd exec corner
> cases in the past...

Is it possible to have a file-backed memfd? Strange name if so!
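
For reference, the tool suggested above (create a memfd from a block device, then execute it) could look roughly like the sketch below. This is only an illustration, not code from this thread: the function names copy_fd_to_memfd() and exec_block_device() and the memfd name "payload" are made up here. It uses memfd_create(2) and fexecve(3) from glibc. Note that a memfd is anonymous memory rather than file-backed, so the sketch copies the whole payload into RAM before executing it, including any padding added to round the payload up to a whole number of sectors.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy everything readable from src_fd into a freshly created memfd.
 * Returns the memfd on success, or -1 on error (errno is set).
 * Hypothetical helper, named for illustration only.
 */
int copy_fd_to_memfd(int src_fd, const char *name)
{
    char buf[65536];
    /* MFD_CLOEXEC is fine for an ELF payload; a "#!" script would need
     * the descriptor to stay open across exec instead. */
    int mfd = memfd_create(name, MFD_CLOEXEC);

    if (mfd < 0)
        return -1;

    for (;;) {
        ssize_t n = read(src_fd, buf, sizeof buf);

        if (n == 0)
            break;                      /* end of device/file */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        /* write() may be short, so loop until the chunk is stored. */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(mfd, buf + off, (size_t)(n - off));

            if (w < 0) {
                if (errno == EINTR)
                    continue;
                goto fail;
            }
            off += w;
        }
    }
    return mfd;

fail:
    close(mfd);
    return -1;
}

/*
 * Open dev_path (assumed to hold a single ELF image), copy it into a
 * memfd and execute that. Only returns on failure, with -1.
 */
int exec_block_device(const char *dev_path, char *const argv[],
                      char *const envp[])
{
    int src = open(dev_path, O_RDONLY | O_CLOEXEC);
    int mfd;

    if (src < 0)
        return -1;
    mfd = copy_fd_to_memfd(src, "payload");
    close(src);
    if (mfd < 0)
        return -1;

    fexecve(mfd, argv, envp);           /* does not return on success */
    close(mfd);
    return -1;
}
```

A wrapper's main() would call exec_block_device() with the device path (for example a virtio-blk node such as /dev/vda, which is an assumption about the VM setup) plus the argv and environ to pass on. The cost compared with executing the block device directly is one full copy of the payload into memory.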