On Thu, Aug 06, 2020 at 12:26:02PM -0500, Madhavan T. Venkataraman wrote: > Thanks for the lively discussion. I have tried to answer some of the > comments below. > > On 8/4/20 9:30 AM, Mark Rutland wrote: > > > >> So, the context is - if security settings in a system disallow a page to have > >> both write and execute permissions, how do you allow the execution of > >> genuine trampolines that are runtime generated and placed in a data > >> page or a stack page? > > There are options today, e.g. > > > > a) If the restriction is only per-alias, you can have distinct aliases > > where one is writable and another is executable, and you can make it > > hard to find the relationship between the two. > > > > b) If the restriction is only temporal, you can write instructions into > > an RW- buffer, transition the buffer to R--, verify the buffer > > contents, then transition it to --X. > > > > c) You can have two processes A and B where A generates instrucitons into > > a buffer that (only) B can execute (where B may be restricted from > > making syscalls like write, mprotect, etc). > > The general principle of the mitigation is W^X. I would argue that > the above options are violations of the W^X principle. If they are > allowed today, they must be fixed. And they will be. So, we cannot > rely on them. Hold on. Contemporary W^X means that a given virtual alias cannot be writeable and executeable simultaneously, permitting (a) and (b). If you read the references on the Wikipedia page for W^X you'll see the OpenBSD 3.3 release notes and related presentation make this clear, and further they expect (b) to occur with JITS flipping W/X with mprotect(). Please don't conflate your assumed stronger semantics with the general principle. It not matching you expectations does not necessarily mean that it is wrong. If you want a stronger W^X semantics, please refer to this specifically with a distinct name. > a) This requires a remap operation. Two mappings point to the same > physical page. One mapping has W and the other one has X. This > is a violation of W^X. > > b) This is again a violation. The kernel should refuse to give execute > permission to a page that was writeable in the past and refuse to > give write permission to a page that was executable in the past. > > c) This is just a variation of (a). As above, this is not true. If you have a rationale for why this is desirable or necessary, please justify that before using this as justification for additional features. > In general, the problem with user-level methods to map and execute > dynamic code is that the kernel cannot tell if a genuine application is > using them or an attacker is using them or piggy-backing on them. Yes, and as I pointed out the same is true for trampfd unless you can somehow authenticate the calls are legitimate (in both callsite and the set of arguments), and I don't see any reasonable way of doing that. If you relax your threat model to an attacker not being able to make arbitrary syscalls, then your suggestion that userspace can perorm chceks between syscalls may be sufficient, but as I pointed out that's equally true for a sealed memfd or similar. > Off the top of my head, I have tried to identify some examples > where we can have more trust on dynamic code and have the kernel > permit its execution. > > 1. If the kernel can do the job, then that is one safe way. Here, the kernel > is the code. There is no code generation involved. This is what I > have presented in the patch series as the first cut. This is sleight-of-hand; it doesn't matter where the logic is performed if the power is identical. Practically speaking this is equivalent to some dynamic code generation. I think that it's misleading to say that because the kernel emulates something it is safe when the provenance of the syscall arguments cannot be verified. [...] > Anyway, these are just examples. The principle is - if we can identify > dynamic code that has a certain measure of trust, can the kernel > permit their execution? My point generally is that the kernel cannot identify this, and if usrspace code is trusted to dynamically generate trampfd arguments it can equally be trusted to dyncamilly generate code. [...] > As I have mentioned above, I intend to have the kernel generate code > only if the code generation is simple enough. For more complicated cases, > I plan to use a user-level code generator that is for exclusive kernel use. > I have yet to work out the details on how this would work. Need time. This reads to me like trampfd is only dealing with a few special cases and we know that we need a more general solution. I hope I am mistaken, but I get the strong impression that you're trying to justify your existing solution rather than trying to understand the problem space. To be clear, my strong opinion is that we should not be trying to do this sort of emulation or code generation within the kernel. I do think it's worthwhile to look at mechanisms to make it harder to subvert dynamic userspace code generation, but I think the code generation itself needs to live in userspace (e.g. for ABI reasons I previously mentioned). Mark.