On Tue, 2016-12-27 at 16:48 -0800, Andy Lutomirski wrote: > On Tue, Dec 27, 2016 at 4:39 PM, Ricardo Neri > <ricardo.neri-calderon@xxxxxxxxxxxxxxx> wrote: > > On Fri, 2016-12-23 at 18:11 -0800, Andy Lutomirski wrote: > >> On Fri, Dec 23, 2016 at 5:37 PM, Ricardo Neri > >> <ricardo.neri-calderon@xxxxxxxxxxxxxxx> wrote: > >> > The feature User-Mode Instruction Prevention present in recent Intel > >> > processor prevents a group of instructions from being executed with > >> > CPL > 0. Otherwise, a general protection fault is issued. > >> > > >> > Rather than relaying this fault to the user space (in the form of a SIGSEGV > >> > signal), the instructions protected by UMIP can be emulated to provide > >> > dummy results. This allows to conserve the current kernel behavior and not > >> > reveal the system resources that UMIP intends to protect (the global > >> > descriptor and interrupt descriptor tables, the segment selectors of the > >> > local descriptor table and the task state and the machine status word). > >> > > >> > This emulation is needed because certain applications (e.g., WineHQ) rely > >> > on this subset of instructions to function. > >> > > >> > The instructions protected by UMIP can be split in two groups. Those who > >> > return a kernel memory address (sgdt and sidt) and those who return a > >> > value (sldt, str and smsw). > >> > > >> > For the instructions that return a kernel memory address, the result is > >> > emulated as the location of a dummy variable in the kernel memory space. > >> > This is needed as applications such as WineHQ rely on the result being > >> > located in the kernel memory space function. The limit for the GDT and the > >> > IDT are set to zero. > >> > >> Nak. This is a trivial KASLR bypass. Just give them hardcoded > >> values. For x86_64, I would suggest 0xfffffffffffe0000 and > >> 0xffffffffffff0000. > > > > I see. I assume you are suggesting these values for x86_64 because they > > lie in an unused hole. That makes sense to me. > > > > For the case of x86_32, I have trouble finding a suitable place as there > > are not many available holes. It could be put before VMALLOC_START or > > after VMALLOC_END but this would reveal the position of the vmalloc > > area. Although, to my knowledge, randomized memory is not available for > > x86_32. Without randomization, does it hurt to make sidt/sgdt return the > > address of a kernel static variable? > > I would just use the same addresses, truncated. There's no reason > that the address needs to be truly not present -- it just needs to be > inaccessible to user code. Anything near the top of the address space > should work. Right. Then I will reuse the same addresses. > > > > >> > >> > > >> > The instructions sldt and str return a segment selector relative to the > >> > base address of the global descriptor table. Since the actual address of > >> > such table is not revealed, it makes sense to emulate the result as zero. > >> > >> Hmm, now I wonder if anything uses SLDT to see if there is an LDT. If > >> so, we could emulate it better, but I doubt this matters. > > > > So you are saying that the emulated sldt should return a different value > > based on the presence/absence of a LDT? This could reveal this very > > fact. > > User code knows whether the LDT exists because an LDT only exists if > the program called modify_ldt(). But I doubt this matters in > practice. In such a case sldt would return a non-null segment selector. I will keep giving the null segment selector in all cases and make a note in the code. > > >> > +static int __emulate_umip_insn(struct insn *insn, enum umip_insn umip_inst, > >> > + unsigned char *data, int *data_size) > >> > +{ > >> > + unsigned long const *dummy_base_addr; > >> > + unsigned short dummy_limit = 0; > >> > + unsigned short dummy_value = 0; > >> > + > >> > + switch (umip_inst) { > >> > + /* > >> > + * These two instructions return the base address and limit of the > >> > + * global and interrupt descriptor table. The base address can be > >> > + * 32-bit or 64-bit. Limit is always 16-bit. > >> > + */ > >> > + case UMIP_SGDT: > >> > + case UMIP_SIDT: > >> > + if (umip_inst == UMIP_SGDT) > >> > + dummy_base_addr = &umip_dummy_gdt_base; > >> > + else > >> > + dummy_base_addr = &umip_dummy_idt_base; > >> > + if (X86_MODRM_MOD(insn->modrm.value) == 3) { > >> > + WARN_ONCE(1, "SGDT cannot take register as argument!\n"); > >> > >> No warnings please. > > > > I'll. Remove it. > > Thanks. In general, WARN_ONCE, etc are supposed to indicate kernel > bugs, not user bugs. Agreed. Your statement makes it very clear. I didn't have it that clear in my mind. > > >> > + int not_copied, nr_copied, reg_offset, dummy_data_size; > >> > + void __user *uaddr; > >> > + unsigned long *reg_addr; > >> > + enum umip_insn umip_inst; > >> > + > >> > + not_copied = copy_from_user(buf, (void __user *)regs->ip, sizeof(buf)); > >> > >> This is slightly wrong due to PKRU. I doubt we care. > > > > I see. If I am not mistaken, if the memory is protected by a protection > > key this would cause a page fault. I'll make a note of it. > > Exactly. This is correct behavior unless the key happens to be set up > so it can be executed but not read, in which case emulation will fail. If we can't read we can't emulate anyways. I will add another note. > > >> > >> > + nr_copied = sizeof(buf) - not_copied; > >> > + /* > >> > + * The decoder _should_ fail nicely if we pass it a short buffer. > >> > + * But, let's not depend on that implementation detail. If we > >> > + * did not get anything, just error out now. > >> > + */ > >> > + if (!nr_copied) > >> > + return -EFAULT; > >> > >> If the caller cares about EINVAL vs EFAULT, it cares because it is > >> considering changing the signal to a fake page fault. If so, then > >> this should be EINVAL -- failure to read the text should just prevent > >> emulation. > > > > I see. The caller in this case do_general_protection, which will issue a > > SIGSEGV to the user space anyways. I don't think it cares about the > > EINVAL vs EFAULT. It does care about whether the emulation was > > successful. > > Maybe just make it return bool then? But fixing up the return codes > would be fine, too. I just think that, if it returns int, the value > should be meaningful. Right. I have a small query/proposal related to this below. > > >> > + if (nr_copied > 0) > >> > + return -EFAULT; > >> > >> This should be the only EFAULT case. > > Should this be EFAULT event if the caller cares only about successful > > (return 0) vs failed (return non-0) emulation? > > In theory this particular error would be a page fault not a general > protection fault (in the UMIP off case). If you were emulating it > extra carefully, you could change the signal accordingly. But, as I > said, I really doubt this matters. If simple enough and for the sake of accuracy, I could try to issue the page fault. It seems to me that this entitles calling force_sig_info_fault in this particular case as opposed to the force_sig_info(SIGSEGV, SEND_SIG_PRIV, tsk) that do_general_protection calls. Thanks and BR, Ricardo -- To unsubscribe from this list: send the line "unsubscribe linux-msdos" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html