This is v6 of this series. The five previous submissions can be found
here [1], here [2], here [3], here [4], and here [5]. This version
addresses the comments received in v4 and improves the handling of
emulation in 64-bit builds. Please see details in the change log.

=== What is UMIP?

User-Mode Instruction Prevention (UMIP) is a security feature present in
new Intel Processors. If enabled, it prevents the execution of certain
instructions if the Current Privilege Level (CPL) is greater than 0. If
these instructions were executed while in CPL > 0, user space
applications could have access to system-wide settings such as the
global and local descriptor tables, the segment selectors to the current
task state and the local descriptor table.

These are the instructions covered by UMIP:
* SGDT - Store Global Descriptor Table
* SIDT - Store Interrupt Descriptor Table
* SLDT - Store Local Descriptor Table
* SMSW - Store Machine Status Word
* STR - Store Task Register

If any of these instructions is executed with CPL > 0, a general
protection exception is issued when UMIP is enabled.

=== How does it impact applications?

There is a caveat, however. Certain applications rely on some of these
instructions to function. Examples of this are applications that use
WineHQ[7]. For instance, these applications rely on sidt returning a
non-accessible memory location[8]. During the discussions, it was
proposed that the fault could be relayed to user space and the emulation
performed in user mode. However, this would break existing applications
until, for instance, they update to a new WineHQ version. Moreover, this
approach would require UMIP to be disabled by default. The consensus in
this forum is to always enable it.

This patchset initially treated tasks running in virtual-8086 mode as a
special case. However, I received clarification that DOSEMU[9] does not
support applications that use these instructions. It relies on WineHQ
for this [9].
Furthermore, the applications for which the concern was raised run in
protected mode [8].

Please note that UMIP is always enabled for both 64-bit and 32-bit Linux
builds. However, emulation of the UMIP-protected instructions is not
done for 64-bit processes. 64-bit user space applications will receive
the SIGSEGV signal when UMIP instructions cause a general protection
fault.

=== How are UMIP-protected instructions emulated?

This version keeps UMIP enabled at all times and by default. If a
general protection fault caused by the instructions protected by UMIP is
detected, such fault will be fixed-up by returning dummy values as
follows:

* SGDT and SIDT return hard-coded dummy values as the base of the
  global descriptor and interrupt descriptor tables. These hard-coded
  values correspond to memory addresses that are near the end of the
  kernel memory map. This is also the case for virtual-8086 mode tasks.
  In all my experiments in x86_32, the base of GDT and IDT was always a
  4-byte address, even for 16-bit operands. Thus, my emulation code
  does the same. In all cases, the limit of the table is set to 0.
* STR and SLDT return 0 as the segment selector. This looks
  appropriate since we are providing a dummy value as the base address
  of the global descriptor table.
* SMSW returns the value with which the CR0 register is programmed in
  head_32/64.S at boot time. That is, the following bits are enabled:
  CR0.0 for Protection Enable, CR0.1 for Monitor Coprocessor, CR0.4 for
  Extension Type, which will always be 1 in recent processors with
  UMIP; CR0.5 for Numeric Error, CR0.16 for Write Protect, CR0.18 for
  Alignment Mask. As per the Intel 64 and IA-32 Architectures Software
  Developer's Manual, SMSW returns a 16-bit result for memory operands.
  However, when the operand is a register, the result can be up to
  CR0[63:0]. Since the emulation code only kicks in on x86_32, we
  return up to CR0[31:0].
* The proposed emulation code handles faults that happen in both
  protected and virtual-8086 mode.

=== How is this series laid out?

++ Fix bugs in MPX address evaluator

I found the code for Intel MPX (Memory Protection Extensions) very
useful for parsing opcodes and the memory locations contained in the
general purpose registers when used as operands. I put some of this code
in a separate library file that both MPX and UMIP can access, which
avoids code duplication. Before creating the new library, I fixed a
couple of bugs that I found in how MPX determines the address contained
in the instruction and operands.

++ Provide a new x86 instruction evaluating library

With bugs fixed, the MPX evaluating code is relocated to a new
insn-eval.c library. The basic functionality of this library is extended
to obtain the segment descriptor selected by either segment override
prefixes or the default segment implied by the registers involved in the
calculation of the effective address. It was also extended to obtain the
default address and operand sizes as well as the segment base address.
Support to process 16-bit address encodings was also added. Armed with
this arsenal, it is now possible to determine the linear address onto
which the emulated results shall be copied.

This code supports normal 32-bit and 64-bit (i.e., __USER32_CS and/or
__USER_CS) protected mode, virtual-8086 mode, and 16-bit protected mode
with 32-bit base address.

++ Emulate UMIP instructions

A new fixup_umip_exception function inspects the instruction at the
instruction pointer. If it is a UMIP-protected instruction, it executes
the emulation code. This uses all the address-computing code of the
previous section.

++ Add self-tests

Lastly, self-tests are added to entry_from_vm86.c to exercise the most
typical use cases of UMIP-protected instructions in virtual-8086 mode.
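
To illustrate the kind of inspection described under "Emulate UMIP
instructions", here is a minimal user-space sketch that classifies an
instruction from its opcode bytes and the reg field of ModRM. It is not
the code in the patches: the enum, the function name umip_classify and
the assumption that the buffer starts at the 0x0f escape byte (any
prefixes already skipped) are all hypothetical. The encodings themselves
(0F 00 /0 SLDT, 0F 00 /1 STR, 0F 01 /0 SGDT, 0F 01 /1 SIDT, 0F 01 /4
SMSW) are the ones listed in the Intel SDM.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not identifiers from the patches. */
enum umip_insn {
	UMIP_NONE = 0,
	UMIP_SGDT,
	UMIP_SIDT,
	UMIP_SLDT,
	UMIP_SMSW,
	UMIP_STR,
};

/*
 * Classify the instruction bytes in 'buf'. The sketch assumes that 'buf'
 * points at the 0x0f escape byte, i.e. that prefixes were already skipped.
 */
static enum umip_insn umip_classify(const uint8_t *buf, size_t len)
{
	uint8_t modrm, mod, reg;

	/* Need the two opcode bytes plus the ModRM byte. */
	if (len < 3 || buf[0] != 0x0f)
		return UMIP_NONE;

	modrm = buf[2];
	mod = modrm >> 6;		/* bits 7:6 */
	reg = (modrm >> 3) & 0x7;	/* bits 5:3 select the instruction */

	switch (buf[1]) {
	case 0x00:			/* opcode group 6 */
		if (reg == 0)
			return UMIP_SLDT;
		if (reg == 1)
			return UMIP_STR;
		break;
	case 0x01:			/* opcode group 7 */
		if (reg == 4)
			return UMIP_SMSW;
		/*
		 * SGDT and SIDT only take memory operands. With mod == 3
		 * these encodings belong to other instructions.
		 */
		if (mod == 3)
			break;
		if (reg == 0)
			return UMIP_SGDT;
		if (reg == 1)
			return UMIP_SIDT;
		break;
	}

	return UMIP_NONE;
}
```

For example, the bytes 0f 01 e0 (smsw %ax) are classified as UMIP_SMSW,
while 0f 01 c1 is rejected because the register form of 0F 01 /0 is not
SGDT.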

++ Extensive tests

Extensive tests were performed to test all the combinations of ModRM,
SiB and displacements for 16-bit and 32-bit encodings for the ss, ds,
es, fs and gs segments. Tests also include a 64-bit program that uses
segmentation via fs and gs. For this purpose, I temporarily, and not as
part of this patchset, enabled UMIP support for 64-bit processes with
the intention to test the computations of linear addresses in 64-bit
mode, including the extra R8-R15 registers. Extensive tests are also
implemented for virtual-8086 tasks. Code of these tests can be found
here [10] and here [11].

[1]. https://lwn.net/Articles/705877/
[2]. https://lkml.org/lkml/2016/12/23/265
[3]. https://lkml.org/lkml/2017/1/25/622
[4]. https://lkml.org/lkml/2017/2/23/40
[5]. https://lkml.org/lkml/2017/3/3/678
[7]. https://www.winehq.org/
[8]. https://www.winehq.org/pipermail/wine-devel/2016-November/115320.html
[9]. http://www.dosemu.org/
[9]. http://marc.info/?l=linux-kernel&m=147876798717927&w=2
[10]. https://github.com/01org/luv-yocto/tree/rneri/umip/meta-luv/recipes-core/umip/files
[11]. https://github.com/01org/luv-yocto/commit/a72a7fe7d68693c0f4100ad86de6ecabde57334f#diff-3860c136a63add269bce4ea50222c248R1

Thanks and BR,
Ricardo

Changes since V5:
* Relocate the page fault error code enumerations to traps.h

Changes since V4:
* Audited patches to use braces in all the branches of conditional
  statements, except those in which the conditional action only takes
  one line.
* Implemented support in 64-bit builds for both 32-bit and 64-bit tasks
  in the instruction evaluating library.
* Split segment selector function in the instruction evaluating library
  into two functions to resolve the segment type by instruction override
  or default and a separate function to actually read the segment
  selector.
* Fixed a bug when evaluating 32-bit effective addresses with 64-bit
  kernels.
* Split patches further for easier review.
* Use signed variables for computation of effective address.
* Fixed issue with a spurious static modifier in function
  insn_get_addr_ref found by kbuild test bot.
* Removed comparison between true and fixup_umip_exception.
* Reworked check logic when identifying erroneous vs invalid values of
  the SiB base and index.

Changes since V3:
* Limited emulation to 32-bit and 16-bit modes. For 64-bit mode, a
  general protection fault is still issued when UMIP-protected
  instructions are executed with CPL > 0.
* Expanded instruction-evaluating code to obtain segment descriptors
  along with their attributes such as base address and default address
  and operand sizes. Also, support for 16-bit encodings in protected
  mode was implemented.
* When getting a segment descriptor, this includes support to obtain
  those of a local descriptor table.
* Now the instruction-evaluating code returns -EDOM when the value of
  registers should not be used in calculating the effective address.
  The value -EINVAL is left for errors.
* Incorporate the value of the segment base address in the computation
  of linear addresses.
* Renamed new instruction evaluation library from insn-kernel.c to
  insn-eval.c
* Exported functions insn_get_reg_offset_* to obtain the register
  offset by ModRM r/m, SiB base and SiB index.
* Improved documentation of functions.
* Split patches further for easier review.

Changes since V2:
* Added new utility functions to decode the memory addresses contained
  in registers when the 16-bit addressing encodings are used. This
  includes code to obtain and compute memory addresses using segment
  selectors for real-mode address translation.
* Added support to emulate UMIP-protected instructions for virtual-8086
  tasks.
* Added self-tests for virtual-8086 mode that contain representative
  use cases: address represented as a displacement, address in
  registers and registers as operands.
* Instead of maintaining a static variable for the dummy base addresses
  of the IDT and GDT, a hard-coded value is used.
* The emulated SMSW instructions now return the value with which the
  CR0 register is programmed in head_32/64.S. That is: PE | MP | ET |
  NE | WP | AM. For x86_64, PG is also enabled.
* The new file arch/x86/lib/insn-utils.c is now renamed as
  arch/x86/lib/insn-kernel.c. It also has its own header. This helps
  keep the kernel and objtool instruction decoders in sync. Also, the
  new insn-kernel.c contains utility functions that are only relevant
  in a kernel context.
* Removed printed warnings for errors that occur when decoding
  instructions with invalid operands.
* Added more comments on fixes in the instruction-decoding MPX
  functions.
* Now user_64bit_mode(regs) is used instead of
  test_thread_flag(TIF_IA32) to determine if the task is 32-bit or
  64-bit.
* Found and fixed a bug in insn-decoder in which X86_MODRM_RM was
  incorrectly used to obtain the mod part of the ModRM byte.
* Added more explanatory comments in emulation and instruction decoding
  code. This includes a comment noting that copy_from_user could fail
  if there is a memory protection key in place.
* Tested code with CONFIG_X86_DECODER_SELFTEST=y and everything passes
  now.
* Prefixed get_reg_offset_rm with insn_ as this function is exposed via
  a header file. For clarity, this function was added in a separate
  patch.

Changes since V1:
* Virtual-8086 mode tasks are not treated in a special manner. All code
  for this purpose was removed.
* Instead of attempting to disable UMIP during a context switch or when
  entering virtual-8086 mode, UMIP remains enabled all the time.
  General protection faults that occur are fixed-up by returning dummy
  values as detailed above.
* Removed umip= kernel parameter in favor of using clearcpuid=514 to
  disable UMIP.
* Removed selftests designed to detect the absence of SIGSEGV signals
  when running in virtual-8086 mode.
* Reused code from MPX to decode instruction operands. For this
  purpose, code was put in a common location.
* Fixed two bugs in MPX code that decodes operands.
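
As a worked illustration of the SMSW value mentioned in the change log
(PE | MP | ET | NE | WP | AM, plus PG on x86_64), the following sketch
composes it from the CR0 bit positions given earlier in this letter
(0, 1, 4, 5, 16, 18) and bit 31 for PG. The macro and function names are
hypothetical and are not the identifiers used in the patches; the sketch
only shows the arithmetic.

```c
#include <stdint.h>

/* CR0 bit positions; the macro names are made up for this sketch. */
#define SK_CR0_PE	(UINT32_C(1) << 0)	/* Protection Enable */
#define SK_CR0_MP	(UINT32_C(1) << 1)	/* Monitor Coprocessor */
#define SK_CR0_ET	(UINT32_C(1) << 4)	/* Extension Type */
#define SK_CR0_NE	(UINT32_C(1) << 5)	/* Numeric Error */
#define SK_CR0_WP	(UINT32_C(1) << 16)	/* Write Protect */
#define SK_CR0_AM	(UINT32_C(1) << 18)	/* Alignment Mask */
#define SK_CR0_PG	(UINT32_C(1) << 31)	/* Paging */

/* Hypothetical helper: dummy CR0 value reported for a register operand. */
static uint32_t umip_dummy_cr0(int with_paging)
{
	uint32_t cr0 = SK_CR0_PE | SK_CR0_MP | SK_CR0_ET |
		       SK_CR0_NE | SK_CR0_WP | SK_CR0_AM;

	if (with_paging)
		cr0 |= SK_CR0_PG;

	return cr0;
}

/* For memory operands SMSW stores only the low 16 bits. */
static uint16_t umip_dummy_msw(void)
{
	return (uint16_t)umip_dummy_cr0(0);
}
```

With these definitions, umip_dummy_cr0(0) evaluates to 0x00050033,
umip_dummy_cr0(1) to 0x80050033, and umip_dummy_msw() to 0x0033.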

Ricardo Neri (21):
  x86/mpx: Use signed variables to compute effective addresses
  x86/mpx: Do not use SIB index if index points to R/ESP
  x86/mpx: Do not use R/EBP as base in the SIB byte with Mod = 0
  x86/mpx, x86/insn: Relocate insn util functions to a new insn-kernel
  x86/insn-eval: Add utility functions to get register offsets
  x86/insn-eval: Add utility functions to get segment selector
  x86/insn-eval: Add utility function to get segment descriptor
  x86/insn-eval: Add utility function to get segment descriptor base
    address
  x86/insn-eval: Add functions to get default operand and address sizes
  x86/insn-eval: Do not use R/EBP as base if mod in ModRM is zero
  insn/eval: Incorporate segment base in address computation
  x86/insn: Support both signed 32-bit and 64-bit effective addresses
  x86/insn-eval: Add support to resolve 16-bit addressing encodings
  x86/insn-eval: Add wrapper function for 16-bit and 32-bit address
    encodings
  x86/mm: Relocate page fault error codes to traps.h
  x86/cpufeature: Add User-Mode Instruction Prevention definitions
  x86: Add emulation code for UMIP instructions
  x86/umip: Force a page fault when unable to copy emulated result to
    user
  x86/traps: Fixup general protection faults caused by UMIP
  x86: Enable User-Mode Instruction Prevention
  selftests/x86: Add tests for User-Mode Instruction Prevention

 arch/x86/Kconfig                              |  10 +
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/disabled-features.h      |   8 +-
 arch/x86/include/asm/insn-eval.h              |  23 +
 arch/x86/include/asm/traps.h                  |  18 +
 arch/x86/include/asm/umip.h                   |  15 +
 arch/x86/include/uapi/asm/processor-flags.h   |   2 +
 arch/x86/kernel/Makefile                      |   1 +
 arch/x86/kernel/cpu/common.c                  |  16 +-
 arch/x86/kernel/traps.c                       |   4 +
 arch/x86/kernel/umip.c                        | 298 +++++++++
 arch/x86/lib/Makefile                         |   2 +-
 arch/x86/lib/insn-eval.c                      | 832 ++++++++++++++++++++++++++
 arch/x86/mm/fault.c                           |  88 ++-
 arch/x86/mm/mpx.c                             | 120 +---
 tools/testing/selftests/x86/entry_from_vm86.c |  39 +-
 16 files changed, 1301 insertions(+), 176 deletions(-)
 create mode 100644 arch/x86/include/asm/insn-eval.h
 create mode 100644 arch/x86/include/asm/umip.h
 create mode 100644 arch/x86/kernel/umip.c
 create mode 100644 arch/x86/lib/insn-eval.c

--
2.9.3