Re: [PATCH v7 18/26] x86/insn-eval: Add support to resolve 16-bit addressing encodings

Borislav Petkov <bp@xxxxxxx> · Wed, 7 Jun 2017 18:28:14 +0200

On Fri, May 05, 2017 at 11:17:16AM -0700, Ricardo Neri wrote:
> Tasks running in virtual-8086 mode or in protected mode with code
> segment descriptors that specify 16-bit default address sizes via the
> D bit will use 16-bit addressing form encodings as described in the Intel
> 64 and IA-32 Architecture Software Developer's Manual Volume 2A Section
> 2.1.5. 16-bit addressing encodings differ in several ways from the
> 32-bit/64-bit addressing form encodings: ModRM.rm points to different
> registers and, in some cases, effective addresses are indicated by the
> addition of the value of two registers. Also, there is no support for SIB
> bytes. Thus, a separate function is needed to parse this form of
> addressing.
> 
> A couple of functions are introduced. get_reg_offset_16() obtains the
> offset from the base of pt_regs of the registers indicated by the ModRM
> byte of the address encoding. get_addr_ref_16() computes the linear
> address indicated by the instructions using the value of the registers
> given by ModRM as well as the base address of the segment.
> 
> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: Adam Buchbinder <adam.buchbinder@xxxxxxxxx>
> Cc: Colin Ian King <colin.king@xxxxxxxxxxxxx>
> Cc: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
> Cc: Qiaowei Ren <qiaowei.ren@xxxxxxxxx>
> Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> Cc: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Thomas Garnier <thgarnie@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxx>
> Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> Cc: Ravi V. Shankar <ravi.v.shankar@xxxxxxxxx>
> Cc: x86@xxxxxxxxxx
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx>
> ---
>  arch/x86/lib/insn-eval.c | 155 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 155 insertions(+)
> 
> diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
> index 9822061..928a662 100644
> --- a/arch/x86/lib/insn-eval.c
> +++ b/arch/x86/lib/insn-eval.c
> @@ -431,6 +431,73 @@ static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
>  }
>  
>  /**
> + * get_reg_offset_16 - Obtain offset of register indicated by instruction

Please end function names with parentheses.

> + * @insn:	Instruction structure containing ModRM and SiB bytes

s/SiB/SIB/g

> + * @regs:	Structure with register values as seen when entering kernel mode
> + * @offs1:	Offset of the first operand register
> + * @offs2:	Offset of the second operand register, if applicable.
> + *
> + * Obtain the offset, in pt_regs, of the registers indicated by the ModRM byte
> + * within insn. This function is to be used with 16-bit address encodings. The
> + * offs1 and offs2 will be written with the offset of the two registers
> + * indicated by the instruction. In cases where any of the registers is not
> + * referenced by the instruction, the value will be set to -EDOM.
> + *
> + * Return: 0 on success, -EINVAL on failure.
> + */
> +static int get_reg_offset_16(struct insn *insn, struct pt_regs *regs,
> +			     int *offs1, int *offs2)
> +{
> +	/* 16-bit addressing can use one or two registers */
> +	static const int regoff1[] = {
> +		offsetof(struct pt_regs, bx),
> +		offsetof(struct pt_regs, bx),
> +		offsetof(struct pt_regs, bp),
> +		offsetof(struct pt_regs, bp),
> +		offsetof(struct pt_regs, si),
> +		offsetof(struct pt_regs, di),
> +		offsetof(struct pt_regs, bp),
> +		offsetof(struct pt_regs, bx),
> +	};
> +
> +	static const int regoff2[] = {
> +		offsetof(struct pt_regs, si),
> +		offsetof(struct pt_regs, di),
> +		offsetof(struct pt_regs, si),
> +		offsetof(struct pt_regs, di),
> +		-EDOM,
> +		-EDOM,
> +		-EDOM,
> +		-EDOM,
> +	};

You mean "Table 2-1. 16-Bit Addressing Forms with the ModR/M Byte" in
the SDM, right?

Please add a comment pointing to it here because it is not trivial to
map that code to the documentation.
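
To illustrate the kind of annotation I mean, here is a minimal standalone sketch of the mod != 3 rows of that table. It is not a proposal for the actual kernel code: the enum, the struct and decode_addr16() are hypothetical names made up for this example, and the caller is assumed to have handled the mod == 3 (register operand) case already.

```c
#include <stdint.h>

/* Hypothetical register identifiers, for illustration only. */
enum reg16 { REG_NONE = -1, REG_BX, REG_BP, REG_SI, REG_DI };

struct addr16_form {
	enum reg16 base;	/* first register, REG_NONE if unused */
	enum reg16 index;	/* second register, REG_NONE if unused */
};

/*
 * Indexed by ModRM.rm. See Intel SDM Vol. 2A, Table 2-1,
 * "16-Bit Addressing Forms with the ModR/M Byte" (rows with mod != 3).
 */
static const struct addr16_form addr16_forms[8] = {
	{ REG_BX, REG_SI },	/* rm = 0: [BX + SI] */
	{ REG_BX, REG_DI },	/* rm = 1: [BX + DI] */
	{ REG_BP, REG_SI },	/* rm = 2: [BP + SI] */
	{ REG_BP, REG_DI },	/* rm = 3: [BP + DI] */
	{ REG_SI, REG_NONE },	/* rm = 4: [SI] */
	{ REG_DI, REG_NONE },	/* rm = 5: [DI] */
	{ REG_BP, REG_NONE },	/* rm = 6: [BP] + disp, or disp16 alone if mod = 0 */
	{ REG_BX, REG_NONE },	/* rm = 7: [BX] */
};

/* Decode the registers used by a 16-bit memory operand (mod != 3). */
static struct addr16_form decode_addr16(uint8_t modrm)
{
	uint8_t mod = (modrm >> 6) & 0x3;
	uint8_t rm = modrm & 0x7;
	struct addr16_form form = addr16_forms[rm];

	/* mod = 0, rm = 6: the address is a bare disp16, no register at all */
	if (mod == 0 && rm == 6)
		form.base = REG_NONE;

	return form;
}
```

With one comment per row, a reader can check each entry against the SDM without having to count array positions.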

> +
> +	if (!offs1 || !offs2)
> +		return -EINVAL;
> +
> +	/* operand is a register, use the generic function */
> +	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> +		*offs1 = insn_get_modrm_rm_off(insn, regs);
> +		*offs2 = -EDOM;
> +		return 0;
> +	}
> +
> +	*offs1 = regoff1[X86_MODRM_RM(insn->modrm.value)];
> +	*offs2 = regoff2[X86_MODRM_RM(insn->modrm.value)];
> +
> +	/*
> +	 * If no displacement is indicated in the mod part of the ModRM byte,

s/"no "//

> +	 * (mod part is 0) and the r/m part of the same byte is 6, no register
> +	 * is used to calculate the operand address. An r/m part of 6 means that
> +	 * the second register offset is already invalid.
> +	 */
> +	if ((X86_MODRM_MOD(insn->modrm.value) == 0) &&
> +	    (X86_MODRM_RM(insn->modrm.value) == 6))
> +		*offs1 = -EDOM;
> +
> +	return 0;
> +}
> +
> +/**
>   * get_desc() - Obtain address of segment descriptor
>   * @sel:	Segment selector
>   *
> @@ -689,6 +756,94 @@ int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs)
>  }
>  
>  /**
> + * get_addr_ref_16() - Obtain the 16-bit address referred by instruction
> + * @insn:	Instruction structure containing ModRM byte and displacement
> + * @regs:	Structure with register values as seen when entering kernel mode
> + *
> + * This function is to be used with 16-bit address encodings. Obtain the memory
> + * address referred by the instruction's ModRM bytes and displacement. Also, the
> + * segment used as base is determined by either any segment override prefixes in
> + * insn or the default segment of the registers involved in the address
> + * computation. In protected mode, segment limits are enforced.
> + *
> + * Return: linear address referenced by instruction and registers on success.
> + * -1L on failure.
> + */
> +static void __user *get_addr_ref_16(struct insn *insn, struct pt_regs *regs)
> +{
> +	unsigned long linear_addr, seg_base_addr, seg_limit;
> +	short eff_addr, addr1 = 0, addr2 = 0;
> +	int addr_offset1, addr_offset2;
> +	int ret;
> +
> +	insn_get_modrm(insn);
> +	insn_get_displacement(insn);
> +
> +	/*
> +	 * If operand is a register, the layout is the same as in
> +	 * 32-bit and 64-bit addressing.
> +	 */
> +	if (X86_MODRM_MOD(insn->modrm.value) == 3) {
> +		addr_offset1 = get_reg_offset(insn, regs, REG_TYPE_RM);
> +		if (addr_offset1 < 0)
> +			goto out_err;

<---- newline here.

> +		eff_addr = regs_get_register(regs, addr_offset1);
> +		seg_base_addr = insn_get_seg_base(regs, insn, addr_offset1);
> +		if (seg_base_addr == -1L)
> +			goto out_err;

ditto.

> +		seg_limit = get_seg_limit(regs, insn, addr_offset1);
> +	} else {
> +		ret = get_reg_offset_16(insn, regs, &addr_offset1,
> +					&addr_offset2);
> +		if (ret < 0)
> +			goto out_err;

ditto.

> +		/*
> +		 * Don't fail on invalid offset values. They might be invalid
> +		 * because they cannot be used for this particular value of
> +		 * the ModRM. Instead, use them in the computation only if
> +		 * they contain a valid value.
> +		 */
> +		if (addr_offset1 != -EDOM)
> +			addr1 = 0xffff & regs_get_register(regs, addr_offset1);
> +		if (addr_offset2 != -EDOM)
> +			addr2 = 0xffff & regs_get_register(regs, addr_offset2);
> +		eff_addr = addr1 + addr2;

ditto.

Space those code lines out with blank lines between the logical steps; we
want to be able to read that code again at some point :-)))
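
Roughly what I have in mind, as a standalone sketch rather than the real function: eff_addr_16() and its use1/use2 flags are hypothetical stand-ins for the "offset != -EDOM" checks, and I am using an unsigned 16-bit type here only to make the wraparound explicit, not claiming the patch's types are wrong.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compute a 16-bit effective address from up to two
 * register values and a displacement. use1/use2 say whether the
 * corresponding register takes part in the address computation.
 */
static uint16_t eff_addr_16(int use1, unsigned long reg1,
			    int use2, unsigned long reg2, int32_t disp)
{
	uint16_t addr1 = 0, addr2 = 0;

	/* Only the low 16 bits of each register are relevant. */
	if (use1)
		addr1 = reg1 & 0xffff;

	if (use2)
		addr2 = reg2 & 0xffff;

	/* The sum wraps around at 64K, as 16-bit addressing does. */
	return (uint16_t)(addr1 + addr2 + (disp & 0xffff));
}
```

For example, eff_addr_16(1, 0xfff0, 1, 0x0020, 0) wraps to 0x0010, and a displacement of -2 on top of 0x1000 gives 0x0ffe.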

> +		/*
> +		 * The first register in the operand implies the SS or DS
> +		 * segment selectors, the second register in the operand can
> +		 * only imply DS. Thus, use the first register to obtain
> +		 * the segment selector.
> +		 */
> +		seg_base_addr = insn_get_seg_base(regs, insn, addr_offset1);
> +		if (seg_base_addr == -1L)
> +			goto out_err;
> +		seg_limit = get_seg_limit(regs, insn, addr_offset1);
> +
> +		eff_addr += (insn->displacement.value & 0xffff);
> +	}
> +
> +	linear_addr = (unsigned long)(eff_addr & 0xffff);
> +
> +	/*
> +	 * Make sure the effective address is within the limits of the
> +	 * segment. In long mode, the limit is -1L. Thus, the second part

Long mode in a 16-bit handling function?

> +	 * of the check always succeeds.
> +	 */
> +	if (linear_addr > seg_limit)
> +		goto out_err;
> +
> +	linear_addr += seg_base_addr;
> +
> +	/* Limit linear address to 20 bits */
> +	if (v8086_mode(regs))
> +		linear_addr &= 0xfffff;
> +
> +	return (void __user *)linear_addr;
> +out_err:
> +	return (void __user *)-1;
> +}
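
For reference, the tail of this function boils down to something like the sketch below. It is a simplified userspace model, not kernel code: linear_addr_16(), LINEAR_ADDR_ERR and the v8086 flag are hypothetical names, and the segment base and limit are assumed to have been looked up by the caller.

```c
#include <stdint.h>

#define LINEAR_ADDR_ERR ((unsigned long)-1)

/*
 * Hypothetical model of the final steps: check the effective address
 * against the segment limit, add the segment base and, in virtual-8086
 * mode, truncate the result to 20 bits.
 */
static unsigned long linear_addr_16(uint16_t eff_addr,
				    unsigned long seg_base,
				    unsigned long seg_limit, int v8086)
{
	unsigned long linear_addr = eff_addr;

	/* The offset must lie within the segment. */
	if (linear_addr > seg_limit)
		return LINEAR_ADDR_ERR;

	linear_addr += seg_base;

	/* Limit linear address to 20 bits */
	if (v8086)
		linear_addr &= 0xfffff;

	return linear_addr;
}
```

With a base of 0xffff0 and an offset of 0x0010 in virtual-8086 mode, the sum 0x100000 is masked down to 0, which is the wraparound the 20-bit limit is there to model.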
> +
> +/**
>   * _to_signed_long() - Cast an unsigned long into signed long
>   * @val		A 32-bit or 64-bit unsigned long
>   * @long_bytes	The number of bytes used to represent a long number
> -- 
> 2.9.3
> 

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)