Re: [v3 PATCH 05/10] x86/insn-kernel: Add support to resolve 16-bit addressing encodings

Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx> · Thu, 26 Jan 2017 19:44:23 -0800

On Thu, 2017-01-26 at 09:05 -0800, Andy Lutomirski wrote:
> On Wed, Jan 25, 2017 at 9:50 PM, Ricardo Neri
> <ricardo.neri-calderon@xxxxxxxxxxxxxxx> wrote:
> > On Wed, 2017-01-25 at 13:58 -0800, Andy Lutomirski wrote:
> >> On Wed, Jan 25, 2017 at 12:23 PM, Ricardo Neri
> >> <ricardo.neri-calderon@xxxxxxxxxxxxxxx> wrote:
> >> > Tasks running in virtual-8086 mode will use 16-bit addressing form
> >> > encodings as described in the Intel 64 and IA-32 Architecture Software
> >> > Developer's Manual Volume 2A Section 2.1.5. 16-bit addressing encodings
> >> > differ in several ways from the 32-bit/64-bit addressing form encodings:
> >> > the r/m part of the ModRM byte points to different registers and, in some
> >> > cases, addresses can be indicated by the addition of the value of two
> >> > registers. Also, there is no support for SIB bytes. Thus, a separate
> >> > function is needed to parse this form of addressing.
> >> >
> >> > Furthermore, virtual-8086 mode tasks will use real-mode addressing. This
> >> > implies that the segment selectors do not point to a segment descriptor
> >> > but are used to compute logical addresses. Hence, there is a need to
> >> > add support to compute addresses using the segment selectors. If segment-
> >> > override prefixes are present in the instructions, they take precedence.
> >> >
> >> > Lastly, it is important to note that when a task is running in virtual-
> >> > 8086 mode and an interrupt/exception occurs, the CPU pushes to the stack
> >> > the segment selectors for ds, es, fs and gs. These are accessible via the
> >> > struct kernel_vm86_regs rather than pt_regs.
> >> >
> >> > Code for 16-bit addressing encodings is likely to be used only by virtual-
> >> > 8086 mode tasks. Thus, this code is wrapped to be built only if the
> >> > option CONFIG_VM86 is selected.
> >>
> >> That's not true.  It's used in 16-bit protected mode, too.  And there
> >> are (ugh!) six possibilities:
> >
> > Thanks for the clarification. I will enable the decoding of addresses
> > for 16-bit as well... and test the emulation code.
> >>
> >>  - Normal 32-bit protected mode.  This should already work.
> >>  - Normal 64-bit protected mode.  This should also already work.  (I
> >> forget whether a 16-bit SS is either illegal or has no effect in this
> >> case.)
> >
> > For these two cases I am just taking the effective address that the user
> > space application provides, given that the segment selectors were set
> > beforehand (and with a base of 0).
> 
> What do you mean by the base being zero?  User code can set a nonzero
> DS base if it wants.  In 64-bit mode (user_64bit_mode(regs)), the base
> is ignored unless there's an fs or gs prefix, and in 32-bit mode the
> base is never ignored.

Yes, I take this back. At the time of writing I was thinking about the
__USER_CS and __USER_DS descriptors. You are right, the base is not
ignored.
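
To make the rule above concrete, here is a minimal sketch of how the
segment base could enter the linear address computation. It is only an
illustration under assumptions: the enum, the function names and the
parameters are hypothetical and are not taken from the patch or from
existing kernel helpers. The caller is assumed to have already looked up
the descriptor base for the segment in use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical segment identifiers; not names used by the patch. */
enum seg { SEG_CS, SEG_SS, SEG_DS, SEG_ES, SEG_FS, SEG_GS };

/*
 * Base that actually applies to an access through 'seg'.
 * In 64-bit mode the base is ignored unless the access goes through
 * FS or GS; in 32-bit protected mode the base always applies.
 */
static uint64_t applied_seg_base(bool user_64bit, enum seg seg,
				 uint64_t desc_base)
{
	if (user_64bit && seg != SEG_FS && seg != SEG_GS)
		return 0;
	return desc_base;
}

/* Linear address = applied base + effective address (offset). */
static uint64_t linear_address(bool user_64bit, enum seg seg,
			       uint64_t desc_base, uint64_t eff_addr)
{
	uint64_t la = applied_seg_base(user_64bit, seg, desc_base) + eff_addr;

	/* Outside 64-bit mode, linear addresses are 32 bits wide. */
	if (!user_64bit)
		la &= 0xffffffffULL;
	return la;
}
```

With a nonzero DS base of 0x1000 and an offset of 0x20, this sketch
yields 0x20 in 64-bit mode and 0x1020 in 32-bit protected mode.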
> 
> >
> >>  - Virtual 8086 mode
> >
> > In this case I calculate the linear address as:
> >      (segment_selector << 4) + effective address.
> >
> >>  - Normal 16-bit protected mode, used by DOSEMU and Wine.  (16-bit CS,
> >> 16-bit address segment)
> >>  - 16-bit CS, 32-bit address segment.  IIRC this might be used by some
> >> 32-bit DOS programs to call BIOS.
> >>  - 32-bit CS, 16-bit address segment.  I don't know whether anything uses this.
> >
> > In all these protected modes, are you referring to the size in bits of
> > the base address in the descriptor selected in the CS register? In
> > such a case I would need to get the base address and add it to the
> > effective address given in the operands of the instructions, right?
> 
> No, I'm referring to the D/B bit.  I'm a bit fuzzy on exactly how the
> instruction encoding works, but I think that 16-bit x86 code is
> encoded just like real mode code except that the selectors are used
> for real.

I see. I have the logic to differentiate between 16-bit and 32-bit
addresses. What I am missing is code to look at the value of this bit
and any potential operand overrides. I will work on adding this.
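
For reference, a rough sketch of how the default address size could be
derived from the code segment attributes and then flipped by a prefix.
The function and parameter names are hypothetical. It assumes that the
relevant prefix for addressing is the address-size override (0x67), and
that the caller has already read the L and D bits from the CS
descriptor. Virtual-8086 mode behaves like the D = 0 case.

```c
#include <stdbool.h>

/*
 * Return the address size in bytes (2, 4 or 8) for one instruction.
 *
 * cs_l:             L bit of the CS descriptor (64-bit code segment)
 * cs_d:             D bit of the CS descriptor (default size, 1 = 32-bit)
 * addr_size_prefix: true if the instruction carries a 0x67 prefix
 */
static int insn_address_bytes(bool cs_l, bool cs_d, bool addr_size_prefix)
{
	if (cs_l)		/* 64-bit code: default 64, 0x67 selects 32 */
		return addr_size_prefix ? 4 : 8;
	if (cs_d)		/* 32-bit code: default 32, 0x67 selects 16 */
		return addr_size_prefix ? 2 : 4;
	/* 16-bit code (or vm86): default 16, 0x67 selects 32 */
	return addr_size_prefix ? 4 : 2;
}
```

A result of 2 would route the instruction to the 16-bit ModRM parsing
path; 4 or 8 would use the existing 32-bit/64-bit path.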
> 
> >> size, but I suspect you'll need to handle 16-bit CS.
> >
> > Unless I am missing what is special with the 16-bit base address, I
> > would only need to add that base address to whatever effective address
> > (aka, offset) is encoded in the ModRM and displacement bytes.
> 
> Exactly.  (And make sure the instruction decoder can decode 16-bit
> instructions correctly.)

Will do. I have tested my 16-bit decoding extensively. I think it will
work.
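
As an illustration of what the 16-bit path has to compute, below is a
small standalone sketch of the 16-bit ModRM addressing forms (SDM
Volume 2A, Section 2.1.5) followed by the virtual-8086 linear address
formula mentioned earlier in the thread. It is not the code from the
patch: the struct and the function names are hypothetical, and the
displacement is assumed to be already fetched and sign-extended by the
caller.

```c
#include <stdint.h>

/* Hypothetical container for the registers used by 16-bit addressing. */
struct regs16 {
	uint16_t bx, bp, si, di;
};

/*
 * Compute the 16-bit effective address for a ModRM byte.
 * 'disp' is the displacement, sign-extended to 16 bits (disp8 case).
 * Returns 0 on success, -1 if mod == 3 (register operand, no address).
 */
static int ea16_from_modrm(uint8_t modrm, int16_t disp,
			   const struct regs16 *r, uint16_t *ea)
{
	uint8_t mod = modrm >> 6;
	uint8_t rm = modrm & 7;
	uint16_t base = 0;

	if (mod == 3)
		return -1;

	switch (rm) {
	case 0: base = r->bx + r->si; break;
	case 1: base = r->bx + r->di; break;
	case 2: base = r->bp + r->si; break;
	case 3: base = r->bp + r->di; break;
	case 4: base = r->si; break;
	case 5: base = r->di; break;
	case 6: base = (mod == 0) ? 0 : r->bp; break; /* mod 0: disp16 only */
	case 7: base = r->bx; break;
	}

	/* mod == 0 carries no displacement, except the rm == 6 disp16 form. */
	if (mod == 0 && rm != 6)
		disp = 0;

	/* The effective address wraps around at 64 KiB. */
	*ea = (uint16_t)(base + (uint16_t)disp);
	return 0;
}

/* Virtual-8086 mode: linear = (selector << 4) + effective address. */
static uint32_t vm86_linear(uint16_t sel, uint16_t ea)
{
	return ((uint32_t)sel << 4) + ea;
}
```

The sketch leaves out the choice of default segment (SS when BP is part
of the address, DS otherwise) and segment-override prefixes; in 16-bit
protected mode the (selector << 4) step would be replaced by the base
read from the segment descriptor.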

Thanks and BR,
Ricardo
