On 03/15/2010 12:54 PM, Antoine Leca wrote:
When doing switch, the cached segment selectors are preserved,
which allows one to use protected mode segments in real-address mode
(this is called unreal mode).
Now this is a by-product of the implementation inside the BIOS.
In fact, even if the BIOS enters unreal mode (or the similar big real,
more useful with segmentation-less architectures), before turning back
to the client it (should) reset things to normal real mode, as service
15/87 is not a usual way to enter unreal mode (for example, this effect
is not even mentioned in Ralf Brown's list).
The entry into unreal mode is unintentional; the BIOS is transitioning
to protected mode and 'unreal mode' only exists for a few instructions,
IIRC.
As a result (and first and foremost because of 80286 compatibility),
instead of directly using unreal or big real mode where possible (as
done e.g. in himem.sys), the Minix monitor takes great pains to go back
to square one: since blocks are at most 64 KB in size and several
iterations are needed, for each block Minix sets up the (very similar)
GDT again, then makes another call to the same BIOS service 15/87.
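
The per-block setup described above can be sketched roughly as follows.
This is only an illustration, not the Minix monitor's code: the names
set_desc, build_1587_gdt and chunk_len are made up for this example, and
the layout assumed is the usual 48-byte table passed to service 15/87
(source descriptor at offset 10h, destination at 18h, the other entries
left zero for the BIOS to fill in).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fill one 8-byte descriptor: 16-bit limit, 24-bit base, access byte,
 * then (on 386+) a flags byte and base bits 31..24. */
static void set_desc(uint8_t *d, uint32_t base, uint32_t len)
{
    uint16_t limit;

    assert(len >= 1 && len <= 0x10000u);   /* one call moves at most 64 KB */
    limit = (uint16_t)(len - 1);

    d[0] = (uint8_t)(limit & 0xFF);
    d[1] = (uint8_t)(limit >> 8);
    d[2] = (uint8_t)(base & 0xFF);
    d[3] = (uint8_t)((base >> 8) & 0xFF);
    d[4] = (uint8_t)((base >> 16) & 0xFF);
    d[5] = 0x93;                           /* present, ring 0, writable data */
    d[6] = 0;                              /* byte granular, limit 19..16 = 0 */
    d[7] = (uint8_t)((base >> 24) & 0xFF);
}

/* Build the table for one block: only the source (offset 0x10) and
 * destination (offset 0x18) entries are supplied by the caller. */
static void build_1587_gdt(uint8_t gdt[48], uint32_t src, uint32_t dst,
                           uint32_t len)
{
    memset(gdt, 0, 48);
    set_desc(gdt + 0x10, src, len);
    set_desc(gdt + 0x18, dst, len);
}

/* Size of the next block when copying 'remaining' bytes in 64 KB steps. */
static uint32_t chunk_len(uint32_t remaining)
{
    return remaining > 0x10000u ? 0x10000u : remaining;
}
```

A caller would loop while bytes remain: take chunk_len(remaining),
rebuild the table with the advanced source and destination addresses,
and issue the BIOS call again with CX set to the word count.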
I knew these parts before, but this is where Avi's answer came in: KVM
on Intel does not yet support unreal mode and requires the cached
segment descriptors to be valid in real-address mode.
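
One way to read "valid in real-address mode" is that the hidden part of
each segment register must look exactly as a real-mode load would have
left it: base equal to the selector times 16 and a 64 KB limit. The
sketch below encodes that reading. It is an assumption made for
illustration only, not KVM's actual check, and struct seg_cache and
real_mode_compatible are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Visible selector plus the cached (hidden) base and limit. */
struct seg_cache {
    uint16_t selector;
    uint32_t base;
    uint32_t limit;
};

/* True if the cached values are what a real-mode segment load would
 * produce: base = selector << 4 and limit = 0xFFFF. */
static bool real_mode_compatible(const struct seg_cache *s)
{
    return s->base == ((uint32_t)s->selector << 4) && s->limit == 0xFFFFu;
}
```

Under this reading, a base that is not a multiple of 16 can never pass,
whatever the selector value, because selector << 4 always has its low
four bits clear.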
I do not know which virtual BIOS KVM is using, but I noticed the
following while reading
http://bochs.sourceforge.net/cgi-bin/lxr/source/bios/rombios.c:
[ Slightly edited to fit the width of my post. AL. ]
3555 case 0x87:
3556 #if BX_CPU < 3
3557 # error "Int15 function 87h not supported on < 80386"
3558 #endif
3559 // +++ should probably have descriptor checks
3560 // +++ should have exception handlers
...
3640 mov eax, cr0
3641 or al, #0x01
3642 mov cr0, eax
3643 ;; far jump to flush CPU queue after transition to prot. mode
3644 JMP_AP(0x0020, protected_mode)
3645
3646 protected_mode:
3647 ;; GDT points to valid descriptor table, now load SS, DS, ES
3648 mov ax, #0x28 ;; 101 000 = 5th desc.in table, TI=GDT,RPL=00
3649 mov ss, ax
3650 mov ax, #0x10 ;; 010 000 = 2nd desc.in table, TI=GDT,RPL=00
3651 mov ds, ax
3652 mov ax, #0x18 ;; 011 000 = 3rd desc.in table, TI=GDT,RPL=00
3653 mov es, ax
3654 xor si, si
3655 xor di, di
3656 cld
3657 rep
3658 movsw ;; move CX words from DS:SI to ES:DI
3659
3660 ;; make sure DS and ES limits are 64KB
3661 mov ax, #0x28
3662 mov ds, ax
3663 mov es, ax
3664
3665 ;; reset PG bit in CR0 ???
3666 mov eax, cr0
3667 and al, #0xFE
3668 mov cr0, eax
I must be missing something here... There is no unreal mode at any
moment, is there?
[ ... some web browsing occurring meanwhile ... Later: ]
Okay, now I got another picture. 8-|
Until recently, KVM (and qemu) used the Bochs BIOS, shown above; but
they recently switched to SeaBIOS... where the applicable code is in
src/system.c, and looks like this (now in AT&T assembly syntax):
static void
handle_1587(struct bregs *regs)
{
    // +++ should probably have descriptor checks
    // +++ should have exception handlers
....
        // Enable protected mode
        "  movl %%cr0, %%eax\n"
        "  orl $" __stringify(CR0_PE) ", %%eax\n"
        "  movl %%eax, %%cr0\n"

        // far jump to flush CPU queue after transition to prot. mode
        "  ljmpw $(4<<3), $1f\n"

        // GDT points to valid descriptor table, now load DS, ES
        "1:movw $(2<<3), %%ax\n"
              // 2nd descriptor in table, TI=GDT, RPL=00
        "  movw %%ax, %%ds\n"
        "  movw $(3<<3), %%ax\n"
              // 3rd descriptor in table, TI=GDT, RPL=00
        "  movw %%ax, %%es\n"

        // move CX words from DS:SI to ES:DI
        "  xorw %%si, %%si\n"
        "  xorw %%di, %%di\n"
        "  rep movsw\n"

        // Disable protected mode
        "  movl %%cr0, %%eax\n"
        "  andl $~" __stringify(CR0_PE) ", %%eax\n"
        "  movl %%eax, %%cr0\n"
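
Both listings name the same GDT slots, once as raw selector values
(0x10, 0x18, 0x20, 0x28) and once as shifted indices ((2<<3), (3<<3),
(4<<3)). A small sketch of how a selector splits into its fields may
help when comparing them; struct selector_fields and decode_selector
are names invented for this example.

```c
#include <stdint.h>

/* The three fields packed into a 16-bit segment selector. */
struct selector_fields {
    unsigned index;   /* descriptor index, counting from 0 (bits 15..3) */
    unsigned ti;      /* table indicator: 0 = GDT, 1 = LDT (bit 2)      */
    unsigned rpl;     /* requested privilege level (bits 1..0)          */
};

static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.index = (unsigned)sel >> 3;
    f.ti    = ((unsigned)sel >> 2) & 1u;
    f.rpl   = (unsigned)sel & 3u;
    return f;
}
```

With this, 0x28 decodes to index 5 in the GDT with RPL 0, and (2<<3)
is the same selector as the 0x10 loaded into DS in the Bochs code.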
Note that while the basic scheme is the same, the "cleaning up" of lines
3660-3663 "make sure DS and ES limits are 64KB" is not present.
IIUC, the virtualized CPU goes back to real mode with those segments
still set as they were in protected mode, and yes, with the Minix boot
monitor they happened NOT to be paragraph-aligned.
Is it possible to add back this "cleaning up" to the BIOS used in KVM?
I think so. This is a longstanding kvm bug, but I can't see any
downsides to a workaround in the BIOS.
--
error compiling committee.c: too many arguments to function