Re: Some words of encouragement

David Given <dg@xxxxxxxxxxx> · Sat, 25 Feb 2012 12:53:31 +0000

On 25/02/12 06:27, Brad Normand wrote:
[...]
> Anything that targets "normal" x86-16 with bigger than 64KB data
> chunks needs to understand arithmetic on far addresses, and I wonder
> how hard it would be to do some sort of virtual linearization here,
> basically instead of a far pointer being effectively 20 bits of
> address inside 32 bits of storage, just make them flat 32 bits.

Well, Watcom does support this --- it's called the huge memory model.

Here's a simple memcpy implementation:

void copy(char* dest, const char* src, int length)
{
	while (length--)
		*dest++ = *src++;
}

Here's what Watcom generates for the small model:

0000                          @copy:
0000    56                        push        si
0001    57                        push        di
0002    89 C7                     mov         di,ax
0004    89 D6                     mov         si,dx
0006                          L$1:
0006    4B                        dec         bx
0007    83 FB FF                  cmp         bx,0xffff
000A    74 08                     je          L$2
000C    8A 04                     mov         al,byte ptr [si]
000E    88 05                     mov         byte ptr [di],al
0010    46                        inc         si
0011    47                        inc         di
0012    EB F2                     jmp         L$1
0014                          L$2:
0014    5F                        pop         di
0015    5E                        pop         si
0016    C3                        ret

And here's the huge model version:

0000                          @copy:
0000    56                        push        si
0001    57                        push        di
0002    55                        push        bp
0003    89 E5                     mov         bp,sp
0005    83 EC 02                  sub         sp,0x0002
0008    C4 7E 0A                  les         di,dword ptr 0xa[bp]
000B    C5 76 0E                  lds         si,dword ptr 0xe[bp]
000E    89 46 FE                  mov         word ptr -0x2[bp],ax
0011                          L$1:
0011    FF 4E FE                  dec         word ptr -0x2[bp]
0014    83 7E FE FF               cmp         word ptr -0x2[bp],0xffff
0018    74 2B                     je          L$2
001A    8A 04                     mov         al,byte ptr [si]
001C    26 88 05                  mov         byte ptr es:[di],al
001F    89 F0                     mov         ax,si
0021    8C DA                     mov         dx,ds
0023    BB 01 00                  mov         bx,0x0001
0026    31 C9                     xor         cx,cx
0028    9A 00 00 00 00            call        __PIA
002D    89 C6                     mov         si,ax
002F    8E DA                     mov         ds,dx
0031    89 F8                     mov         ax,di
0033    8C C2                     mov         dx,es
0035    BB 01 00                  mov         bx,0x0001
0038    31 C9                     xor         cx,cx
003A    9A 00 00 00 00            call        __PIA
003F    89 C7                     mov         di,ax
0041    8E C2                     mov         es,dx
0043    EB CC                     jmp         L$1
0045                          L$2:
0045    89 EC                     mov         sp,bp
0047    5D                        pop         bp
0048    5F                        pop         di
0049    5E                        pop         si
004A    CA 08 00                  retf        0x0008

So more than three times the size (77 bytes against 23) *and* it's
having to call off to an external routine to do pointer arithmetic. But
you do get standard 32-bit pointer semantics with arbitrarily sized data
structures.

There's a compromise, large model, where the programmer promises that no
single data structure is bigger than 64kB. This means that the compiler
can represent any pointer as a segment+offset pair, and do sane pointer
arithmetic with just the offset, which is much cheaper; the large model
version of the above is only 30 bytes.
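
To make the difference between the two kinds of pointer arithmetic
concrete, here is a rough sketch in portable C. It only models the
arithmetic; far_ptr, huge_add and large_add are made-up names, and
huge_add is an assumption about what a huge pointer add has to achieve,
not a transcription of what Watcom's __PIA actually does.

```c
#include <assert.h>
#include <stdint.h>

/* A real-mode far pointer: 16-bit segment plus 16-bit offset. */
typedef struct {
    uint16_t seg;
    uint16_t off;
} far_ptr;

/* Linear address in real mode: segment * 16 + offset. */
uint32_t far_to_linear(far_ptr p)
{
    return ((uint32_t)p.seg << 4) + p.off;
}

/* Huge-style add (hypothetical helper): go through the linear address
 * and renormalise so that the offset ends up in the range 0..15. */
far_ptr huge_add(far_ptr p, uint32_t n)
{
    uint32_t linear = far_to_linear(p) + n;
    far_ptr r;

    assert(linear <= 0xFFFFFu);   /* stay inside the 1MB address space */
    r.seg = (uint16_t)(linear >> 4);
    r.off = (uint16_t)(linear & 0xFu);
    return r;
}

/* Large-style add (hypothetical helper): only the offset changes, so
 * the result silently wraps around at the 64kB segment boundary. */
far_ptr large_add(far_ptr p, uint16_t n)
{
    far_ptr r = p;

    r.off = (uint16_t)(p.off + n);
    return r;
}
```

Starting from 1000:FFFF, huge_add by one lands on linear address
0x20000, while large_add by one wraps back to 1000:0000. That wrap is
exactly why the large model needs the promise about 64kB objects.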

[...]
> This sounds like it'd be a good way to help ferret out bcc compiler
> bugs or bypass them entirely, plus from what I've read, Watcom is one
> of the better compilers to target 16 bit x86 in general.  There may be
> some hints in FreeDOS, as openwatcom is one of the supported compilers
> for the kernel there.

Unfortunately it seems that OMF object files can't represent pointer
differences. Which means I can't do this to emit the ELKS executable header:

	dw __tend, 0           ; size of text segment in bytes
	dw _edata, 0           ; size of data segment in bytes
	dw _end - _edata, 0    ; size of bss segment in bytes
	dw _cstart_, 0         ; entry point
	dw 65535, 0            ; chmem
	dw 0, 0                ; size of symbol table

The '_end - _edata' is silently accepted by wasm but evaluates to 0.
Which is nice. nasm was more informative (and has a saner syntax; I'd
forgotten how loathsome masm syntax is).
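
One way to sidestep the assembler would be to compute the difference in
a separate step once the symbol values are known. The following is only
a sketch under that assumption: it fills in the six fields from the
fragment above, each stored as a little-endian 32-bit value (which is
what "dw value, 0" produces). The function and parameter names are
invented for illustration, and where the symbol values come from (a
linker map, a parsed object file) is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order. */
void put_le32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v & 0xFF);
    out[1] = (uint8_t)((v >> 8) & 0xFF);
    out[2] = (uint8_t)((v >> 16) & 0xFF);
    out[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Fill the six header fields shown in the fragment (24 bytes in all).
 * tend, edata, end and entry stand for the values of __tend, _edata,
 * _end and _cstart_; the names here are hypothetical. */
size_t write_header_fields(uint8_t out[24], uint16_t tend,
                           uint16_t edata, uint16_t end, uint16_t entry)
{
    assert(end >= edata);                        /* bss cannot be negative */

    put_le32(out + 0,  tend);                    /* size of text segment */
    put_le32(out + 4,  edata);                   /* size of data segment */
    put_le32(out + 8,  (uint32_t)(end - edata)); /* size of bss segment */
    put_le32(out + 12, entry);                   /* entry point */
    put_le32(out + 16, 65535);                   /* chmem */
    put_le32(out + 20, 0);                       /* size of symbol table */
    return 24;
}
```

The subtraction happens in ordinary C here, so nothing depends on the
object format being able to express a difference of two symbols.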

I'm now thinking that the sanest way to go here is (a) hack Watcom to
support ELKS executables directly; (b) write a tool to disassemble the
OMF output and convert it to as86-compatible format; (c) give up and go
to the pub...

-- 
┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
│
│ "Never attribute to malice what can be adequately explained by
│ stupidity." --- Nick Diamos (Hanlon's Razor)
