>>Ok earlier you said:
>>"At more than 1GiB on Linux/x86, you must use a 4G+4G kernel
>>(this is the default) to see more than 960MiB. This causes a
>>significant (10%+) performance hit. On more than 4GiB, it is
>>worsened as more extensive paging is used."
>
> Note I said "Linux/x86" and _not_ "Linux/x86-64". :)
>
>>where does the performance hit for 4G/4G on Intel (whether
>>ia32e or not) come from?
>
> The performance hit is for _all_ IA-32 compatible architectures running
> Linux/x86, because there is definitely a hit.
>
> There's a hit for the 4G+4G HIGHMEM model.
> And there is another, bigger one if you go 64G model (more than 4GiB
> user).
>
> As far as _both_ Intel IA-32 on Linux/x86 _and_ Intel IA-32e (EM64T) on
> Linux/x86-64, you _always_ have "bounce buffers" (c/o the Soft I/O MMU,
> Soft IOTLB in Linux/x86-64 on EM64T) if you are doing a transfer between
> two memory areas -- e.g., user memory and memory mapped I/O -- when
> _one_ area is above 4GiB. No way around that, and a major problem with
> Intel right now.

Right, so if I have 2G of RAM, I want a 2G/2G (kernel/user) split instead
of 1G/3G, so that I don't have to turn on HIGHMEM and can thus avoid the
HIGHMEM penalty.

> x86-64 (AMD64) on Linux/x86-64 uses its I/O MMU hardware to drastically
> improve the performance. There were a few bugs early on, but most of
> them have been resolved.

Does that mean that Linux on AMD64 does not do ZONE_NORMAL <->
ZONE_HIGHMEM buffering/paging?
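
To make the split question concrete, here is a rough sketch of the arithmetic as I understand it. It assumes the kernel keeps 128 MiB of its virtual address space back for vmalloc and similar mappings, which I believe is the usual default on 32-bit x86 but is not something stated above; the function names and the constant are purely illustrative, not kernel symbols.

```c
#include <assert.h>

/* Assumption: MiB of kernel virtual address space held back for
 * vmalloc/ioremap-style mappings (not taken from the thread above). */
#define VMALLOC_RESERVE_MIB 128u

/* RAM (in MiB) the kernel can map directly ("lowmem"), given the
 * kernel's share of the 4 GiB virtual address space. */
static unsigned lowmem_mib(unsigned kernel_virt_mib, unsigned ram_mib)
{
    assert(kernel_virt_mib > VMALLOC_RESERVE_MIB);
    unsigned max_direct = kernel_virt_mib - VMALLOC_RESERVE_MIB;
    return ram_mib < max_direct ? ram_mib : max_direct;
}

/* RAM (in MiB) left over that is reachable only through HIGHMEM. */
static unsigned highmem_mib(unsigned kernel_virt_mib, unsigned ram_mib)
{
    return ram_mib - lowmem_mib(kernel_virt_mib, ram_mib);
}
```

With 2048 MiB of RAM, lowmem_mib(1024, 2048) comes out at 896 and lowmem_mib(2048, 2048) at 1920, so if that reserve assumption holds, even a 2G/2G split would leave about 128 MiB reachable only via HIGHMEM.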