On Tue, 2005-06-28 at 21:26 -0400, Peter Arremann wrote:
> They all have 0 to do with the problem?
> What kind of document would you accept?

One that you will _not_ find on developer.intel.com. Intel will _not_ tell you how to hack the Athlon MP to address PAE36 linearly at the TLB, because its processors can't do it prior to the Xeon MP with EM64T.

> Alright - tell me what you want to see :-)

It's somewhere in the AMD system developer manuals, possibly ones not publicly available. We're not talking board-level and we're not talking programmer-level either. We're talking about throwing the so-called "32-bit" Athlon [MP] into a mode that _breaks_ GTL, something that would _not_ normally be supported.

> Thank you - 32bit only please... we all agree that AMD64 can address
> more than 4GB without issues.

This is a hack for so-called "32-bit" Athlon MP mainboards!

The so-called "32-bit" Athlon and so-called "64-bit" Athlon 64 / Opteron are of the _same_core_ design for the _same_ platform, EV6. I posted on the 3-generation "heritage" of Intel (i386-486, Pentium, PPro-P4) and AMD (386-486, Nx586-686/K5-K6, Athlon-Opteron). You don't design cores "on-a-dime," but for 5-7 years of lifespan (although the PPro is really old, largely because Itanium _was_ Intel's 4th gen design!).

Athlon 64 / Opteron just moves more of the "traditional northbridge" into the CPU, doubles the XMM registers and makes the ALU fully 64-bit. Most of the "northbridge" changes were already in the so-called "32-bit" Athlon, because EV6 is a 3-16 point _crossbar_ switch, not a "hub" like Intel.

Because the CPUs in Athlon MP talk over _separate_, _switched_ interconnects, they must have some management units in the processors, not in the "single point-of-contention chipset." A64/Opteron merely turns this into a "partial mesh" instead of a "single switch." I.e., instead of "switching" in the "single chipset," you now "switch" in the individual CPUs.
The CPUs _always_ acted "independent" -- even in Athlon MP, right into the EV6 switch.

The addressing is still 100% the same! Even the addressing registers -- 16-bit segment + 32-bit offset -- are the _same_! There is just now an official memory model called "Long Mode" -- the segment register becomes bits 32-47. In PAE36, the segment register is bits 4-36, with bits 4-31 being a "two's complement" with the offset register.

Now that's just the "programmer" level. GTL was built for 32-bit. It made _no_sense_ for Intel to modify GTL until recently, because the underlying PAE36 model required paging in the OS anyway. I.e., why add all the logic to do linear addressing in GTL beyond 32-bit if there was no OS to do it?! Besides, IA-64 was the future, right?

[ Interconnects and memory addressing are _not_ things you can "do on a dime." It took Intel years to develop GTL, and Digital years to develop EV6. And it took years for AMD to adapt EV6 for GTL compatibility. ]

Athlon, including 32-bit Athlon, was AMD's first design that was _not_ GTL compatible at all! That means AMD had to add all sorts of GTL compatibility into the chipset, CPU, etc... Since they already had a 40-bit interconnect anyway, they decided to support legacy PAE36 GTL as well as 32-bit GTL. That way it could run legacy OSes up to 64GiB.

When these legacy PAE36 OSes run, they use Athlon MP the same way they use Intel above 4GiB: paging. That was _until_ this "hack." It requires the BIOS to set up the EV6 interconnect in a way that _breaks_ GTL. That means the OS has got to know how to use it. Athlon MP mainboards with this hack are _rare_ (I'm still trying to find the e-mail which has this short-list).

Now that x86-64 is here, Intel was _finally_ given a reason to make GTL work above 4GiB. They have now done so in the new 40-bit implementation that Xeon MP uses. Linux/x86-64 takes advantage of this. But on Linux/x86, when you break 4GiB, the paging must accommodate.
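
As a quick sanity check on the sizes thrown around in this thread (32-bit/4GiB, 36-bit/64GiB, 40-bit/1TiB, 48-bit/256TiB, 52-bit/4PiB), here is a minimal Python sketch of the arithmetic. It only assumes byte-granular addressing, i.e. 2^n bytes for n address bits; the names `addressable_bytes` and `human` are made up for illustration and say nothing about how any CPU or interconnect actually implements addressing.

```python
# Illustrative only: address width in bits -> addressable bytes (2**bits).
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]

def addressable_bytes(bits: int) -> int:
    """Number of bytes reachable with `bits` address bits (byte-granular)."""
    return 1 << bits

def human(n: int) -> str:
    """Format an exact power-of-two byte count using binary (IEC) units."""
    idx = 0
    while n >= 1024 and n % 1024 == 0 and idx < len(UNITS) - 1:
        n //= 1024
        idx += 1
    return f"{n}{UNITS[idx]}"

if __name__ == "__main__":
    for bits in (32, 36, 40, 48, 52):
        print(f"{bits}-bit -> {human(addressable_bytes(bits))}")
```

Each extra address bit doubles the reachable space, which is why the 4 extra bits of PAE36 take you from 4GiB to 64GiB.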
What Intel doesn't have on its GTL/x86 that AMD/x86 does is a native, linear 40-bit TLB capability. Again, for Intel, it would have been a waste of transistors, because paging is how a 32-bit OS _must_ work for PAE36 -- or so it seemed.

On AMD, they already had >32-bit to support the EV6 interconnect. EV6 was _not_ designed for x86, but for AXP. _All_ EV6 components are 40-bit compatible; they have to be for the specification, including even the so-called "32-bit" Athlon interconnect logic. AMD had to add logic to support GTL. They added PAE36 because they already had the address space to spare.

_Every_other_ x86/PAE36 OS uses it with paging. This Linux hack is aware that the core TLB is designed for _linear_ >32-bit, but the hardware must then be configured in a way that is _completely_incompatible_ with GTL, including PAE36.

Again, I'm waiting on the technical information from a foremost Linux source at AMD. He'll understand it better than I.

-- Bryan

P.S. There is this farce out there that AMD64 allows 64-bit addressing. It does _not_. It allows PAE52/4PiB, PAE36/64GiB and 32-bit/4GiB programmatic-virtual, 48-bit/256TiB programmatic-physical and 40-bit/1TiB interconnect-linear. When running a PAE36 OS, it will linearly address up to 36-bit/64GiB, using the native, linear EV6 interconnect. That was (essentially) backported with this hack to the so-called "32-bit" Athlon, and implemented on a handful of Athlon MP mainboards. Intel x86/GTL+ is _not_ capable of this, because it _violates_ how the CPU GTL+ talks to the MCH in Intel's own specs -- _until_ EM64T came out (and even then there are still some issues prior to the new 40-bit Xeon MP).

P.P.S. The i486 TLB has _always_ been capable of 48-bit/256TiB "virtual addressing." It was just always "normalized" into 32-bit physical addresses. PAE36 just normalizes them into 36-bit physical addresses, although the PAE36 OSes still use a 32-bit offset register, which requires the "paging."
What this "hack" does is take advantage of a non-GTL compatible mode of the Athlon, just like the A64/Opteron, and avoids paging at the TLB (IIRC).

--
Bryan J. Smith                                     b.j.smith@xxxxxxxx
---------------------------------------------------------------------
It is mathematically impossible for someone who makes more than you
to be anything but richer than you. Any tax rate that penalizes them
will also penalize you similarly (to those below you, and then below
them). Linear algebra, let alone differential calculus or even
elementary concepts of limits, is mutually exclusive with US
journalism. So forget even attempting to explain how tax cuts work. ;->