On Wed, Jun 22, 2022 at 04:35:16PM +0800, Baoquan He wrote:
> On 06/21/22 at 07:04pm, Catalin Marinas wrote:
> > The problem with splitting is that you can end up with two entries in
> > the TLB for the same VA->PA mapping (e.g. one for a 4KB page and another
> > for a 2MB block). In the lucky case, the CPU will trigger a TLB conflict
> > abort (but can be worse like loss of coherency).
>
> Thanks for this explanation. Is this a drawback of arm64 design? X86
> code do the same thing w/o issue, is there way to overcome this on
> arm64 from hardware or software side?

It is a drawback of the arm64 implementations. Having multiple TLB
entries for the same VA would need additional logic in hardware to
detect, so the microarchitects have pushed back. In ARMv8.4, some
balance was reached with FEAT_BBM so that the only visible side-effect
is a potential TLB conflict abort that could be resolved by software.

> I ever got a arm64 server with huge memory, w or w/o crashkernel setting
> have different bootup time. And the more often TLB miss and flush will
> cause performance cost. It is really a pity if we have very powerful
> arm64 cpu and system capacity, but bottlenecked by this drawback.

Is it only the boot time affected or the runtime performance as well?

--
Catalin