On 24/04/2019 06:57, Guo Ren wrote:
> Hi Gary,
>
> On Wed, Apr 24, 2019 at 03:21:14AM +0000, Gary Guo wrote:
>>> Look:
>>> linux-next git:(riscv_asid_allocator_v2)$ grep GLOBAL arch/riscv -r
>>> arch/riscv/include/asm/pgtable-bits.h:#define _PAGE_GLOBAL (1 << 5) /* Global */
>>> arch/riscv/include/asm/pgtable-bits.h: _PAGE_USER | _PAGE_GLOBAL))
>>>
>>> Your patch tell us _PAGE_USER and _PAGE_GLOBAL are duplicate and why we
>>> couldn't make _PAGE_USER implies _PAGE_GLOBAL? Can you give an example
>>> of a real scene in PTE about:
>>> _PAGE_USER:0 + _PAGE_GLOBAL:1
>>> or
>>> _PAGE_USER:1 + _PAGE_GLOBAL:0
>>>
>>> Of cause I know USER & GLOBAL are conceptually very different, but
>>> there are only 10 attribute-bits for riscv (In fact we've wasted two bits
>>> to support huge RV32-pfn :P). So I think it is time to merge these two bits
>>> before hardware supports GLOBAL. Reserve them for future!
>>
>> Two cases I can think of:
>> * vdso like things. They're user pages that can really be shared across
>> address spaces (i.e. global). Kernels like L4 implement most systems calls
>> similar to VDSO, so USER + GLOBAL is useful.
> Vdso is a user space mapping in linux, See: fs/binfmt_elf.c
>
> static int load_elf_binary(struct linux_binprm *bprm) {
> ...
> #ifdef ARCH_HAS_SETUP_ADDITIONAL_PAGES
>         retval = arch_setup_additional_pages(bprm, !!elf_interpreter);
>         if (retval < 0)
>                 goto out;
> #endif /* ARCH_HAS_SETUP_ADDITIONAL_PAGES */
>
> All linux archs use arch_setup_additional_pages for vdso mapping and
> every process has its own vdso mapping to the same pages.

But we shouldn't prevent a kernel from mapping a USER page globally. As I
said, the fact that Linux doesn't do it isn't a valid reason for omitting
the possibility.

>
> I don't think vdso is a real scene for GLOBAL in PTE.
>
>> * hypervisor without H-extension: This requires shadow page tables. Supervisor
>> pages are mapped to supervisor shadow pages.
>> However these shadow pages cannot be GLOBAL because they can't be shared
>> between VMs. So !USER + !GLOBAL is useful.
> Hypervisor use 2-stages TLB translation in hardware and shadow page
> tables is for stage 2 translation. Shadow page tables care vmid not
> asid.

When the H-extension is present, stage 2 translation uses the VMID and is
performed by hardware. When the H-extension is not present, there is no
such thing as a VMID.

Without the H-extension, both the hypervisor and the guest supervisor run
in supervisor mode, and the hypervisor uses MSTATUS.TVM to trap the guest
supervisor's virtual memory operations. The shadow page table is populated
by doing a 2-stage page walk in software. In this case, the hypervisor
likely needs to use some bits of the ASID to emulate the VMID feature. A
GLOBAL page then cannot be used, because GLOBAL means that the page exists
in all physical ASIDs (each of which encodes both the emulated VMID and the
guest ASID). Making supervisor pages GLOBAL would make the semantics
incorrect!

> If hardware don't support H-extension (MMU 2-stages translation), it's
> hard to accept for virtualization performance.

The RISC-V privileged spec is explicitly designed to allow the techniques
described above (this is the sole purpose of MSTATUS.TVM). It might not
perform as well as hardware with the H-extension, but it is definitely a
legitimate use case. In fact, it is vital for use cases like recursive
virtualization.

Also, I believe the RISC-V PTE format is already frozen, so it is no longer
possible to merge the GLOBAL and USER bits, or to replace an RSW bit with
another bit.

>
> I don't think hypervisor is a real scene for GLOBAL in PTE.
>
> Are there other scene for GLOBAL in PTE?
>
> Best Regards
> Guo Ren
>
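To make the shadow page table argument concrete, here is a minimal sketch in C. It is not taken from any real hypervisor: the 7/9 split of a 16-bit hardware ASID, the helper names, and the toy TLB match rule are all assumptions made for illustration. Only the bit positions of V, U and G follow the RISC-V PTE layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* PTE attribute bits (positions follow the RISC-V PTE layout). */
#define PTE_V (1u << 0) /* valid */
#define PTE_U (1u << 4) /* user accessible */
#define PTE_G (1u << 5) /* global: present in every ASID */

/* Assumed split of a 16-bit hardware ASID: 7 bits of emulated VMID
 * on top, 9 bits of guest ASID below. Purely illustrative. */
#define GUEST_ASID_BITS 9u
#define GUEST_ASID_MASK ((1u << GUEST_ASID_BITS) - 1u)
#define VMID_MASK       0x7fu

/* Hypothetical helper: pack an emulated VMID and a guest ASID. */
static inline uint16_t make_hw_asid(uint16_t vmid, uint16_t guest_asid)
{
    return (uint16_t)(((vmid & VMID_MASK) << GUEST_ASID_BITS) |
                      (guest_asid & GUEST_ASID_MASK));
}

/* Hypothetical helper: flags for a shadow PTE built from guest PTE flags.
 * G is always cleared, because a global entry would be visible in the
 * hardware ASIDs that belong to other VMs. U is passed through, so a
 * guest supervisor page ends up as !USER + !GLOBAL. */
static inline uint64_t shadow_pte_flags(uint64_t guest_flags)
{
    return guest_flags & ~(uint64_t)PTE_G;
}

/* Toy model of a TLB lookup: a global entry matches regardless of ASID. */
static inline bool tlb_entry_matches(uint64_t entry_flags,
                                     uint16_t entry_asid,
                                     uint16_t current_asid)
{
    return (entry_flags & PTE_G) != 0 || entry_asid == current_asid;
}
```

With this model, an entry left GLOBAL for VM 1 would also match while VM 2 is running, whereas the same entry with G cleared only matches under the hardware ASID it was installed for.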