On 6/21/24 08:45, Peter Xu wrote:
> On Fri, Jun 21, 2024 at 07:51:26AM -0700, Dave Hansen wrote:
...
>> But, still, what if you take a Dirty=1,Write=1 pud and pud_modify() it
>> to make it Dirty=1,Write=0? What prevents that from being
>> misinterpreted by the hardware as being a valid 1G shadow stack mapping?
>
> Thanks for pointing that out. I think I was thinking it will only take
> effect on VM_SHADOW_STACK first, so it's not?
>
> I was indeed trying to find more information on shadow stack at that time
> but I can't find as much on the pgtable implications, on e.g. whether "D=1
> + W=0" globally will be recognized as shadow stack. At least on SDM March
> 2024 version Vol3 Chap4 pgtable entries still don't explain these details,
> or maybe I missed it. Please let me know if there's suggestion on what I
> can read before I post a v2.

It's in the "Determination of Access Rights" section.

	A linear address is a shadow-stack address if the following are
	true of the translation of the linear address: (1) the R/W flag
	(bit 1) is 0 and the dirty flag (bit 6) is 1 in the
	paging-structure entry that maps the page containing the linear
	address; and (2) the R/W flag is 1 in every other
	paging-structure entry controlling the translation of the linear
	address.

> So if it's globally taking effect, indeed we'll need to handle them in PUDs
> too.
>
> Asides, not sure whether it's off-topic to ask here, but... why shadow
> stack doesn't reuse an old soft-bit to explicitly mark "this is shadow
> stack ptes" when designing the spec? Now it consumed bit 58 anyway for
> caching dirty. IIUC we can avoid all these "move back and forth" issue on
> dirty bit if so.

The design accommodates "other" OSes that are using all the software
bits for other things.

For Linux, you're right, we just ended up consuming a software bit
_anyway_ so we got all the complexity of the goofy permissions *AND*
lost a bit in the end. Lose, lose.
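To make the hazard concrete, here is a rough user-space model of the rule quoted from the SDM above. It is only a sketch: the macro and function names are made up for illustration (they are not kernel helpers), and it ignores whether shadow stacks are enabled at all as well as every other permission check. The only facts it relies on are the bit positions given in the quoted text (R/W is bit 1, dirty is bit 6).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions as given in the quoted SDM text. */
#define PTE_RW    (UINT64_C(1) << 1)	/* R/W flag, bit 1 */
#define PTE_DIRTY (UINT64_C(1) << 6)	/* dirty flag, bit 6 */

/* Condition (1): the mapping entry has R/W=0 and Dirty=1. */
static bool entry_looks_like_shstk(uint64_t entry)
{
	return !(entry & PTE_RW) && (entry & PTE_DIRTY);
}

/*
 * Conditions (1) and (2) together: 'leaf' is the entry that maps the
 * page (a PUD for a 1G mapping), 'upper' holds every other entry that
 * controls the translation.
 */
static bool is_shstk_translation(uint64_t leaf, const uint64_t *upper,
				 size_t n)
{
	assert(upper != NULL || n == 0);

	if (!entry_looks_like_shstk(leaf))
		return false;

	for (size_t i = 0; i < n; i++) {
		if (!(upper[i] & PTE_RW))
			return false;
	}
	return true;
}

/*
 * Hypothetical "naive" write-protect that clears R/W and leaves the
 * dirty bit alone. This is the pattern being asked about, not what
 * any existing helper is claimed to do.
 */
static uint64_t naive_wrprotect(uint64_t entry)
{
	return entry & ~PTE_RW;
}
```

With this model, a Dirty=1,Write=1 leaf is not a shadow-stack mapping, but the same leaf after naive_wrprotect() is, as long as every upper-level entry has R/W=1. Nothing in the rule looks at VM_SHADOW_STACK or at the level of the entry, which is why the PUD case needs the same care as the PMD and PTE cases.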
>>> /*
>>>  * mprotect needs to preserve PAT and encryption bits when updating
>>>  * vm_page_prot
>>> @@ -1377,10 +1398,25 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
>>> }
>>> #endif
>>>
>>> +static inline pud_t pudp_establish(struct vm_area_struct *vma,
>>> +		unsigned long address, pud_t *pudp, pud_t pud)
>>> +{
>>> +	if (IS_ENABLED(CONFIG_SMP)) {
>>> +		return xchg(pudp, pud);
>>> +	} else {
>>> +		pud_t old = *pudp;
>>> +		WRITE_ONCE(*pudp, pud);
>>> +		return old;
>>> +	}
>>> +}
>>
>> Why is there no:
>>
>> 	page_table_check_pud_set(vma->vm_mm, pudp, pud);
>>
>> ? Sure, it doesn't _do_ anything today. But the PMD code has it today.
>> So leaving it out creates a divergence that honestly can only serve to
>> bite us in the future and will create a head-scratching delta for anyone
>> that is comparing PUD and PMD implementations in the future.
>
> Good question, I really don't remember why I didn't have that, since I
> should have referenced the pmd helper. I'll add them and see whether I'll
> hit something otherwise.
>
> Thanks for the review.

One big thing I did in this review was make sure that the PMD and PUD
helpers were doing the same thing. Would you mind circling back and
double-checking the same before you repost this?