> On Aug 23, 2019, at 5:59 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> On Wed, 21 Aug 2019, Thomas Gleixner wrote:
>> On Wed, 21 Aug 2019, Song Liu wrote:
>>>> On Aug 20, 2019, at 1:23 PM, Song Liu <songliubraving@xxxxxx> wrote:
>>>>
>>>> Before 32-bit support, pti_clone_pmds() always adds PMD_SIZE to addr.
>>>> This behavior changes after the 32-bit support: pti_clone_pgtable()
>>>> increases addr by PUD_SIZE for pud_none(*pud) case, and increases addr by
>>>> PMD_SIZE for pmd_none(*pmd) case. However, this is not accurate because
>>>> addr may not be PUD_SIZE/PMD_SIZE aligned.
>>>>
>>>> Fix this issue by properly rounding up addr to next PUD_SIZE/PMD_SIZE
>>>> in these two cases.
>>>
>>> After poking around more, I found the following doesn't really make
>>> sense.
>>
>> I'm glad you figured that out yourself. Was about to write up something to
>> that effect.
>>
>> Still interesting questions remain:
>>
>> 1) How did you end up feeding an unaligned address into that which points
>>    to a 0 PUD?
>>
>> 2) Is this related to Facebook specific changes and unlikely to affect any
>>    regular kernel? I can't come up with a way to trigger that in mainline
>>
>> 3) As this is a user page table and the missing mapping is related to
>>    mappings required by PTI, how is the machine going in/out of user
>>    space in the first place? Or did I just trip over what you called
>>    nonsense?
>
> And just because this ended in silence I looked at it myself after Peter
> told me that this was on a kernel with PTI disabled. Aside of that my built
> in distrust for debug war stories combined with fairy tale changelogs
> triggered my curiousity anyway.

I am really sorry that I was silent. Somehow I didn't see this in my inbox
(or it didn't show up until just now?).

For this patch, I really mixed this up with something else. The issue we
are seeing is that a kprobe with CONFIG_KPROBES_ON_FTRACE splits the PMD
located at 0xffffffff81a00000.

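For reference, the alignment arithmetic that the quoted changelog talks about can be sketched in userspace C. This is only an illustration, not the kernel code: the helper names are hypothetical, and the PMD_SIZE/PUD_SIZE values assume x86-64 with 4 KiB pages (2 MiB and 1 GiB). It says nothing about whether the rounding is actually needed, which is the open question in this thread.

```c
#include <stdint.h>

/* Assumed x86-64 values with 4 KiB pages; not stated in the thread. */
#define PMD_SIZE (UINT64_C(1) << 21)   /* 2 MiB */
#define PUD_SIZE (UINT64_C(1) << 30)   /* 1 GiB */

/* Hypothetical helper: plain stepping, addr += size. */
static inline uint64_t step_plain(uint64_t addr, uint64_t size)
{
    return addr + size;
}

/*
 * Hypothetical helper: advance to the first size-aligned boundary
 * strictly above addr. size must be a power of two.
 */
static inline uint64_t step_round_up(uint64_t addr, uint64_t size)
{
    return (addr + size) & ~(size - 1);
}

/* Returns nonzero if addr is a multiple of size (power of two). */
static inline int is_aligned(uint64_t addr, uint64_t size)
{
    return (addr & (size - 1)) == 0;
}
```

Under these assumptions, 0xffffffff81a00000 is PMD_SIZE aligned but not PUD_SIZE aligned. The two stepping helpers agree for an aligned addr and differ only when addr starts out unaligned.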
I sent another patch last night, but that might not be the right fix
either.

I haven't started testing our PTI-enabled kernel, so I am not sure whether
there is really an issue with the PTI code.

Thanks,
Song