Christophe Leroy <christophe.leroy@xxxxxxxxxx> writes: > Le 17/06/2020 à 16:38, Peter Zijlstra a écrit : >> On Thu, Jun 18, 2020 at 12:21:22AM +1000, Michael Ellerman wrote: >>> Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes: >>>> On Mon, Jun 15, 2020 at 12:57:59PM +0000, Christophe Leroy wrote: >> >>>>> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES) >>>>> +#define __HAVE_ARCH_PTEP_GET >>>>> +static inline pte_t ptep_get(pte_t *ptep) >>>>> +{ >>>>> + pte_t pte = {READ_ONCE(ptep->pte), 0, 0, 0}; >>>>> + >>>>> + return pte; >>>>> +} >>>>> +#endif >>>> >>>> Would it make sense to have a comment with this magic? The casual reader >>>> might wonder WTH just happened when he stumbles on this :-) >>> >>> I tried writing a helpful comment but it's too late for my brain to form >>> sensible sentences. >>> >>> Christophe can you send a follow-up with a comment explaining it? In >>> particular the zero entries stand out, it's kind of subtle that those >>> entries are only populated with the right value when we write to the >>> page table. >> >> static inline pte_t ptep_get(pte_t *ptep) >> { >> unsigned long val = READ_ONCE(ptep->pte); >> /* 16K pages have 4 identical value 4K entries */ >> pte_t pte = {val, val, val, val); >> return pte; >> } >> >> Maybe something like that? > > This should work as well. Indeed nobody cares about what's in the other > three. They are only there to ensure that ptep++ increases the ptep > pointer by 16 bytes. Only the HW require 4 identical values, that's > taken care of in set_pte_at() and pte_update(). Right, but it seems less error-prone to have the in-memory representation match what we have in the page table (well that's in-memory too but you know what I mean). > So we should use the most efficient. Thinking once more, maybe what you > propose is the most efficient as there is no need to load another > register with value 0 in order to write it in the stack. On 64-bit I'd say it makes zero difference, the only thing that's going to matter is the load from ptep->pte. I don't know whether that's true on the 8xx cores though. cheers