On Sun, Aug 14, 2022 at 02:14:08PM -0700, Linus Torvalds wrote: > PeterZ - you've touched both the load_unaligned_zeropad() and the > exception code last, so let's run this past you, but this really does > seem to not only fix the code generation issue in fs/dcache.s, it just > looks simpler too. Comments? Ha, freshly back from vacation and I barely know what a computer is :-) > arch/x86/include/asm/extable_fixup_types.h | 2 ++ > arch/x86/include/asm/word-at-a-time.h | 50 +++--------------------------- > arch/x86/mm/extable.c | 30 ++++++++++++++++++ > 3 files changed, 37 insertions(+), 45 deletions(-) > > diff --git a/arch/x86/include/asm/extable_fixup_types.h b/arch/x86/include/asm/extable_fixup_types.h > index 503622627400..b53f1919710b 100644 > --- a/arch/x86/include/asm/extable_fixup_types.h > +++ b/arch/x86/include/asm/extable_fixup_types.h > @@ -64,4 +64,6 @@ > #define EX_TYPE_UCOPY_LEN4 (EX_TYPE_UCOPY_LEN | EX_DATA_IMM(4)) > #define EX_TYPE_UCOPY_LEN8 (EX_TYPE_UCOPY_LEN | EX_DATA_IMM(8)) (weird tab stuff there, but that's for another day I suppose) > +#define EX_TYPE_ZEROPAD 20 /* load ax from dx zero-padded */ > + > #endif > diff --git a/arch/x86/include/asm/word-at-a-time.h b/arch/x86/include/asm/word-at-a-time.h > index 8338b0432b50..4893f1b30dd6 100644 > --- a/arch/x86/include/asm/word-at-a-time.h > +++ b/arch/x86/include/asm/word-at-a-time.h > @@ -77,58 +77,18 @@ static inline unsigned long find_zero(unsigned long mask) > * and the next page not being mapped, take the exception and > * return zeroes in the non-existing part. > */ > static inline unsigned long load_unaligned_zeropad(const void *addr) > { > unsigned long ret; > > + asm volatile( > + "1: mov (%[addr]), %[ret]\n" > "2:\n" > + _ASM_EXTABLE_TYPE(1b, 2b, EX_TYPE_ZEROPAD) > + : [ret] "=a" (ret) > + : [addr] "d" (addr)); > > return ret; > } > > #endif /* _ASM_WORD_AT_A_TIME_H */ > diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c > index 331310c29349..58c79077a496 100644 > --- a/arch/x86/mm/extable.c > +++ b/arch/x86/mm/extable.c > @@ -41,6 +41,34 @@ static bool ex_handler_default(const struct exception_table_entry *e, > return true; > } > > +/* > + * This is the *very* rare case where we do a "load_unaligned_zeropad()" > + * and it's a page crosser into a non-existent page. > + * > + * This happens when we optimistically load a pathname a word-at-a-time > + * and the name is less than the full word and the next page is not > + * mapped. Typically that only happens for CONFIG_DEBUG_PAGEALLOC. > + * > + * NOTE! The load is always of the form "mov (%edx),%eax" to make the > + * fixup simple. So obviously we could use _ASM_EXTABLE_TYPE_REG together with something like: "mov (%[reg]), %[reg]" to not depend on these fixed registers, but yeah, that doesn't seem needed. Code-gen is fine as is. > + */ > +static bool ex_handler_zeropad(const struct exception_table_entry *e, > + struct pt_regs *regs, > + unsigned long fault_addr) > +{ > + const unsigned long mask = sizeof(long) - 1; > + unsigned long offset, addr; > + > + offset = regs->dx & mask; > + addr = regs->dx & ~mask; > + if (fault_addr != addr + sizeof(long)) > + return false; > + > + regs->ax = *(unsigned long *)addr >> (offset * 8); > + regs->ip = ex_fixup_addr(e); > + return true; I think the convention here is to do: return ex_handler_default(e, regs); instead, that ensures there a bit of common post code. > +} > + > static bool ex_handler_fault(const struct exception_table_entry *fixup, > struct pt_regs *regs, int trapnr) > { But yeah, looks good to me.