On Wed, May 20, 2020 at 12:13 PM Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
>
> On Tue, May 19, 2020 at 03:12:42PM -0700, Dan Williams wrote:
> > The original copy_mc_fragile() implementation had negative performance
> > implications since it did not use the fast-string instruction sequence
> > to perform copies. For this reason copy_mc_to_kernel() fell back to
> > plain memcpy() to preserve performance on platform that did not indicate
> > the capability to recover from machine check exceptions. However, that
> > capability detection was not architectural and now that some platforms
> > can recover from fast-string consumption of memory errors the memcpy()
> > fallback now causes these more capable platforms to fail.
> >
> > Introduce copy_mc_generic() as the fast default implementation of
> > copy_mc_to_kernel() and finalize the transition of copy_mc_fragile() to
> > be a platform quirk to indicate 'fragility'. With this in place
> > copy_mc_to_kernel() is fast and recovery-ready by default regardless of
> > hardware capability.
> >
> > Thanks to Vivek for identifying that copy_user_generic() is not suitable
> > as the copy_mc_to_user() backend since the #MC handler explicitly checks
> > ex_has_fault_handler().
>
> /me is curious to know why #MC handler mandates use of _ASM_EXTABLE_FAULT().

Even though we could try to handle all faults / exceptions generically,
I think it makes sense to enforce type safety here, if only to support
architectures that can only satisfy the minimum contract of
copy_mc_to_user(). For example, if there were some destination exception
other than #PF, the contract implied by copy_mc_to_user() is that such
an exception is not permissible in this path.

See:

00c42373d397 x86-64: add warning for non-canonical user access address dereferences
75045f77f7a7 x86/extable: Introduce _ASM_EXTABLE_UA for uaccess fixups

...for examples of other justification for being explicit in these
paths.

> > [..]
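
To make the type-safety argument concrete, here is a minimal userspace
sketch of a typed fixup table. It is not the kernel's implementation:
fixup_kind, fixup_entry, fixup_table, find_fixup() and mc_may_recover()
are hypothetical names, and the addresses are made up. It only models
the idea that a machine-check path recovers solely through entries that
were explicitly registered as fault-type fixups, and refuses every
other kind.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixup kinds; an assumption for illustration only. */
enum fixup_kind { FIXUP_DEFAULT, FIXUP_FAULT, FIXUP_UACCESS };

struct fixup_entry {
	unsigned long insn;	/* address of the instruction that may trap */
	unsigned long fixup;	/* address to resume at after the trap */
	enum fixup_kind kind;	/* how the entry was registered */
};

/* Made-up addresses; a real table is generated by the linker. */
static const struct fixup_entry fixup_table[] = {
	{ 0x1000, 0x2000, FIXUP_DEFAULT },
	{ 0x1010, 0x2010, FIXUP_FAULT },
	{ 0x1020, 0x2020, FIXUP_UACCESS },
};

/* Linear search for the entry covering the trapping instruction. */
static const struct fixup_entry *find_fixup(unsigned long ip)
{
	size_t i;

	for (i = 0; i < sizeof(fixup_table) / sizeof(fixup_table[0]); i++)
		if (fixup_table[i].insn == ip)
			return &fixup_table[i];
	return NULL;
}

/*
 * Model of the machine-check decision: recover only when the trapping
 * instruction has a fixup registered with the fault kind. Entries of
 * any other kind, or no entry at all, mean "do not recover".
 */
static bool mc_may_recover(unsigned long ip)
{
	const struct fixup_entry *e = find_fixup(ip);

	return e != NULL && e->kind == FIXUP_FAULT;
}
```

In this model, mc_may_recover(0x1010) is true, while the default and
uaccess entries are rejected even though they have a fixup address.
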
> > +/*
> > + * copy_mc_generic - memory copy with exception handling
> > + *
> > + * Fast string copy + fault / exception handling. If the CPU does
> > + * support machine check exception recovery, but does not support
> > + * recovering from fast-string exceptions then this CPU needs to be
> > + * added to the copy_mc_fragile_key set of quirks. Otherwise, absent any
> > + * machine check recovery support this version should be no slower than
> > + * standard memcpy.
> > + */
> > +SYM_FUNC_START(copy_mc_generic)
> > +	ALTERNATIVE "jmp copy_mc_fragile", "", X86_FEATURE_ERMS
> > +	movq %rdi, %rax
> > +	movq %rdx, %rcx
> > +.L_copy:
> > +	rep movsb
> > +	/* Copy successful. Return zero */
> > +	xorl %eax, %eax
> > +	ret
> > +SYM_FUNC_END(copy_mc_generic)
> > +EXPORT_SYMBOL_GPL(copy_mc_generic)
> > +
> > +	.section .fixup, "ax"
> > +.E_copy:
> > +	/*
> > +	 * On fault %rcx is updated such that the copy instruction could
> > +	 * optionally be restarted at the fault position, i.e. it
> > +	 * contains 'bytes remaining'. A non-zero return indicates error
> > +	 * to copy_safe() users, or indicate short transfers to
>
> copy_safe() is vestige of terminology of previous patches?

Thanks, yes, I missed this one.

> > +	 * user-copy routines.
> > +	 */
> > +	movq %rcx, %rax
> > +	ret
> > +
> > +	.previous
> > +
> > +	_ASM_EXTABLE_FAULT(.L_copy, .E_copy)
>
> A question for my education purposes.
>
> So copy_mc_generic() can handle MCE both on source and destination
> addresses? (Assuming some device can generate MCE on stores too).

There's no such thing as #MC on write. #MC is only signaled on consumed
poison. In this case what is specifically being handled is #MC with RIP
pointing at a movq instruction. The fault handler actually does not know
anything about source or destination, it just knows fault / exception
type and the register state.

> On the other hand copy_mc_fragile() handles MCE recovery only on
> source and non-MCE recovery on destination.

No, there's no difference in capability. #MC can only be raised on a poison-read in both cases.
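
For readers following the return-value contract in the quoted fixup
comment (zero on success, 'bytes remaining' on a fault), here is a
rough userspace model. It is only a sketch: model_copy_mc(),
bytes_copied() and the poison_off parameter are hypothetical and stand
in for a poison read, and nothing here raises or handles a real machine
check.

```c
#include <stddef.h>

/*
 * Userspace model only. Copies len bytes from src to dst, but stops
 * before reading the byte at offset poison_off, which stands in for a
 * poisoned source location. Returns the number of bytes NOT copied, so
 * 0 means the whole copy succeeded. Pass a poison_off >= len to model
 * a clean source.
 */
static size_t model_copy_mc(void *dst, const void *src, size_t len,
			    size_t poison_off)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	for (i = 0; i < len; i++) {
		if (i == poison_off)
			return len - i;	/* bytes remaining at the fault */
		d[i] = s[i];
	}
	return 0;
}

/* Caller-side helper: turn 'bytes remaining' into 'bytes copied'. */
static size_t bytes_copied(size_t len, size_t remaining)
{
	return len - remaining;
}
```

A caller that only cares about success checks for a zero return; a
caller that reports short transfers subtracts the remainder from the
requested length, as bytes_copied() does.
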