Re: [RFC PATCH 0/9] patchable function pointers for pluggable crypto routines

Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> · Fri, 5 Oct 2018 19:28:12 +0200



On 5 October 2018 at 19:26, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> On Fri, Oct 5, 2018 at 10:15 AM Ard Biesheuvel
> <ard.biesheuvel@xxxxxxxxxx> wrote:
>>
>> On 5 October 2018 at 15:37, Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>> ...
>> > Therefore, I think this patch goes in exactly the wrong direction. I
>> > mean, if you want to introduce dynamic patching as a means for making
>> > the crypto API's dynamic dispatch stuff not as slow in a post-spectre
>> > world, sure, go for it; that may very well be a good idea. But
>> > presenting it as an alternative to Zinc very widely misses the point and
>> > serves to prolong a series of bad design choices, which are now able to
>> > be rectified by putting energy into Zinc instead.
>> >
>>
>> This series has nothing to do with dynamic dispatch: the call sites
>> call crypto functions using ordinary function calls (although my
>> example uses CRC-T10DIF), and these calls are redirected via what is
>> essentially a PLT entry, so that we can supsersede those routines at
>> runtime.
>
> If you really want to do it PLT-style, then just do:
>
> extern void whatever_func(args);
>
> Call it like:
> whatever_func(args here);
>
> And rig up something to emit asm like:
>
> GLOBAL(whatever_func)
>   jmpq default_whatever_func
> ENDPROC(whatever_func)
>
> Architectures without support can instead do:
>
> void whatever_func(args)
> {
>   READ_ONCE(patchable_function_struct_for_whatever_func->ptr)(args);
> }
>
> and patch the asm function for basic support.  It will be slower than
> necessary, but maybe the relocation trick could be used on top of this
> to redirect the call to whatever_func directly to the target for
> architectures that want to squeeze out the last bit of performance.
> This might actually be the best of all worlds: easy implementation on
> all architectures, no inline asm, and the totally non-magical version
> works with okay performance.
>
> (Is this what your code is doing?  I admit I didn't follow all the way
> through all the macros.)

Basically