Re: [PATCH v6 0/2] x86: Implement fast refcount overflow protection

Kees Cook <keescook@xxxxxxxxxxxx> · Thu, 20 Jul 2017 15:53:25 -0700

On Thu, Jul 20, 2017 at 10:15 AM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> On Thu, Jul 20, 2017 at 2:11 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>> Could you please also create a tabulated quick-comparison of the three variants,
>> of all key properties, about behavior, feature and tradeoff differences?
>>
>> Something like:
>>
>>                                 !ARCH_HAS_REFCOUNT      ARCH_HAS_REFCOUNT=y     REFCOUNT_FULL=y
>>
>> avg fast path instructions:     5                       3                       10
>> behavior on overflow:           unsafe, silent          safe,   verbose         safe,   verbose
>> behavior on underflow:          unsafe, silent          unsafe, verbose         unsafe, verbose
>> ...
>>
>> etc. - note that this table is just a quick mockup with wild guesses. (Please add
>> more comparisons of other aspects as well.)
>>
>> Such a comparison would make it easier for arch, subsystem and distribution
>> maintainers to decide on which variant to use/enable.
>
> Sure, I can write this up. I'm not sure "safe"/"unsafe" is quite that
> clean. The differences between -full and -fast are pretty subtle, but
> I think I can describe it using the updated LKDTM tests I've written
> to compare the two. There are conditions that -fast doesn't catch, but
> those cases aren't actually useful for the overflow defense.
>
> As for "avg fast path instructions", do you mean the resulting
> assembly for each refcount API function? I think it's going to look
> something like "1   2   45", but I'll write it up.

So, doing a worst-case timing of a loop of inc() to INT_MAX and then
dec_and_test() back to zero, I see this out of perf:

atomic
25255.114805      task-clock (msec)
 82249267387      cycles
 11208720041      instructions

refcount-fast
25259.577583      task-clock (msec)
 82211446892      cycles
 15486246572      instructions

refcount-full
44625.923432      task-clock (msec)
144814735193      cycles
105937495952      instructions
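
For reference, a rough userspace sketch of the kind of loop being timed
might look like the following. This is an assumption-laden illustration,
not the actual test code: it uses C11 <stdatomic.h> rather than the
kernel's atomic_t or refcount_t, the names counter, dec_and_test() and
run_loop() are hypothetical, and the starting value of 0 is a guess (the
message does not say where the count starts).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel counter; NOT the kernel API. */
static atomic_int counter;

/*
 * Hypothetical analogue of a dec_and_test() operation: returns true
 * when this decrement brings the counter to zero.
 */
static bool dec_and_test(atomic_int *v)
{
    /* atomic_fetch_sub() returns the value held before subtracting. */
    return atomic_fetch_sub(v, 1) == 1;
}

/*
 * Increment from 0 up to limit, then decrement back down to 0.
 * Returns how many times dec_and_test() reported zero (expected: 1).
 */
static int run_loop(int limit)
{
    int zero_hits = 0;

    assert(limit > 0);
    atomic_store(&counter, 0);

    for (int i = 0; i < limit; i++)
        atomic_fetch_add(&counter, 1);

    for (int i = 0; i < limit; i++)
        if (dec_and_test(&counter))
            zero_hits++;

    return zero_hits;
}
```

Calling run_loop(INT_MAX) (INT_MAX from <limits.h>) from a small main()
and running the binary under "perf stat" would give counters of the same
kind as above, though userspace numbers would not be comparable to the
in-kernel figures.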

I'll still summarize all this in the v7 series, but I think that
really clarifies the differences: about 1.4x the instructions in -fast,
but nearly identical cycles and clock. Using -full sees a large change
(as expected): about 9.5x the instructions and about 1.8x the cycles
and clock.
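
The relative costs can be recomputed directly from the perf counters
quoted above. The snippet below is only a convenience sketch: the struct
perf_sample type and the ratio() and print_ratios() helpers are
hypothetical names introduced here, and the numbers are copied verbatim
from this message.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the perf output quoted in this message. */
struct perf_sample {
    const char *name;
    double task_clock_msec;
    double cycles;
    double instructions;
};

/* Values copied from the message; index 0 (atomic) is the baseline. */
static const struct perf_sample samples[] = {
    { "atomic",        25255.114805,  82249267387.0,  11208720041.0 },
    { "refcount-fast", 25259.577583,  82211446892.0,  15486246572.0 },
    { "refcount-full", 44625.923432, 144814735193.0, 105937495952.0 },
};

/* Ratio of a measurement to its baseline. */
static double ratio(double value, double baseline)
{
    assert(baseline > 0.0);
    return value / baseline;
}

/* Print each refcount variant relative to the plain atomic case. */
static void print_ratios(void)
{
    const struct perf_sample *base = &samples[0];

    for (size_t i = 1; i < sizeof(samples) / sizeof(samples[0]); i++) {
        printf("%-14s clock %.2fx  cycles %.2fx  instructions %.2fx\n",
               samples[i].name,
               ratio(samples[i].task_clock_msec, base->task_clock_msec),
               ratio(samples[i].cycles, base->cycles),
               ratio(samples[i].instructions, base->instructions));
    }
}
```

Calling print_ratios() from main() prints one line per refcount variant,
each normalized to the atomic baseline.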

-Kees

-- 
Kees Cook
Pixel Security