Re: [PATCH] x86: enable Data Operand Independent Timing Mode

Dave Hansen <dave.hansen@xxxxxxxxx> · Fri, 3 Feb 2023 08:25:44 -0800

On 2/1/23 10:09, Dave Hansen wrote:
> Good questions.  I want to make sure what I'm saying here is accurate,
> and I don't have good answers to those right now.  We're working on it,
> and should have something useful to say in the next couple of days.

This is an attempt to make sure that everyone that is concerned about
DOITM behavior has all the same information as Intel folks before we
make a decision about a kernel implementation.

Here we go...

The execution latency of the DOIT instructions[1] does not depend on the
value of data operands on all currently-supported Intel processors. This
includes all processors that enumerate DOITM support.  There are no
plans for any processors where this behavior would change, despite the
DOITM architecture theoretically allowing it.

So, what's the point of DOITM in the first place?  Fixed execution
latency does not mean that programs as a whole will have constant
overall latency.  DOITM currently affects features which do not affect
execution latency but may, for instance, affect overall program latency
due to side-effects of prefetching on the cache.  Even with fixed
instruction execution latency, these side-effects can matter especially
to the paranoid.

Today, those affected features are:
 * Data Dependent Prefetchers (DDP)[2]
 * Some Fast Store Forwarding Predictors (FSFP)[3].

There are existing controls for those features, including
spec_store_bypass_disable=[4].  Some paranoid software may already have
mitigations in place that are a superset of DOITM.  In addition, both
DDP and FSFP were also designed to limit nastiness when crossing
privilege boundaries.  Please see the linked docs for more details.

That's basically the Intel side of things.  Everyone else should have
all the background that I have.  Now back to maintainer mode...

So, in short, I don't think everyone's crypto libraries are busted today
when DOITM=0.  I don't think they're going to _become_ busted any time soon.

Where do we go from here?  There are a few choices:

1. Apply the patch in this thread, set DOITM=1 always.  Today, this
   reduces exposure to DDP and FSFP, but probably only for userspace.
   It reduces exposure to any future predictors under the DOITM umbrella
   and also from Intel changing its mind.
2. Ignore DOITM, leave it to the hardware default of DOITM=0.  Today,
   this probably just steers folks to using relatively heavyweight
   mitigations (like SSBD) if they want DDP/FSFP disabled.  It also
   leaves Linux exposed to Intel changing its mind on its future plans.
3. Apply the patch in this thread, but leave DOITM=0 as the default.
   This lets folks enable DOITM on a moment's notice if the need arises.

There are some other crazier choices like adding ABI to toggle DOITM for
userspace, but I'm not sure they're even worth discussing.

#1 is obviously the only way to go if the DOITM architecture remains
as-is.  There is talk of making changes, like completely removing the
idea of variable execution latency.  But that's a slow process and would
be a non-starter if *anyone* (like other OSes) is depending on the
existing DOITM definition.

My inclination is to wait a couple of weeks to see which way DOITM is
headed and if the definition is likely to get changed.  Does anyone feel
any greater sense of urgency?

1.
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/resources/data-operand-independent-timing-instructions.html
2.
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/data-dependent-prefetcher.html
3.
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/fast-store-forwarding-predictor.html
4. https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html