Re: [PATCH v2 1/6] arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB

Doug Anderson <dianders@xxxxxxxxxxxx> · Wed, 18 Dec 2024 09:38:15 -0800

Hi,

On Wed, Dec 18, 2024 at 8:36 AM Julius Werner <jwerner@xxxxxxxxxxxx> wrote:
>
> > Given that I'm not going to change the way the existing predicates
> > work, if I move the "fallback" setting `max_bhb_k` to 32 to
> > spectre_bhb_enable_mitigation() then when we set `max_bhb_k` becomes
> > inconsistent between recognized and unrecognized CPUs.
>
> A clean way to fix that could be to change spectre_bhb_loop_affected()
> to just return the K-value (rather than change max_bhb_k directly),
> and then set max_bhb_k to the max() of that return value and the
> existing value when it is called from spectre_bhb_enable_mitigation().
> That way, max_bhb_k would only be updated from
> spectre_bhb_enable_mitigation().

Sure, you could do that. ...but in my patch series I need to add
_another_ static boolean that's updated in is_spectre_bhb_affected()
anyway. I need to add one to the new is_spectre_bhb_safe() function
that you requested. Specifically, the moment I detect any CPU ID
that's not in the "safe" list then I need to note that down. If I
don't do that then later calls to
is_spectre_bhb_affected(SCOPE_SYSTEM) will return the wrong value. So
while all the other "static" caches in is_spectre_bhb_affected() could
be removed because we changed the default return to "true", the one in
is_spectre_bhb_safe() (which causes the function to return "false")
can't be removed.

In any case, the predicates updating the static caches predates my
series and IMO my series doesn't make it worse. If you want to post a
followup series changing how the predicates work and can convince
others that it's worth doing then great, but I don't think it should
block forward progress here.

> > I would also say that having `max_bhb_k` get set in an inconsistent
> > place opens us up for bugs in the future. Even if it works today, I
> > imagine someone could change things in the future such that
> > spectre_bhb_enable_mitigation() reads `max_bhb_k` and essentially
> > caches it (maybe it constructs an instruction based on it). If that
> > happened things could be subtly broken for the "unrecognized CPU" case
> > because the first CPU would "cache" the value without it having been
> > called on all CPUs.
>
> This would likely already be a problem with the current code, since
> spectre_bhb_enable_mitigations() is called one at a time for each CPU
> and the max_bhb_k value is only valid once it has been called on all
> CPUs. If someone tried to just immediately use the value inside the
> function that would just be a bug either way. (Right now this should
> not be a problem because max_bhb_k is only used by
> spectre_bhb_patch_loop_iter() which ends up being called through
> apply_alternatives_all(), which should only run after all those CPU
> errata cpu_enable callbacks have been called.)

Actually, today is_spectre_bhb_affected() is called much earlier I
believe. It's installed (via cpu_errata.c) and called like this:

  is_spectre_bhb_affected+0x124/0x2d8
  update_cpu_capabilities+0xa0/0x158
  setup_boot_cpu_capabilities+0x20/0x40
  setup_boot_cpu_features+0x38/0x50
  smp_prepare_boot_cpu+0x38/0x60
  start_kernel+0x90/0x438

...but then spectre_bhb_enable_mitigations() is called later and by
that time is_spectre_bhb_affected() should have been called for each
of the CPUs:

  spectre_bhb_enable_mitigation+0xbc/0x340
  cpu_enable_non_boot_scope_capabilities+0x74/0xc8
  multi_cpu_stop+0xf0/0x1b8
  cpu_stopper_thread+0xac/0x148
  smpboot_thread_fn+0xb0/0x238

I agree that it doesn't seem to be a problem today, though.

-Doug