Re: [PATCH v5 02/12] array_idx: sanitize speculative array de-references

Ingo Molnar <mingo@xxxxxxxxxx> · Wed, 31 Jan 2018 09:03:10 +0100

* Van De Ven, Arjan <arjan.van.de.ven@xxxxxxxxx> wrote:

> > > IOW, is there some work on tooling/analysis/similar? Not asking for
> > > near-term, but more of a "big picture" question..
> 
> short term there was some extremely rudimentary static analysis done. clearly 
> that has extreme limitations both in insane rate of false positives, and missing 
> cases.

What was the output roughly, how many suspect places that need array_idx_nospec() 
handling: a few, a few dozen, a few hundred or a few thousand?

I'm guessing 'static tool emitted hundreds of sites with many false positives 
included, but the real sites are probably a few dozen' - but that's really just a 
very, very wild guess.

Our whole mindset and approach with these generic APIs obviously very much depends 
on the order of magnitude:

- For array users up to a few dozen per 20 MLOC code base accumulated over 20
  years, and with no more than ~1 new site per kernel release we can probably
  do what we do for buffer overflows: static analyze and fix via 
  array_idx_nospec().

- If it's more than a few dozen then intuitively I'd also be very much in favor of
  compiler help: for example trickle down __user annotations that Sparse uses some
  more and let the compiler sanitize indices or warn about them - without hurting 
  performance of in-kernel array handling.

- If it's "hundreds" then probably both the correctness and the maintenance
  aspects won't scale well to a 20+ MLOC kernel code base - in that case we _have_
  to catch the out of range values at a much earlier stage, at the system call and
  other system ABI level, and probably need to do it via a self-maintaining 
  approach such as annotations/types that also warns about 'unsanitized' uses. But 
  filtering earlier has its own problems as well: mostly the layering violations 
  (high level interfaces often don't know the safe array index range) can create 
  worse bugs and more fragility than what is being fixed ...

- If it's "thousands" then it _clearly_ won't scale and we probably need compiler 
  help: i.e. a compiler that tracks ranges and automatically sanitizes indices. 
  This will have some performance effect, but should result in almost immediate 
  correctness.

Also, IMHO any tooling should very much be open source: this isn't the kind of bug 
that can be identified via code review, so there's no fall-back detection method 
like we have for buffer overflows.

Anyway, the approach taken very much depends on the order of magnitude of such 
affected array users we are targeting ...

Thanks,

	Ingo