Re: [patch 125/166] lib/list: prevent compiler reloads inside 'safe' list iteration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 8, 2020 at 1:26 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> OK, it sounds like we need to specify what needs to be present in this
> sort of commit log.

That would have helped.

But in this case, after having looked more at it, it wasn't that the
commit log was not sufficient, it was that the change was wrong.

With a better commit log and a pointer to the KCSAN report, I would
have been able to make a better judgment call earlier, and not have
had to ask for more information.

But once I figured out what the problem was, it was clear that

 (a) what the i915 driver does is at a minimum questionable, and quite
likely actively buggy

 (b) even if it's not buggy in the i915 driver, changing the list
traversal macros to use READ_ONCE() would hide other places where it
most definitely is buggy.

> o       The KCSAN splat, optionally including file/line numbers.

I do think that the KCSAN splat - very preferably simplified and with
explanations - should basically always be included if KCSAN was the
reason.

Of course, the "simplified and with explanations" may be simplified to
the point where none of the original splat even remains. Those splats
aren't so legible that anybody should feel like they should be
included.

Put another way: an analysis that is so thorough that it makes the raw
splat data pointless is a *good* thing, and if that means that no
splat is needed, all the better.

> o       Detailed description of the problem this splat identifies, for
>         example, the code might fail to acquire a necessary lock, a plain
>         load might vulnerable to compiler optimizations, and so on.

See above: if the description is detailed enough, then the splat
itself becomes much less interesting.

But in this case, for example, it's worth pointing out that the "safe"
list accessors are used all over the kernel, and they are very much
_not_ thread-safe. The "safe" in them is purely about the current
thread removing an entry, not some kind of general thread-safeness.

Which is why I decided I really hated that patch - it basically would
make KCSAN unable to find _real_ races elsewhere, because it hid a
very special race in the i915 driver.

So I started out with the "that can't be right" kind of general
feeling of uneasiness about the patch. I ended up with a much more
specific "no, that's very much not right" and no amount of commit log
should have saved it.

As mentioned, I suspect that the i915 driver could actually use the
RCU-safe list walkers. Their particular use is not about RCU per se,
but it has some very similar rules about walking a list concurrently
with somebody adding an entry to it.

And the RCU list walkers do end up doing special things to load the
next pointer.

Whether we want to say - and document - that "ok, the RCU list walkers
are also useful for this non-RCU use case" in general might be worth
discussing.

It may be that the i915 use case is so special that it's only worth
documenting there.

              Linus



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux