Re: [patch 125/166] lib/list: prevent compiler reloads inside 'safe' list iteration

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Tue, 7 Apr 2020 10:28:34 -0700

On Tue, Apr 7, 2020 at 9:04 AM Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote:
>
> It'll take some time to reconstruct the original report, but the case in
> question was in removing the last element of the list of the last list,
> switch to a global lock over all such lists to park the HW, which in
> doing so added one more element to the original list. [If we happen to
> be retiring along the kernel timeline in the first place.]

Please point to the real code and the list.

Honestly, what you describe sounds complex enough that I think your
locking is simply just buggy.

IOW, this patch seems to really just paper over a locking bug, and the
KASAN report tied to it.

Because the fundamental issue is that READ_ONCE() can not fix a bug
here. Reading the next pointer once fundamentally cannot matter if it
can change concurrently: the code is buggy, and the READ_ONCE() just
means that it gets one or the other value randomly, and that the list
walking is fundamentally racy.

On the other hand, if the next pointer _cannot_ change concurrently,
then READ_ONCE() cannot make a difference.

So as far as I can tell, we have two possibilities, and in both cases
changing the code to use READ_ONCE() is not the right thing to do. In
one case it hides a bug, and in another case it's just pointless.
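
To make the "pointless" case concrete, here is a minimal userspace
sketch, not kernel code and not the patch under discussion. The list
helpers, the READ_ONCE() stand-in (a plain volatile cast using the
GCC/Clang __typeof__ extension), both loop macros and drain() are
simplified assumptions written for illustration. With a single thread
touching the list, the plain and the READ_ONCE() walk visit exactly
the same entries:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins; names mirror the kernel's but this is a sketch. */
struct list_head { struct list_head *next, *prev; };

/* Assumed approximation of READ_ONCE(): force exactly one volatile load. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Plain 'safe' walk: next pointer is saved in n before the body runs. */
#define for_each_safe_plain(pos, n, head) \
	for (pos = (head)->next, n = pos->next; pos != (head); \
	     pos = n, n = pos->next)

/* Hypothetical variant that loads the next pointer via READ_ONCE(). */
#define for_each_safe_once(pos, n, head) \
	for (pos = (head)->next, n = READ_ONCE(pos->next); pos != (head); \
	     pos = n, n = READ_ONCE(pos->next))

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head->prev;

	new->next = head;
	new->prev = prev;
	prev->next = new;
	head->prev = new;
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = NULL;	/* the kernel poisons these instead */
}

/* Build a list of 'count' nodes, delete every node while walking, return visits. */
static int drain(int count, int use_read_once)
{
	struct list_head head, nodes[8];
	struct list_head *pos, *n;
	int visited = 0, i;

	assert(count >= 0 && count <= 8);
	list_init(&head);
	for (i = 0; i < count; i++)
		list_add_tail(&nodes[i], &head);

	if (use_read_once) {
		for_each_safe_once(pos, n, &head) {
			list_del(pos);
			visited++;
		}
	} else {
		for_each_safe_plain(pos, n, &head) {
			list_del(pos);
			visited++;
		}
	}

	assert(head.next == &head && head.prev == &head);	/* list is empty */
	return visited;
}
```

The sketch says nothing about the racy case: if another thread can
change the list during the walk, neither macro makes the walk correct.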

> list->next changed from pointing to list_head, to point to the new
> element instead. However, we don't have to check the next element yet
> and want to terminate the list iteration.

I'd really like to see the actual code that has that list walking. You say:

> For reference,
> drivers/gpu/drm/i915/gt/intel_engine_pm.c::__engine_park()

.. but that function doesn't have any locking or list-walking. Is it
the "call_idle_barriers()" loop? What is it?

I'd really like to see the KASAN report and the discussion about this change.

And if there was no discussion, then the patch just seems like "I
changed code randomly and the KASAN report went away, so it's all
good".

> Global activity is serialised by engine->wakeref.mutex; every active
> timeline is required to hold an engine wakeref, but retiring is local to
> timelines and serialised by their own timeline->mutex.
>
> lock(&timeline->lock)
> list_for_each_safe(&timeline->requests)
>   \-> i915_request_retire [list_del(&timeline->requests)]
>    \-> intel_timeline_exit
>     \-> lock(&engine->wakeref.mutex)
>         engine_park [list_add_tail(&engine->kernel_context->timeline->requests)]

In that particular list_for_each_safe() thing, there's no possibility
that the 'next' field would be reloaded, since the list_del() in the
above will be something the compiler is aware of.

So yes, the beginning of list_for_each_safe() might load it twice (or a
hundred times), but once that list_del() in i915_request_retire() has
been called, a compiler that then reloads it afterwards would have a
major bug, since the reload would come after a point where that value
could have been written by the local thread.
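
A single-threaded userspace sketch of the scenario in the quoted call
chain may help here. It is an illustration under assumptions, not the
i915 code: the list helpers are simplified re-implementations modeled
on the kernel's, and retire_demo(), req[] and extra are hypothetical
names. The body deletes the current entry and, while handling the last
one, adds a new entry to the tail of the same list. Because n was
saved before the body ran, the walk ends without visiting the new
entry:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins modeled on the kernel's list helpers. */
struct list_head { struct list_head *next, *prev; };

#define list_for_each_safe(pos, n, head) \
	for (pos = (head)->next, n = pos->next; pos != (head); \
	     pos = n, n = pos->next)

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head->prev;

	new->next = head;
	new->prev = prev;
	prev->next = new;
	head->prev = new;
}

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = NULL;	/* the kernel poisons these instead */
}

/*
 * Hypothetical demo: retire three requests; retiring the last one
 * queues one more entry on the same list, as in the quoted call chain.
 * Returns the number of entries visited and reports how many entries
 * remain on the list afterwards.
 */
static int retire_demo(int *left_on_list)
{
	struct list_head head, req[3], extra;
	struct list_head *pos, *n;
	int visited = 0, i;

	init_list_head(&head);
	for (i = 0; i < 3; i++)
		list_add_tail(&req[i], &head);

	list_for_each_safe(pos, n, &head) {
		int last = (pos->next == &head);

		list_del(pos);			/* writes head/neighbour pointers */
		if (last)
			list_add_tail(&extra, &head);	/* the "park" step */
		visited++;
	}

	/* Count what is still linked on the list. */
	*left_on_list = 0;
	for (pos = head.next; pos != &head; pos = pos->next)
		(*left_on_list)++;

	return visited;
}
```

In this sketch the only writer is the local thread, so the saved n is
all the walk needs. A second thread modifying the same list without
the lock is a different situation, and nothing here covers it.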

So this doesn't explain it to me.

What it *sounds* like is that the "engine" lock that you do *not* hold
initially, is not protecting some accessor to that list, so you have a
race on the list at the time of that list_del().

And that race may be what KASAN is reporting, and what that patch is
_hiding_ from KASAN - but not fixing.

See what I am saying and why I find this patch questionable?

There may be something really subtle going on, but it really smells
like "two threads are modifying the same list at the same time".

And there's no way that the READ_ONCE() will fix that bug, it will
only make KASAN shut up about it.

                  Linus