Possible use_mm() mis-uses

torvalds@xxxxxxxxxxxxxxxxxxxx (Linus Torvalds) · Wed, 22 Aug 2018 12:58:04 -0700

On Wed, Aug 22, 2018 at 12:37 PM Oded Gabbay <oded.gabbay at gmail.com> wrote:
>
> Having said that, I think we *are* protected by the mmu_notifier
> release because if the process suddenly dies, we will gracefully clean
> the process's data in our driver and on the H/W before returning to
> the mm core code. And before we return to the mm core code, we set the
> mm pointer to NULL. And the graceful cleaning should be serialized
> with the load_hqd uses.

So I'm a bit nervous about the mmu_notifier model (and the largely
equivalent exit_aio() model for the USB gardget AIO uses).

The reason I'm nervous about it is that the mmu_notifier() gets called
only after the mm_users count has already been decremented to zero
(and the exact same thing goes for exit_aio()).

Now that's fine if you actually get rid of all accesses in
mmu_notifier_release() or in exit_aio(), because the page tables still
exist at that point - they are in the process of being torn down, but
they haven't been torn down yet.

But for something like a kernel thread doing use_mm(), the thing that
worries me is a pattern something like this:

  kwork thread          exit thread
  --------              --------

                        mmput() ->
                          mm_users goes to zero

  use_mm(mmptr);
  ..

                          mmu_notifier_release();
                          exit_mm() ->
                            exit_aio()

and the pattern is basically the same regatdless of whether you use
mmu_notifier_release() or depend on some exit_aio() flushing your aio
work: the use_mm() can be called with a mm that has already had its
mm_users count decremented to zero, and that is now scheduled to be
free'd.

Does it "work"? Yes. Kind of. At least if the mmu notifier and/or
exit_aio() actually makes sure to wait for any kwork thread thing. But
it's a bit of a worrisome pattern.

           Linus