Re: [PATCH for-rc 2/2] IB/mlx5: Fix how advise_mr() launches async work

Moni Shoua <monis@xxxxxxxxxxxx> · Sun, 13 Jan 2019 19:19:22 +0200

On Sat, Jan 12, 2019 at 8:42 PM Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote:
>
> On Sat, Jan 12, 2019 at 11:06:51AM -0700, Parav Pandit wrote:
> >
> >
> > > From: Jason Gunthorpe
> > > Sent: Saturday, January 12, 2019 11:39 AM
> > > To: Parav Pandit <parav@xxxxxxxxxxxx>
> > > Cc: Leon Romanovsky <leon@xxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; Moni
> > > Shoua <monis@xxxxxxxxxxxx>
> > > Subject: Re: [PATCH for-rc 2/2] IB/mlx5: Fix how advise_mr() launches async
> > > work
> > >
> > > On Sat, Jan 12, 2019 at 10:32:54AM -0700, Parav Pandit wrote:
> > > >
> > > >
> > > > > From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma-
> > > > > owner@xxxxxxxxxxxxxxx> On Behalf Of Jason Gunthorpe
> > > > > Sent: Saturday, January 12, 2019 11:06 AM
> > > > > To: Leon Romanovsky <leon@xxxxxxxxxx>
> > > > > Cc: linux-rdma@xxxxxxxxxxxxxxx; Moni Shoua <monis@xxxxxxxxxxxx>
> > > > > Subject: Re: [PATCH for-rc 2/2] IB/mlx5: Fix how advise_mr()
> > > > > launches async work
> > > > >
> > > > > On Sat, Jan 12, 2019 at 05:06:02PM +0200, Leon Romanovsky wrote:
> > > > > > On Fri, Jan 11, 2019 at 07:31:24PM -0700, Jason Gunthorpe wrote:
> > > > > > > From: Jason Gunthorpe <jgg@xxxxxxxxxxxx>
> > > > > > >
> > > > > > > Work must hold a kref on the ib_device otherwise the dev pointer
> > > > > > > can become free before the work runs.
> > > > > >
> > > > > > How can it occur exactly?
> > > > > >
> > > > > > "dev" will be released after mlx5_ib will exist, mlx5_ib exist
> > > > > > will drain advise_mr queue before it. Such drain will ensure that
> > > > > > no works are running.
> > > > >
> > > > > Okay, that is fine for the get_device, it can be removed.
> > > > >
> > > > > Are we OK with the reg_state hunk?
> > > > >
> > > > > Jason
> > > >
> > > > Unrelated to your question, I noticed that advise_mr does
> > > > uverbs_attr_ptr_get_array_size() for the SGEs.  I think this is
> > > > incorrect.  It should have been mr handles whose ownership can be
> > > > check against the PID or ucontext that registered those MRs.
> > >
> > > The SGE's have lkeys in them, and the lkeys are local to the PD passed in as
> > > argument.
> > >
> > > So the caller cannot touch resources outside the PD, hopefully the driver
> > > implementation checks this..
> > >
> > pagefault_single_data_segment () invokes __mlx5_mr_lookup() to
> > lookup a MR per device and not per PD.  mlx5_ib_advise_mr_prefetch()
> > doesn't touch pd.  work item doesn't have PD in it.
>
> Ugh, Moni?
>
> I don't think we should alow processes to page fault outside their PD,
> that sounds like an avenue for a side-channel security problem.
>

I can pass pd from prefetch_mr down to pagefault_single_data_segment()
and force that that lookup will be also for pd.
However there is another flow that goes through
pagefault_single_data_segment()  that starts from hardware page-fault.
What will be the pd in this case? Perhaps the pd of the
source/destination QP of the packet that generated the pagefault? But
I can do this
only if the type of page-fault is WQE. In this case the hardware
provides the QPN. For page-fault of type RDMA the hardware doesn't
provide the QPN, only the lkey, address and size (dev is the one that
fired the event).
So if there is a potential security risk in the advise_mr flow doesn't
it exist in the page-fault flow? A malicious packet can do the same
thing a malicious process does.
What do you think?

> > After that MR got reallocated to other process before work item can
> > do the work and now it fault's some other process.
>
> The locking around mr_srcu and MR lifetime is all broken, so this is
> not too surprising..
>
> > To avoid this, instead of overloading dev->mr_srcu, this work item
> > should be enqueued to advise_mr_wq and that should be flushed during
> > destroy_mr() which ensures that nothing dangling for this MR.  Such
> > code will create right synchronization without depending on
> > randomness of key, number of MR etc parameters.
>
> Moni?
>
> Jason