Re: [IPoIB] Missing join mcast events causing full machine lockup

Doug Ledford <dledford@xxxxxxxxxx> · Tue, 02 Aug 2016 15:21:12 -0400

On Thu, 2016-07-21 at 10:31 +0300, Nikolay Borisov wrote:
> Hello, 
> 
> With running the risk of sounding like a broken record, I came
> across 
> another case where ipoib can cause the machine to go haywire due to 
> missed join requests. This is on 4.4.14 kernel. Here is what I
> believe 
> happens:

[ snip long traces ]

> This makes me wonder if using timeouts is actually better than
> blindly relying on completing the join.

Blindly relying on the join completions is not what we do.  We are very
careful to make sure we always have the right locking so that we never
leave a join request in the BUSY state without running the completion
at some time.  If you are seeing us do that, then it means we have a
bug in our locking or state processing.  The answer then is to find
that bug and not to paper over it with a timeout.  Can you find some
way to reproduce this with a 4.7 kernel?

>  So Doug, what would
> you say about the following as a proposed fix (not tested):
> 
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> index 87799de90a1d..f6f15d36b02d 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
> @@ -947,7 +947,7 @@ void ipoib_mcast_restart_task(struct work_struct
> *work)
>          */
>         list_for_each_entry_safe(mcast, tmcast, &remove_list, list)
>                 if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
> -                       wait_for_completion(&mcast->done);
> +                       wait_for_completion_timeout(&mcast->done, 30
> * HZ);
>  
>         list_for_each_entry_safe(mcast, tmcast, &remove_list, list) {
>                 ipoib_mcast_leave(mcast->dev, mcast);
> 
> Given the loop afterwards which uses ipoib_mcast_(leave_free) that
> should work?
> Looking at the code in ipoib_mcast_leave it seems we are going to
> trigger a warning, 
> which is preferable to putting the machine to a grinding halt? 
> 
> Does the proposed patch break things horribly ?

It violates the intent of the join processing.  And if we have the
problem you are seeing, we really need to know if it's broken in IPoIB
or deeper down in the core portion of the stack.  Breaking out and
continuing might be OK, but if we do, we are likely going to either
leak something or have a use-after-free or something like that, so I
would have to spend some time thinking about how things might go wrong
and whether or not it's better to stop the machine when this happens,
or continue and hope we don't corrupt memory somehow.

-- 
Doug Ledford <dledford@xxxxxxxxxx>
              GPG KeyID: 0E572FDD
Attachment:
signature.asc

Description: This is a digitally signed message part