RE: [PATCH v2] avoid race condition between start_xmit and cm_rep_handler

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> -----Original Message-----
> From: Leon Romanovsky [mailto:leon@xxxxxxxxxx]
> Sent: Sunday, August 19, 2018 10:29 AM
> To: Aaron Knister <aaron.s.knister@xxxxxxxx>
> Cc: linux-rdma@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx; Weiny, Ira
> <ira.weiny@xxxxxxxxx>; Fleck, John <john.fleck@xxxxxxxxx>
> Subject: Re: [PATCH v2] avoid race condition between start_xmit and
> cm_rep_handler
> 
> On Fri, Aug 17, 2018 at 05:31:26PM -0400, Aaron Knister wrote:
> > Inside of start_xmit() the call to check if the connection is up and
> > the queueing of the packets for later transmission is not atomic which
> > leaves a window where cm_rep_handler can run, set the connection up,
> > dequeue pending packets and leave the subsequently queued packets by
> > start_xmit() sitting on neigh->queue until they're dropped when the
> > connection is torn down. This only applies to connected mode. These
> > dropped packets can really upset TCP, for example,  and cause
> > multi-minute delays in transmission for open connections.
> >
> > Here's the code in start_xmit where we check to see if the connection
> > is up:
> >
> >        if (ipoib_cm_get(neigh)) {
> >                if (ipoib_cm_up(neigh)) {
> >                        ipoib_cm_send(dev, skb, ipoib_cm_get(neigh));
> >                        goto unref;
> >                }
> >        }
> >
> > The race occurs if cm_rep_handler execution occurs after the above
> > connection check (specifically if it gets to the point where it
> > acquires
> > priv->lock to dequeue pending skb's) but before the below code snippet
> > in start_xmit where packets are queued.
> >
> >        if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE)
> {
> >                push_pseudo_header(skb, phdr->hwaddr);
> >                spin_lock_irqsave(&priv->lock, flags);
> >                __skb_queue_tail(&neigh->queue, skb);
> >                spin_unlock_irqrestore(&priv->lock, flags);
> >        } else {
> >                ++dev->stats.tx_dropped;
> >                dev_kfree_skb_any(skb);
> >        }
> >
> > The patch re-checks ipoib_cm_up with priv->lock held to avoid this
> > race condition. Since odds are the conn should be up most of the time
> > (and thus the connection *not* down most of the time) we don't hold
> > the lock for the first check attempt to avoid a slowdown from
> > unecessary locking for the majority of the packets transmitted during
> > the connection's life.
> >
> > Signed-off-by: Aaron Knister <aaron.s.knister@xxxxxxxx>
> 
> If you want this fix in stable@, you will need to provide Fixes line here and
> don't add stable@ in CC-list.


Fixes: b63b70d87741 ("IPoIB: Use a private hash table for path lookup in xmit path")
Cc: stable@xxxxxxxxxxxxxxx


> 
> Thanks,
> Reviewed-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux