inline answer. On Thu, Aug 18, 2016 at 1:42 AM, Guillaume Nault <g.nault@xxxxxxxxxxxx> wrote: > On Tue, Aug 16, 2016 at 07:33:38PM +0800, fgao@xxxxxxxxxx wrote: >> From: Gao Feng <fgao@xxxxxxxxxx> >> >> PPP channel holds one spinlock before send frame. But the skb may >> select the same PPP channel with wrong route policy. As a result, >> the skb reaches the same channel path. It tries to get the same >> spinlock which is held before. Bang, the deadlock comes out. >> > Unless I misunderstood the problem you're trying to solve, this patch > doesn't really help: deadlock still occurs if the same IP is used for > L2TP and PPP's peer address. > The deadlock happens because the same cpu try to hold the spinlock which is already held before by itself. Now the PPP_CHANNEL_LOCK_BH sets the lock owner after hold lock, then when the same cpu invokes PPP_CHANNEL_LOCK_BH again. The cl.owner equals current cpu id, so it only increases the lock_cnt without trying to hold the lock again. So it avoids the deadlock. >> Now add one lock owner to avoid it like xmit_lock_owner of >> netdev_queue. Check the lock owner before try to get the spinlock. >> If the current cpu is already the owner, needn't lock again. When >> PPP channel holds the spinlock at the first time, it sets owner >> with current CPU ID. >> > I think you should forbid lock recursion entirely, and drop the packet > if the owner tries to re-acquire the channel lock. Otherwise you just > move the deadlock down the stack (l2tp_xmit_skb() can't be called > recursively). The reason that fix it in ppp is that there are other layer on the ppp module. We resolve it in ppp module, it could avoid all similar potential issues. > >> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c >> index 70cfa06..6909ab1 100644 >> --- a/drivers/net/ppp/ppp_generic.c >> +++ b/drivers/net/ppp/ppp_generic.c >> @@ -162,6 +162,37 @@ struct ppp { >> |SC_MULTILINK|SC_MP_SHORTSEQ|SC_MP_XSHORTSEQ \ >> |SC_COMP_TCP|SC_REJ_COMP_TCP|SC_MUST_COMP) >> >> +struct chennel_lock { > ^ > I guess you meant 'channel_lock'. Yes. It is a typo. Thanks. > >> + spinlock_t lock; >> + u32 owner; > This field's default value is -1. So why not declaring it as 'int'? OK. I will fix it. Best Regards Feng -- To unsubscribe from this list: send the line "unsubscribe linux-ppp" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html