Wei Yongjun wrote:
>
> Vlad Yasevich wrote:
>> Wei Yongjun wrote:
>>
>>> sk->sk_data_ready() of sctp socket can be called from both BH and non-BH
>>> contexts, but the default sk->sk_data_ready(), sock_def_readable(), can
>>> not be used in this case. Therefore, we have to make a new function
>>> sctp_data_ready() to grab sk->sk_data_ready() with BH disabling.
>>>
>>>
>> Wouldn't the same inversion happen in TCP as well? TCP can call that
>> function in _bh and user contexts as well.
>>
>
> Not sure, but TCP does not call that function in user context at all.
>

Wei

Can you trigger this problem with this patch applied?

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 3a95fcb..dabdc50 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -3717,9 +3717,9 @@ SCTP_STATIC int sctp_init_sock(struct sock *sk)
 	sp->hmac = NULL;

 	SCTP_DBG_OBJCNT_INC(sock);
-	percpu_counter_inc(&sctp_sockets_allocated);

 	local_bh_disable();
+	percpu_counter_inc(&sctp_sockets_allocated);
 	sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
 	local_bh_enable();

-vlad

>> -vlad
>>
>>
>>> =========================================================
>>> [ INFO: possible irq lock inversion dependency detected ]
>>> 2.6.33-rc6 #129
>>> ---------------------------------------------------------
>>> sctp_darn/1517 just changed the state of lock:
>>>  (clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
>>> but this lock took another, SOFTIRQ-unsafe lock in the past:
>>>  (slock-AF_INET){+.-...}
>>>
>>> and interrupts could create inverse lock ordering between them.
>>>
>>> other info that might help us debug this:
>>> 1 lock held by sctp_darn/1517:
>>>  #0: (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]
>>>
>>> Signed-off-by: Wei Yongjun <yjwei@xxxxxxxxxxxxxx>
>>> ---
>>>  include/net/sctp/sctp.h |    1 +
>>>  net/sctp/endpointola.c  |    1 +
>>>  net/sctp/socket.c       |   10 ++++++++++
>>>  3 files changed, 12 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
>>> index 78740ec..fa6cde5 100644
>>> --- a/include/net/sctp/sctp.h
>>> +++ b/include/net/sctp/sctp.h
>>> @@ -128,6 +128,7 @@ extern int sctp_register_pf(struct sctp_pf *, sa_family_t);
>>>  int sctp_backlog_rcv(struct sock *sk, struct sk_buff *skb);
>>>  int sctp_inet_listen(struct socket *sock, int backlog);
>>>  void sctp_write_space(struct sock *sk);
>>> +void sctp_data_ready(struct sock *sk, int len);
>>>  unsigned int sctp_poll(struct file *file, struct socket *sock,
>>>  		poll_table *wait);
>>>  void sctp_sock_rfree(struct sk_buff *skb);
>>> diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
>>> index 905fda5..7ec09ba 100644
>>> --- a/net/sctp/endpointola.c
>>> +++ b/net/sctp/endpointola.c
>>> @@ -144,6 +144,7 @@ static struct sctp_endpoint *sctp_endpoint_init(struct sctp_endpoint *ep,
>>>  	/* Use SCTP specific send buffer space queues. */
>>>  	ep->sndbuf_policy = sctp_sndbuf_policy;
>>>
>>> +	sk->sk_data_ready = sctp_data_ready;
>>>  	sk->sk_write_space = sctp_write_space;
>>>  	sock_set_flag(sk, SOCK_USE_WRITE_QUEUE);
>>>
>>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
>>> index 67fdac9..b437e2a 100644
>>> --- a/net/sctp/socket.c
>>> +++ b/net/sctp/socket.c
>>> @@ -6185,6 +6185,16 @@ do_nonblock:
>>>  	goto out;
>>>  }
>>>
>>> +void sctp_data_ready(struct sock *sk, int len)
>>> +{
>>> +	read_lock_bh(&sk->sk_callback_lock);
>>> +	if (sk_has_sleeper(sk))
>>> +		wake_up_interruptible_sync_poll(sk->sk_sleep, POLLIN |
>>> +						POLLRDNORM | POLLRDBAND);
>>> +	sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
>>> +	read_unlock_bh(&sk->sk_callback_lock);
>>> +}
>>> +
>>>  /* If socket sndbuf has changed, wake up all per association waiters. */
>>>  void sctp_write_space(struct sock *sk)
>>>  {
>>>
>>
>>
>