On Mon, Sep 28, 2020 at 1:34 AM Mark Mielke <mark.mielke@xxxxxxxxx> wrote: > > Kernel may fail to boot or devices may fail to come up when > initializing iscsi_tcp devices starting with Linux 5.8. > > Marc Dionne identified the cause in RHBZ#1877345. > > Commit a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param > libiscsi function") introduced getpeername() within the session spinlock. > > Commit 1b66d253610c ("bpf: Add get{peer, sock}name attach types for > sock_addr") introduced BPF_CGROUP_RUN_SA_PROG_LOCK() within getpeername, > which acquires a mutex and when used from iscsi_tcp devices can now lead > to "BUG: scheduling while atomic:" and subsequent damage. > > This commit ensures that the spinlock is released before calling > getpeername() or getsockname(). sock_hold() and sock_put() are > used to ensure that the socket reference is preserved until after > the getpeername() or getsockname() complete. > > Reported-by: Marc Dionne <marc.c.dionne@xxxxxxxxx> > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1877345 > Link: https://lkml.org/lkml/2020/7/28/1085 > Link: https://lkml.org/lkml/2020/8/31/459 > Fixes: a79af8a64d39 ("[SCSI] iscsi_tcp: use iscsi_conn_get_addr_param libiscsi function") > Fixes: 1b66d253610c ("bpf: Add get{peer, sock}name attach types for sock_addr") > Signed-off-by: Mark Mielke <mark.mielke@xxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > --- > drivers/scsi/iscsi_tcp.c | 22 +++++++++++++++------- > 1 file changed, 15 insertions(+), 7 deletions(-) > > diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c > index b5dd1caae5e9..d10efb66cf19 100644 > --- a/drivers/scsi/iscsi_tcp.c > +++ b/drivers/scsi/iscsi_tcp.c > @@ -736,6 +736,7 @@ static int iscsi_sw_tcp_conn_get_param(struct iscsi_cls_conn *cls_conn, > struct iscsi_tcp_conn *tcp_conn = conn->dd_data; > struct iscsi_sw_tcp_conn *tcp_sw_conn = tcp_conn->dd_data; > struct sockaddr_in6 addr; > + struct socket *sock; > int rc; > > switch(param) { > @@ -747,13 +748,17 @@ static int iscsi_sw_tcp_conn_get_param(struct iscsi_cls_conn *cls_conn, > spin_unlock_bh(&conn->session->frwd_lock); > return -ENOTCONN; > } > + sock = tcp_sw_conn->sock; > + sock_hold(sock->sk); > + spin_unlock_bh(&conn->session->frwd_lock); > + > if (param == ISCSI_PARAM_LOCAL_PORT) > - rc = kernel_getsockname(tcp_sw_conn->sock, > + rc = kernel_getsockname(sock, > (struct sockaddr *)&addr); > else > - rc = kernel_getpeername(tcp_sw_conn->sock, > + rc = kernel_getpeername(sock, > (struct sockaddr *)&addr); > - spin_unlock_bh(&conn->session->frwd_lock); > + sock_put(sock->sk); > if (rc < 0) > return rc; > > @@ -775,6 +780,7 @@ static int iscsi_sw_tcp_host_get_param(struct Scsi_Host *shost, > struct iscsi_tcp_conn *tcp_conn; > struct iscsi_sw_tcp_conn *tcp_sw_conn; > struct sockaddr_in6 addr; > + struct socket *sock; > int rc; > > switch (param) { > @@ -789,16 +795,18 @@ static int iscsi_sw_tcp_host_get_param(struct Scsi_Host *shost, > return -ENOTCONN; > } > tcp_conn = conn->dd_data; > - > tcp_sw_conn = tcp_conn->dd_data; > - if (!tcp_sw_conn->sock) { > + sock = tcp_sw_conn->sock; > + if (!sock) { > spin_unlock_bh(&session->frwd_lock); > return -ENOTCONN; > } > + sock_hold(sock->sk); > + spin_unlock_bh(&session->frwd_lock); > > - rc = kernel_getsockname(tcp_sw_conn->sock, > + rc = kernel_getsockname(sock, > (struct sockaddr *)&addr); > - spin_unlock_bh(&session->frwd_lock); > + sock_put(sock->sk); > if (rc < 0) > return rc; > > -- > 2.28.0 > Works for me and prevents the iscsid crash. Tested-by: Marc Dionne <marc.c.dionne@xxxxxxxxx>