Re: inconsistent lock state on v4.14.20-rt17

Roosen Henri <Henri.Roosen@xxxxxxxxxxxxx> · Thu, 8 Mar 2018 17:38:59 +0000

On Thu, 2018-03-08 at 16:57 +0100, bigeasy@xxxxxxxxxxxxx wrote:
> On 2018-03-07 08:26:40 [+0000], Roosen Henri wrote:
> > > The backtrace suggests that the rq lock is taken with interrupts
> > > disabled and then with interrupts enabled. But based on the call-
> > > trace
> > > it should be with interrupts disabled in both cases.
> > > I do have a imx6q running hackbench on a regular basis and I
> > > haven't
> > > seen this. Do you see this backtrace on every hackbench
> > > invocation or
> > > just after some time. The uptime suggest after ~5 hours.
> > 
> > The crashes are not seen at every invocation of hackbench, but
> > unfortunately after a few hours to after a few weeks. The test is
> > based
> > on a "dohell.sh" script from the Xenomai project.
> 
> I kicked a hackbench on a imx6 with your config.

Thanks Sebastian for looking into this!

> Is the backtrace, that you receive from lockdep, always the same or
> is
> it different sometimes?

It is different each time. So my gut feeling tells me it might be a
memory corruption of some kind.. maybe caused by a use after free or
so..

I restarted the target yesterday evening and this morning it was frozen
without any trace on the terminal. Attaching a JTAG showed different
call-stacks than yesterday; Core #2 (trying to print the info to the
terminal) and #3 were spinning on a spin-lock, don't understand what
core #0 and #1 were doing.

Core#0:
	__vectors_start() at entry-armv.S:1,234 0xffff000c	
	__pabt_svc() at entry-armv.S:319 0xc010cd84	
Core#1:
	__vectors_start() at entry-armv.S:1,234 0xffff000c	
	__pabt_svc() at entry-armv.S:319 0xc010cd84	
Core#2:
	arch_spin_lock() at spinlock.h:77 0xc017bf38	
	do_raw_spin_lock() at spinlock_debug.c:115 0xc017bf38	
	__raw_spin_lock() at spinlock_api_smp.h:143 0xc0996158	
	_raw_spin_lock() at spinlock.c:155 0xc0996158	
	vprintk_emit() at printk.c:1,804 0xc0183a8c	
	vprintk_default() at printk.c:1,906 0xc0183e8c	
	vprintk_func() at printk_safe.c:382 0xc0185764	
	printk() at printk.c:1,939 0xc0184fc4	
	__warn() at panic.c:523 0xc0123754	
	warn_slowpath_fmt() at panic.c:567 0xc0123870	
Core#3:
	arch_spin_lock() at spinlock.h:77 0xc017bf38	
	do_raw_spin_lock() at spinlock_debug.c:115 0xc017bf38	
	__raw_spin_lock() at spinlock_api_smp.h:143 0xc0996158	
	_raw_spin_lock() at spinlock.c:155 0xc0996158	
	__task_rq_lock() at core.c:105 0xc015347c	
	rt_mutex_setprio() at core.c:3,792 0xc0157148	
	rt_mutex_adjust_prio() at rtmutex.c:380 0xc017a100	
	task_blocks_on_rt_mutex() at rtmutex.c:1,374 0xc017a780	
	rt_spin_lock_slowlock_locked() at rtmutex.c:1,061 0xc09941ec
	rt_spin_lock_slowlock() at rtmutex.c:1,116 0xc099445c	
	rt_spin_lock_fastlock() at rtmutex.c:979 0xc09966c4	
	rt_spin_lock() at rtmutex.c:1,146 0xc09966c4	
	skb_queue_tail() at skbuff.c:2,915 0xc076eb08	
	unix_stream_sendmsg() at af_unix.c:1,908 0xc081dcf8	
	sock_sendmsg_nosec() at socket.c:639 0xc0760214	
	sock_sendmsg() at socket.c:643 0xc0760214	
	sock_write_iter() at socket.c:912 0xc07602b8	
	call_write_iter() at fs.h:1,771 0xc0255c68	
	new_sync_write() at read_write.c:469 0xc0255c68	
	__vfs_write() at read_write.c:482 0xc0255c68	
	vfs_write() at read_write.c:544 0xc0255e2c	
	SYSC_write() at read_write.c:589 0xc0255fd4	
	SyS_write() at read_write.c:581 0xc0255fd4	
	0xc01074e0	

Most of the times the call-stacks start at SyS_write() or SyS_read()
from hackbench.

Some things I found out by testing on v4.9:
- minimum test to reproduce problem "while true; do hackbench -g 100 -l
1000; done &"
- reproducible with "hackbench -T" (threads)
- reproducible only on iMX6Q, not (yet) on iMX6S, iMX6D
- NOT reproducible with "hackbench -p" (pipes)

As that might be pointing towards the streaming unix socketpair
hackbench is using from multiple forked processes, I had a look at
net/unix/af_unix.c and wondered why unix_stream_sendmsg() doesn't
increase the reference count on the "other" socket the same as
unix_dgram_sendmsg() does. I don't see a reason why "other" is handled
differently in both functions, so it smells fishy to me. But I'm not
familiar with the net-code, so maybe you could review if the diff below
would make sense:

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 7f46bab4ce5c..5f2ca91bc54d 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1854,7 +1854,7 @@ static int unix_stream_sendmsg(struct socket
*sock, struct msghdr *msg,
 		goto out_err;
 	} else {
 		err = -ENOTCONN;
-		other = unix_peer(sk);
+		other = unix_peer_get(sk);
 		if (!other)
 			goto out_err;
 	}
@@ -1910,7 +1910,7 @@ static int unix_stream_sendmsg(struct socket
*sock, struct msghdr *msg,
 		other->sk_data_ready(other);
 		sent += size;
 	}
-
+	sock_put(other);
 	scm_destroy(&scm);
 
 	return sent;
@@ -1923,6 +1923,8 @@ static int unix_stream_sendmsg(struct socket
*sock, struct msghdr *msg,
 		send_sig(SIGPIPE, current, 0);
 	err = -EPIPE;
 out_err:
+	if (other)
+		sock_put(other);
 	scm_destroy(&scm);
 	return sent ? : err;
 }
@@ -1934,13 +1936,13 @@ static ssize_t unix_stream_sendpage(struct
socket *socket, struct page *page,
 	bool send_sigpipe = false;
 	bool init_scm = true;
 	struct scm_cookie scm;
-	struct sock *other, *sk = socket->sk;
+	struct sock *other = NULL, *sk = socket->sk;
 	struct sk_buff *skb, *newskb = NULL, *tail = NULL;
 
 	if (flags & MSG_OOB)
 		return -EOPNOTSUPP;
 
-	other = unix_peer(sk);
+	other = unix_peer_get(sk);
 	if (!other || sk->sk_state != TCP_ESTABLISHED)
 		return -ENOTCONN;
 
@@ -2027,6 +2029,7 @@ static ssize_t unix_stream_sendpage(struct socket
*socket, struct page *page,
 	mutex_unlock(&unix_sk(other)->iolock);
 
 	other->sk_data_ready(other);
+	sock_put(other);
 	scm_destroy(&scm);
 	return size;
 
@@ -2038,6 +2041,8 @@ static ssize_t unix_stream_sendpage(struct socket
*socket, struct page *page,
 	kfree_skb(newskb);
 	if (send_sigpipe && !(flags & MSG_NOSIGNAL))
 		send_sig(SIGPIPE, current, 0);
+	if (other)
+		sock_put(other);
 	if (!init_scm)
 		scm_destroy(&scm);
 	return err;

> Sebastian

Thanks,
Henri

> ��.n��������+%������w��{.n�����{�����ǫ���ܨ}���Ơz�j:+v�����w����ޙ��&�)ߡ�a����z�ޗ���ݢj��w�f