On Wed, 2011-12-21 at 13:20 -0500, Christoph Hellwig wrote:
> On Mon, Dec 12, 2011 at 09:34:11AM -0800, Roland Dreier wrote:
> > On Mon, Dec 12, 2011 at 6:09 AM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > > [  968.116635]  [<ffffffff8175465c>] transport_processing_thread+0x13c/0x4a0
> >
> > If you look at transport_processing_thread(), you'll notice that the
> > only time it goes to sleep in wait_event is if
> > dev->dev_queue_obj.queue_cnt is 0.
> >
> > So as long as there is work being queued up, the thread will continue
> > to run, even for the 22s you saw in your soft lockup report.
>
> Indeed, you mentioned it before, but the raw_spin_unlock in the thread
> confused me.  Adding a cond_resched indeed fixed it.  Without the
> cond_resched the VM is hung.  It's a single-processor one, and thus
> doesn't manage to get anything else done until we yield.

So yeah, it makes sense that a single-processor guest runs into problems
during yield when se_device->depth_left == 0.

Anyways, since we are removing se_device TCQ depth checking altogether
for v3.3, it should be a moot point now with the following:

target: Drop se device TCQ queue depth usage from I/O path
http://git.kernel.org/?p=linux/kernel/git/nab/target-pending.git;a=commitdiff;h=65586d51e0986be574118286c3d0add

However, for v3.2-final + stable, I'm happy to accept a patch using
cond_resched() to address this case for UP usage.

Thanks,

--nab
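For reference, the cond_resched() fix being discussed would look roughly
like the sketch below. This is an illustrative kernel-code fragment, not
the actual patch; the surrounding loop structure and the exact placement
of the call are assumptions based on the description of
transport_processing_thread() above:

```
static int transport_processing_thread(void *param)
{
	struct se_device *dev = param;
	int ret;

	while (!kthread_should_stop()) {
		ret = wait_event_interruptible(dev->dev_queue_obj.thread_wq,
				atomic_read(&dev->dev_queue_obj.queue_cnt) ||
				kthread_should_stop());
		if (ret < 0)
			break;

		/* ... dequeue and process pending commands ... */

		/*
		 * On a UP guest, work can be queued as fast as it is
		 * consumed, so queue_cnt never drops to 0 and the thread
		 * never sleeps.  Yield here so other tasks get CPU time
		 * and the soft lockup watchdog does not fire.
		 */
		cond_resched();
	}
	return 0;
}
```

cond_resched() is a no-op on a fully preemptible kernel but is exactly
the right tool for a long-running kthread loop on voluntary-preemption
or non-preemptible configs, which matches the single-processor guest
symptom described in the thread.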