Re: ppoll() stuck on POLLIN while TCP peer is sending

Mel Gorman <mgorman@xxxxxxx> · Fri, 4 Jan 2013 16:01:48 +0000

On Wed, Jan 02, 2013 at 08:08:48PM +0000, Eric Wong wrote:
> (changing Cc:)
> 
> Eric Wong <normalperson@xxxxxxxx> wrote:
> > I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a
> > local TCP socket.  The isolated code below can reproduce the issue
> > after many minutes (<1 hour).  It might be easier to reproduce on
> > a busy system while disk I/O is happening.
> 
> s/might be/is/
> 
> Strangely, I've bisected this seemingly networking-related issue down to
> the following commit:
> 
>   commit 1fb3f8ca0e9222535a39b884cb67a34628411b9f
>   Author: Mel Gorman <mgorman@xxxxxxx>
>   Date:   Mon Oct 8 16:29:12 2012 -0700
> 
>       mm: compaction: capture a suitable high-order page immediately when it is made available
> 
> That commit doesn't revert cleanly on v3.7.1, and I don't feel
> comfortable touching that code myself.
> 

That patch introduced an accounting bug that was corrected by ef6c5be6
(fix incorrect NR_FREE_PAGES accounting (appears like memory leak)). In
some cases that could look like a hang and potentially confuse a bisection.

That said, I see you report that 3.7.1 and 3.8-rc2 are affected, both of
which include that fix, and the finger is pointed at compaction, so
something is wrong.

> Instead, I disabled THP+compaction under v3.7.1 and I've been unable to
> reproduce the issue without THP+compaction.
> 

That implies it's stuck in compaction somewhere. It could also be the
case that compaction alters timing enough to trigger another bug. You say
it tests differently depending on whether TCP or unix sockets are used,
which might indicate multiple problems. However, let's try to see whether
compaction is the primary problem or not.

> As I mention in http://mid.gmane.org/20121229113434.GA13336@xxxxxxxxxxxxx
> I run my below test (`toosleepy') with heavy network and disk activity
> for a long time before hitting this.
> 

Using a 3.7.1 or 3.8-rc2 kernel, can you reproduce the problem and then
answer the following questions please?

1. What are the contents of /proc/vmstat at the time it is stuck?

2. What are the contents of /proc/PID/stack for every toosleepy
   process when they are stuck?

3. Can you do a sysrq+m and post the resulting dmesg?
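
If it helps, the three items above could be gathered in one go with
something like the following. This is only a rough sketch: the helper
names (find_pids, collect, trigger_sysrq_m) are made up for illustration,
it assumes the test binary shows up with a comm of "toosleepy", and both
reading /proc/PID/stack and writing /proc/sysrq-trigger normally need root.

```python
import os


def find_pids(comm, proc_root="/proc"):
    """Return the sorted PIDs whose /proc/PID/comm matches comm exactly."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "vmstat" or "self"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == comm:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return sorted(pids)


def read_file(path):
    """Return the file contents, or a short note if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError as e:
        return "<unreadable: %s>\n" % e


def collect(comm="toosleepy", proc_root="/proc"):
    """Build a text report: vmstat plus the kernel stack of each match."""
    parts = ["==> vmstat\n", read_file(os.path.join(proc_root, "vmstat"))]
    for pid in find_pids(comm, proc_root):
        parts.append("==> stack of pid %d\n" % pid)
        parts.append(read_file(os.path.join(proc_root, str(pid), "stack")))
    return "".join(parts)


def trigger_sysrq_m(path="/proc/sysrq-trigger"):
    """Write 'm' to the sysrq trigger so the kernel logs memory info.

    The output lands in the kernel log, so run dmesg afterwards.
    """
    with open(path, "w") as f:
        f.write("m")
```

Run print(collect()) as root while the processes are stuck, then call
trigger_sysrq_m() and save the dmesg output alongside the report.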

What I'm looking for is a throttling bug (if pgscan_direct_throttle is
elevated), an isolated page accounting bug (nr_isolated_* is elevated
and process is stuck in congestion_wait in a too_many_isolated() loop)
or a free page accounting bug (big difference between nr_free_pages and
buddy list figures).
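
To make the three checks above concrete, here is a minimal sketch of how
the figures could be pulled out of /proc/vmstat and /proc/buddyinfo. The
function names are invented for this example, and it deliberately does not
decide what counts as "elevated" or a "big difference"; that still needs a
human looking at the numbers. Small differences between nr_free_pages and
the buddy total may be expected, since the two files are read at different
times.

```python
def parse_vmstat(text):
    """Parse 'name value' lines from /proc/vmstat into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            stats[fields[0]] = int(fields[1])
    return stats


def buddy_free_pages(text):
    """Sum free pages over all zones listed in /proc/buddyinfo text.

    Each line looks like 'Node 0, zone   Normal  c0 c1 c2 ...', where
    cN is the number of free blocks of order N (2**N pages each).
    """
    total = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "Node":
            continue
        for order, count in enumerate(fields[4:]):
            total += int(count) << order
    return total


def summarize(vmstat_text, buddyinfo_text):
    """Return the figures relevant to the three suspected bug classes."""
    vm = parse_vmstat(vmstat_text)
    # nr_isolated_anon + nr_isolated_file (any nr_isolated_* counter)
    isolated = sum(v for k, v in vm.items() if k.startswith("nr_isolated_"))
    free = vm.get("nr_free_pages", 0)
    buddy = buddy_free_pages(buddyinfo_text)
    return {
        "pgscan_direct_throttle": vm.get("pgscan_direct_throttle", 0),
        "nr_isolated_total": isolated,
        "nr_free_pages": free,
        "buddy_free_pages": buddy,
        "free_minus_buddy": free - buddy,
    }
```

Feed it the contents of /proc/vmstat and /proc/buddyinfo captured while
the test is stuck, e.g. summarize(open("/proc/vmstat").read(),
open("/proc/buddyinfo").read()).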

I'll try reproducing this early next week if none of that shows an
obvious candidate.

Thanks.

-- 
Mel Gorman
SUSE Labs
