Re: [PATCH] mm,page_alloc: Serialize warn_alloc() if schedulable.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

On (06/03/17 17:36), Tetsuo Handa wrote:
[..]
> > Tetsuo is arguing that the locking will throttle warn_alloc callers and
> > that can help other processes to move on. I would call it papering over
> > a real issue which might be somewhere else and that is why I push back so
> > hard. The initial report is far from complete and seeing 30+ seconds
> > stalls without any indication that this is just a repeating stall after
> > 10s and 20s suggests that we got stuck somewhere in the reclaim path.
> 
> That timestamp jump is caused by the fact that log_buf writers are consuming
> more CPU times than log_buf readers can consume. If I leave that situation
> more, printk() just starts printing "** %u printk messages dropped ** " line.

hhmm... sorry, not sure I see how printk() would affect timer ticks. unless
you do printing from timer IRQs, or always in deferred printk() mode, which
runs from timer IRQ... timestamps are assigned at the moment we add a new
message to the logbuf, not when we print it. so slow serial console really
should not affect it. unless I'm missing something.

	vprintk_emit()
	{
		logbuf_lock_irqsave(flags);

		log_output(facility, level, lflags, dict, dictlen, text, text_len);
		{
			log_store()
			{
				msg->ts_nsec = local_clock();
						^^^^^^^^^^^^^
			}
		}

		logbuf_unlock_irqrestore(flags);

		if (console_trylock())
			console_unlock();
	}


I don't think vprintk_emit() was spinning on logbuf_lock_irqsave(),
you would have seen spinlock lockup reports otherwise. in console_unlock()
logbuf lock is acquired only to pick the first pending messages and,
basically, do memcpy() to a static buffer. we don't call "slow console
drivers" with the logbuf lock taken. so other CPUs are free/welcome to
append new messages to the logbuf in the meantime (and read accurate
local_clock()).

so if you see spikes in messages' timestamps it's most likely because
there was something between printk() calls that kept the CPU busy.

/* or you had a ton of printk calls from other CPUs with noisy loglevels
   that were suppressed later in console_unlock(), see later. */

... well, serial consoles can be slow, sure.


> Notice the timestamp jump between [  351.239144] and [  389.308085].

do you have a restrictive console loglevel and a ton of messages that
were simply filtered out by console loglevel check? we still store those
messages to the logbuf (for dmesg, etc.) and process them in console_unlock(),
but don't print to the serial console. so, in other words,

logbuf:

	timestamp T0	message M0		-- visible loglevel
	timestamp T1	message M1  		-- suppressed loglevel
	....
	timestamp T100	message M101		-- suppressed loglevel
	timestamp T101	message M102		-- visible loglevel

on the serial console you'll see

	T0	M0
	T101	M102

which might look like a spike in timestamps (while there weren't any).
just a thought.

	-ss

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]
  Powered by Linux