Re: S3 resume regression [1cf4f629d9d2 ("cpu/hotplug: Move online calls to hotplugged cpu")]

On Fri, Oct 28, 2016 at 08:58:41PM +0200, Thomas Gleixner wrote:
> On Fri, 28 Oct 2016, Ville Syrjälä wrote:
> > On Thu, Oct 27, 2016 at 10:41:18PM +0200, Thomas Gleixner wrote:
> > > On Thu, 27 Oct 2016, Ville Syrjälä wrote:
> > > > On Thu, Oct 27, 2016 at 09:25:05PM +0200, Thomas Gleixner wrote:
> > > > > So it would be interesting whether that hunk in resume_broadcast() is
> > > > > sufficient.
> > > > 
> > > > So far it looks like the answer is yes.
> > > > 
> > > > Looks to be about 5 seconds slower than acpi-idle in resuming, but
> > > > I suppose that's not all that surprising ;)
> > > 
> > > Well, set it to 1msec then. If that works reliably then we really can do
> > > that unconditionally. There is no harm in firing a useless timer during
> > > resume once.
> > 
> > I narrowed down the required timeout, and looks like 25ms is the
> > minimum that works. With 24ms I already started to have failures. So
> > maybe just bump it up by an order of magnitude to 250ms for some
> > safety margin?

I left the thing running for the weekend and it failed 26 out of 16057
times with the 25ms timeout. Looks like it takes ~5 minutes to resume
when it fails, but eventually it does come back.
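
To put those numbers in perspective, here is a minimal sketch of the
arithmetic in plain userspace C. The helper names are made up for
illustration; only the 26 and 16057 come from the test run above.

```c
#include <assert.h>

/* Percentage of resume cycles that failed. */
double failure_rate_percent(unsigned long failures, unsigned long runs)
{
	assert(runs > 0);
	return 100.0 * (double)failures / (double)runs;
}

/* Average number of suspend/resume cycles per observed failure. */
double cycles_per_failure(unsigned long failures, unsigned long runs)
{
	assert(failures > 0);
	return (double)runs / (double)failures;
}
```

Calling failure_rate_percent(26, 16057) gives roughly 0.16%, and
cycles_per_failure(26, 16057) gives roughly one failure per 618 cycles,
so a short test run could easily miss the problem at 25ms.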

> 
> Sure, but what puzzles me is that we need a timeout that big. What happens
> between broadcast_resume() and broadcast_resume() + 25ms?
> 
> IOW, what is the event/resume function which we need to bridge. We should
> really try to track that down.

My hunch would be the SMM trap in the DSDT/SSDT, since that's where
things ended up last time I was tracing these resume problems. Though I
can't recall if that was just with acpi-idle or if intel_idle landed in
the same spot as well.

I guess I can try to repeat that test tomorrow, or I'll try your function
tracer method if the other thing fails.

> 
> You might try to enable function tracing and do a tracing_off() when that
> 25ms timeout fires.
> 
> Something like 
> 
> 	stop_trace = true;
> 
> in broadcast_resume() and then in the broadcast timer function:
> 
> 	if (stop_trace) {
> 		stop_trace = false;
> 		tracing_off();
> 	}
> 
> Then when the machine is up read the trace, compress and upload it
> somewhere or send it in private mail if it's not that big.
> 
> Thanks,
> 
> 	tglx
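
For reference, the one-shot latch suggested above can be modelled in
userspace roughly as follows. This is only a sketch under assumptions:
tracing_off() is stubbed out here (in the kernel it is the real tracing
API), and sim_broadcast_resume(), sim_broadcast_timer_fn() and the two
bookkeeping variables are hypothetical stand-ins, not the actual tick
broadcast code.

```c
#include <stdbool.h>

/* Latch armed in the resume path, consumed by the first timer expiry. */
bool stop_trace;

/* Bookkeeping for the simulation only; not present in the kernel. */
bool tracing_enabled = true;
unsigned int tracing_off_calls;

/* Userspace stub standing in for the kernel's tracing_off(). */
void tracing_off(void)
{
	tracing_enabled = false;
	tracing_off_calls++;
}

/* Stand-in for the resume hook: arm the latch. */
void sim_broadcast_resume(void)
{
	stop_trace = true;
}

/* Stand-in for the broadcast timer function: stop tracing exactly once. */
void sim_broadcast_timer_fn(void)
{
	if (stop_trace) {
		stop_trace = false;
		tracing_off();
	}
}
```

Clearing the flag before calling tracing_off() means later timer
expiries leave the trace buffer alone, so the buffer ends at the first
expiry after resume, which is the window of interest.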


-- 
Ville Syrjälä
Intel OTC