Re: [PATCH 2/3] PERF(kernel): Cleanup power events

* Ingo Molnar (mingo@xxxxxxx) wrote:
> 
> * Thomas Renninger <trenn@xxxxxxx> wrote:
> 
> > On Monday 25 October 2010 12:04:28 Ingo Molnar wrote:
> > > 
> > > * Thomas Renninger <trenn@xxxxxxx> wrote:
> > > 
> > > > New power trace events:
> > > > power:processor_idle
> > > > power:processor_frequency
> > > > power:machine_suspend
> > > > 
> > > > 
> > > > C-state/idle accounting events:
> > > >   power:power_start
> > > >   power:power_end
> > > > are replaced with:
> > > >   power:processor_idle
> > > 
> > > Well, most power saving hw models (and the code implementing them) have this kind of 
> > > model:
> > > 
> > >  enter power saving mode X
> > >  exit power saving mode
> > > 
> > > Where X is some sort of 'power saving deepness' attribute, right?
> >
> > Sure.
> 
> Which is the 'saner' model?
> 
> > But ACPI and afaik this model got picked up for PCI and other (sub-)archs as well, 
> > defines state 0 as the non-power saving mode.
> 
> But the actual code does not actually deal with any 'state 0', does it? It enters an 
> idle function and then exits it, right?
> 
> 'power state' might be what is used for devices - but even there, we have:
> 
>   - enter power state X
>   - exit power state
> 
> right?
> 
> > Same as done here with machine suspend state (S0 is back from suspend) and
> > this model should get picked up when device sleep states get tracked at
> > some time.
> >
> > It's consistent and applies to some well known specifications.
> 
> What we want it to be is for it to be the nicest, most understandable, most logical 
> model - not one matching random hardware specifications.
> 
> ( Hardware specifications only matter in so far that it should be possible to 
>   express all the known hardware state transitions via these events efficiently. )
> 
> > Also tracking processor_idle_{start,end} as a separate event makes no sense and 
> > there is no need to introduce: processor_idle_start/processor_idle_end 
> > machine_suspend_start/machine_suspend_end 
> > device_power_mode_start/device_power_mode_end events.
> 
> What do you mean by "makes no sense"?
> 
> Are they superfluous? Inefficient? Illogical?

I think it would require a deep understanding of the specific power modes of each
architecture to split the events into this topology. On the bright side, it would
make clear which HW resource is being put to sleep, which would make
automated analysis much easier. But maybe it's too much pain compared to
the benefit. A related question is: where is it best to put this logic?
In the kernel code? In per-arch TRACE_EVENT() handlers? Or in external trace
analysis plugins?

> 
> > Using state 0 as "exit/end", is much nicer for kernel/ userspace 
> > implementations/code and the user.
> 
> By that argument we should not have separate fork() and exit() syscalls either, but 
> a set_process_state(1) and set_process_state(0) interface?

I'm by no means an expert on power saving hardware specs, but if it is possible for
hardware to switch between two power saving states without passing through power
state 0, then using a "set state" event rather than an enter/exit pair would be more
appropriate, even if we go for a scheme introducing

processor_idle_start/processor_idle_end,
machine_suspend_start/machine_suspend_end,
device_power_mode_start/device_power_mode_end.

I must defer to you guys to figure out whether any hardware actually does that for
any of CPU idle, suspend or device power modes.
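
To illustrate the point, here is a rough sketch of how an external analysis
tool might compute per-state residency from "set state" events. The
(timestamp, cpu, state) tuple layout and the residency() helper are
hypothetical, made up for this example; they are not the actual trace format.
State 0 is assumed to mean "not in a power saving state".

```python
from collections import defaultdict


def residency(events):
    """Sum the time each CPU spends in each state.

    events: iterable of (timestamp, cpu, state) tuples, ordered by time.
    This layout is hypothetical; state 0 is assumed to mean "not idle".
    Returns a dict mapping (cpu, state) to the total time spent there.
    """
    current = {}                # cpu -> (state, timestamp it was entered)
    totals = defaultdict(int)
    for ts, cpu, state in events:
        if cpu in current:
            prev_state, since = current[cpu]
            # Any new "set state" event closes the previous interval,
            # whether or not the new state is 0.
            totals[(cpu, prev_state)] += ts - since
        current[cpu] = (state, ts)
    # The last interval of each CPU is still open and is not counted.
    return dict(totals)


# CPU 0 goes from state 2 straight to state 3 without passing through 0.
events = [(0, 0, 0), (10, 0, 2), (30, 0, 3), (70, 0, 0), (80, 0, 1)]
totals = residency(events)
```

With a "set state" stream the direct 2 -> 3 transition needs no special
handling. With enter/exit pairs, the same transition would need either an
extra exit event or an analyzer that tolerates an enter without a prior exit.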

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com