On Mon, Oct 15, 2012 at 9:28 AM, Shawn Guo <shawn.guo@xxxxxxxxxx> wrote:
> Changed subject (was: Re: [PATCH 7/7] ARM: tegra30: cpuidle: add LP2
> driver for CPU0)
...
> On Fri, Oct 12, 2012 at 01:50:35PM -0700, Colin Cross wrote:
>> Current coupled cpuidle requires all cpus to wake up together and go
>> back to the idle governor to select a new state, because that's what
>> all available ARM cpus did when I wrote it.  You should be able to
>> implement coupled idle on a cpu that does not need to wake all the
>> cpus if you wake them anyways,
>
> Can you please elaborate it a little bit?  I do not quite understand
> what you mean.

In the current coupled cpuidle implementation, no cpu will return to
the scheduler until all cpus have returned from their idle function.
That means you need to force each cpu to wake up whenever one of them
wakes up.  This is a limitation of the current coupled cpuidle design:
all existing coupled cpuidle implementations have it as a hardware
requirement, but it is not a hardware requirement for your WAIT mode.
Until somebody fixes that, you have to fake wakeups to the other cpus.

> Basically, imx6q has a low-power mode named WAIT.  When all 4 cores
> are in WFI, imx6q will go into WAIT mode, and whenever there is a
> wakeup interrupt, it will exit WAIT mode.  Software can configure
> the hardware behavior during WAIT mode: clock gating or power gating
> for the ARM core.
>
> I'm trying to implement this low-power mode with coupled cpuidle.
> I initially have the following code as the coupled cpuidle enter
> function for the clock gating case.
>
> static int imx6q_enter_wait_coupled(struct cpuidle_device *dev,
>				     struct cpuidle_driver *drv, int index)
> {
>	int cpu = dev->cpu;
>
>	clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
>
>	if (atomic_inc_return(&master) == num_online_cpus())
>		imx6q_set_lpm(WAIT_UNCLOCKED);
>
>	cpu_do_idle();
>
>	if (atomic_dec_return(&master) == num_online_cpus() - 1)
>		imx6q_set_lpm(WAIT_CLOCKED);
>
>	clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);
>	return index;
> }
>
> However, I think there are a couple of problems.  The first one is
> the extra wakeups you have mentioned.  When the last cpu calls
> imx6q_set_lpm(), some of the first 3 cpus that are already in WFI may
> exit from there.  The second one is that when hardware exits WAIT
> mode and has the ARM clock resumed, some cpus may stay in WFI if the
> scheduler has nothing to wake them up for.  This breaks the
> requirement of coupled cpuidle that all cpus need to exit together.
> I can force the function to work around these problems by adding
> cpuidle_coupled_parallel_barrier() just above cpu_do_idle() and
> arch_send_wakeup_ipi_mask(cpu_online_mask) right after
> imx6q_set_lpm(WAIT_CLOCKED).  But I doubt it's appropriate.

That is the appropriate way of getting it working with today's coupled
cpuidle.

> So I rewrote the function as below to not use coupled cpuidle, and it
> works fine.
>
> static int imx6q_enter_wait(struct cpuidle_device *dev,
>			    struct cpuidle_driver *drv, int index)
> {
>	int cpu = dev->cpu;
>
>	clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
>
>	if (atomic_inc_return(&master) == num_online_cpus()) {
>		/*
>		 * With this lock, we prevent the other cpus from
>		 * exiting and entering this function again to become
>		 * the master.
>		 */
>		if (!spin_trylock(&master_lock))
>			goto idle;
>
>		imx6q_set_lpm(WAIT_UNCLOCKED);
>		cpu_do_idle();
>		imx6q_set_lpm(WAIT_CLOCKED);
>
>		spin_unlock(&master_lock);
>		goto out;
>	}
>
> idle:
>	cpu_do_idle();
> out:
>	atomic_dec(&master);
>	clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);
>	return index;
> }

Coupled cpuidle provides two features, and you only need one of them
for your clock gating mode.  The more complex one is sequencing, which
you don't need since your hardware can check whether all cpus are in
WFI for you.  The other feature is voting, to make sure all cpus can
accept the exit latency of the deeper idle state.  The
atomic_inc_return is handling voting in your example, but it can't
handle one cpu requesting power gating while another requests clock
gating.

> For the power gating case, it may better fit the coupled cpuidle
> usage model, since all the cores will have to run the resume routine
> and thus will not be in WFI when hardware exits from WAIT mode.  But
> we still need to work out the extra wakeup issue, which will also be
> seen in this case.

Yes, but using coupled cpuidle for both clock gating and power gating
will allow you to get into clock gating when one cpu wants clock
gating and the other wants power gating.

>> and then later you can optimize it to
>> avoid the extra wakeups when the extension to coupled cpuidle that I
>> discussed previously gets implemented.
>
> Do you have a pointer to the initial discussion for the extension, or
> do you have a draft patch for that already (hopefully :)?

I can't find the initial discussion, and I'm not going to have a
chance to work on it for at least a few weeks.  The basic idea is:

Add "preidle" and "postidle" function pointers to the coupled cpuidle
states (better names TBD).  When all cpus are waiting, and before
sending any pokes, call the preidle hook on the last cpu to start
waiting.  It must only be called on one cpu.
The preidle hook is responsible for disabling wakeups on secondary
cpus so they can't wake up, and returning a mask of the cpus it has
disabled.  The coupled core can increment the ready count by the
number of cpus that are disabled and skip poking them.

After coupled idle returns (or when aborting coupled idle), call the
postidle hook with the mask returned by the preidle hook.  It
re-enables wakeups on secondary cpus and returns a mask of cpus that
are not waking because they are already in a state that can handle
interrupts (WFI).  The calling cpu will then decrement the ready count
by the number of cpus that are not waking.

There are some tricky synchronization problems to solve:

Only one cpu can call the preidle and postidle hooks.  On some
hardware, they may not even be the same cpu.

A cpu that is left in the WFI state by the postidle hook is back in
the "waiting but not ready" state, and the current coupled cpuidle
code does not support going from waiting->ready->waiting.

Once the postidle hook has returned, some cpus may be able to
immediately handle interrupts and return to the scheduler.
--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html