On Wed, Feb 1, 2012 at 10:07 AM, Lorenzo Pieralisi
<lorenzo.pieralisi@xxxxxxx> wrote:
> On Wed, Feb 01, 2012 at 05:30:15PM +0000, Colin Cross wrote:
>> On Wed, Feb 1, 2012 at 6:59 AM, Lorenzo Pieralisi
>> <lorenzo.pieralisi@xxxxxxx> wrote:
>> > On Wed, Feb 01, 2012 at 12:13:26PM +0000, Vincent Guittot wrote:
>> >
>> > [...]
>> >
>> >> >> In your patch, you put in safe state (WFI on most platforms) the
>> >> >> cpus that become idle and these cpus are woken up each time a new cpu
>> >> >> of the cluster becomes idle. Then, the cluster state is chosen and the
>> >> >> cpus enter the selected C-state. On ux500, we are using another
>> >> >> behavior for synchronizing the cpus. The cpus are prepared to enter
>> >> >> the c-state that has been chosen by the governor and the last cpu,
>> >> >> that enters idle, chooses the final cluster state (according to cpus'
>> >> >> C-state). The main advantage of this solution is that you don't need
>> >> >> to wake other cpus to enter the C-state of a cluster. This can be
>> >> >> quite worthwhile when tasks mainly run on one cpu. Have you also thought
>> >> >> about such behavior when developing the coupled cpuidle driver ? It
>> >> >> could be interesting to add such behavior.
>> >> >
>> >> > Waking up the cpus that are in the safe state is not done just to
>> >> > choose the target state, it's done to allow the cpus to take
>> >> > themselves to the target low power state. On ux500, are you saying
>> >> > you take the cpus directly from the safe state to a lower power state
>> >> > without ever going back to the active state? I once implemented Tegra
>> >>
>> >> yes it is
>> >
>> > But if there is a single power rail for the entire cluster, when a CPU
>> > is "prepared" for shutdown this means that you have to save the context and
>> > clean L1, maybe for nothing since if other CPUs are up and running the
>> > CPU going idle can just enter a simple standby wfi (clock-gated but power on).
>> >
>> > With Colin's approach, context is saved and L1 cleaned only when it is
>> > almost certain the cluster is powered off (and so are the CPUs).
>> >
>> > It is a trade-off, I am not saying one approach is better than the
>> > other; we just have to make sure that preparing the CPU for "possible" shutdown
>> > is better than sending IPIs to take CPUs out of wfi and synchronize
>> > them (this happens if and only if CPUs enter coupled C-states).
>> >
>> > As usual this will depend on use cases (and silicon implementations :) )
>> >
>> > It is definitely worth benchmarking them.
>> >
>> I'm less worried about performance, and more worried about race
>> conditions. How do you deal with the following situation:
>> CPU0 goes to WFI, and saves its state
>> CPU1 goes idle, and selects a deep idle state that powers down CPU0
>> CPU1 saves its state, and is about to trigger the power down
>> CPU0 gets an interrupt, restores its state, and modifies state (maybe
>> takes a spinlock during boot)
>> CPU1 cuts the power to CPU0
>>
>> On OMAP4, the race is handled in hardware. When CPU1 tries to cut the
>> power to the blocks shared by CPU0 the hardware will ignore the
>> request if CPU0 is not in WFI. On Tegra2, there is no hardware
>> support and I had to handle it with a spinlock implemented in scratch
>> registers because CPU0 is out of coherency when it starts booting and
>> ldrex/strex don't work. I'm not convinced my implementation is
>> correct, and I'd be curious to see any other implementations.
>
> That's a problem you solved with coupled C-states (ie your example in
> the cover letter), where the primary waits for other CPUs to be reset
> before issuing the power down command, right ? At that point in time
> secondaries cannot wake up (?) and if wfi (ie power down) aborts you just
> take the secondaries out of reset and restart executing simultaneously,
> correct ? It mirrors the suspend behaviour, which is easier to deal with
> than completely random idle paths.

Yes, anything that supports hotplug and suspend should support coupled
cpuidle states fairly easily. The only thing required that is not
already used by hotplug/suspend is the ability to save and restore
context on cpu1, but most implementations end up doing that already.

> It is true that this should be managed by the PM HW; if HW is not
> capable of managing these situations things get nasty as you highlighted.

Yes - on some platforms, the HW is not designed to handle it. On
others, it is designed to, but due to HW bugs it cannot be used.

> And it is also true ldrex/strex on cacheable memory might not be available in
> those early warm-boot stages. I came up with a locking algorithm on
> strongly ordered memory to deal with that, but I am still not sure it is
> something we really really need.

I did the same, but with device memory.

> I will test coupled C-state code ASAP, and come back with feedback.
>
> Thanks,
> Lorenzo
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
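
For illustration only: the thread mentions locks kept in scratch
registers or in strongly ordered/device memory because ldrex/strex are
not usable before a CPU rejoins coherency. One well-known way to build a
two-CPU lock from plain loads and stores is Peterson's algorithm. The
sketch below is neither the Tegra2 code nor Lorenzo's algorithm. The
names scratch_lock, scratch_lock_init, scratch_trylock,
scratch_lock_acquire and scratch_unlock are hypothetical, and C11
sequentially consistent atomics stand in (as an assumption) for the
uncached accesses and barriers a real implementation would need.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical two-CPU lock in the style of Peterson's algorithm.
 * On real hardware these words would live in scratch registers or other
 * uncached memory; C11 atomics only model those accesses here. */
struct scratch_lock {
    atomic_int flag[2];   /* flag[n] != 0: CPU n wants or holds the lock */
    atomic_int turn;      /* CPU that goes first when both want the lock */
};

void scratch_lock_init(struct scratch_lock *l)
{
    atomic_init(&l->flag[0], 0);
    atomic_init(&l->flag[1], 0);
    atomic_init(&l->turn, 0);
}

/* Single attempt: returns true if 'cpu' (0 or 1) now holds the lock. */
bool scratch_trylock(struct scratch_lock *l, int cpu)
{
    int other = 1 - cpu;

    atomic_store(&l->flag[cpu], 1);   /* announce interest */
    atomic_store(&l->turn, other);    /* give the other CPU priority */
    if (atomic_load(&l->flag[other]) && atomic_load(&l->turn) == other) {
        atomic_store(&l->flag[cpu], 0);   /* contended: back off */
        return false;
    }
    return true;
}

/* Blocking acquire: the classic Peterson entry protocol. */
void scratch_lock_acquire(struct scratch_lock *l, int cpu)
{
    int other = 1 - cpu;

    atomic_store(&l->flag[cpu], 1);
    atomic_store(&l->turn, other);
    while (atomic_load(&l->flag[other]) && atomic_load(&l->turn) == other)
        ;   /* spin until the other CPU releases or yields priority */
}

void scratch_unlock(struct scratch_lock *l, int cpu)
{
    atomic_store(&l->flag[cpu], 0);
}
```

One possible use, again as an assumption rather than a description of
any existing driver: CPU1 would hold the lock while it checks CPU0's
state and triggers the power down, and CPU0 would take the same lock
early in its wake-up path before it modifies shared state.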