On Wed, Apr 13, 2011 at 17:53, Rafael J. Wysocki wrote:
> On Wednesday, April 13, 2011, Mike Frysinger wrote:
>> On Wed, Apr 13, 2011 at 17:05, Pavel Machek wrote:
>> > On Wed 2011-04-13 17:02:45, Mike Frysinger wrote:
>> >> On Wed, Apr 13, 2011 at 16:58, Rafael J. Wysocki wrote:
>> >> > On Wednesday, April 13, 2011, Mike Frysinger wrote:
>> >> >> when we suspend/resume Blackfin SMP systems, we notice that the
>> >> >> freezer code runs on multiple cores.  this is of course what you
>> >> >> want -- freeze processes in parallel.  however, the code only uses
>> >> >> non-smp based barriers which causes us problems ... our cores need
>> >> >> software support to keep caches in sync, so our smp barriers do
>> >> >> just that.  but the non-smp barriers do not, and so the
>> >> >> frozen/thawed processes randomly get stuck in the wrong task state.
>> >> >>
>> >> >> thinking about it, shouldnt the freezer code be using smp barriers ?
>> >> >
>> >> > Yes, it should, but rmb() and wmb() are supposed to be SMP barriers.
>> >> >
>> >> > Or do you mean something different?
>> >>
>> >> then what's the diff between smp_rmb() and rmb() ?
>> >>
>> >> this is what i'm proposing:
>> >> --- a/kernel/freezer.c
>> >> +++ b/kernel/freezer.c
>> >> @@ -17,7 +17,7 @@ static inline void frozen_process(void)
>> >>  {
>> >>  	if (!unlikely(current->flags & PF_NOFREEZE)) {
>> >>  		current->flags |= PF_FROZEN;
>> >> -		wmb();
>> >> +		smp_wmb();
>> >>  	}
>> >>  	clear_freeze_flag(current);
>> >>  }
>> >> @@ -93,7 +93,7 @@ bool freeze_task(struct task_struct *p, bool sig_only)
>> >>  	 * the task as frozen and next clears its TIF_FREEZE.
>> >>  	 */
>> >>  	if (!freezing(p)) {
>> >> -		rmb();
>> >> +		smp_rmb();
>> >>  		if (frozen(p))
>> >>  			return false;
>> >
>> > smp_rmb() is NOP on uniprocessor.
>> >
>> > I believe the code is correct as is.
>>
>> that isnt what the code / documentation says.  unless i'm reading them
>> wrong, both seem to indicate that the proposed patch is what we
>> actually want.
>
> Not really.
>
>> include/linux/compiler-gcc.h:
>> #define barrier() __asm__ __volatile__("": : :"memory")
>>
>> include/asm-generic/system.h:
>> #define mb()	asm volatile ("": : :"memory")
>> #define rmb()	mb()
>> #define wmb()	asm volatile ("": : :"memory")
>>
>> #ifdef CONFIG_SMP
>> #define smp_mb()	mb()
>> #define smp_rmb()	rmb()
>> #define smp_wmb()	wmb()
>> #else
>> #define smp_mb()	barrier()
>> #define smp_rmb()	barrier()
>> #define smp_wmb()	barrier()
>> #endif
>
> The above means that smp_*mb() are defined as *mb() if CONFIG_SMP is set,
> which basically means that *mb() are more restrictive than the
> corresponding smp_*mb().  More precisely, they also cover the cases in
> which the CPU reorders instructions on uniprocessor, which we definitely
> want to cover.
>
> IOW, your patch would break things on uniprocessor where the CPU reorders
> instructions.
>
>> Documentation/memory-barriers.txt:
>> SMP memory barriers are reduced to compiler barriers on uniprocessor
>> compiled systems because it is assumed that a CPU will appear to be
>> self-consistent, and will order overlapping accesses correctly with
>> respect to itself.
>
> Exactly, which is not guaranteed in general (e.g. on Alpha).  That is,
> some CPUs can reorder instructions in such a way that a compiler barrier
> is not sufficient to prevent breakage.
>
> The code _may_ be wrong for a different reason, though.  I need to check.
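to make the ordering under discussion concrete, here's a minimal sketch of
the two sides the freezer sets up (modeled on the kernel/freezer.c hunks
quoted above; the (1)/(2) numbering comments are mine, not from the source):

	/* writer -- the task freezing itself in frozen_process(): */
	current->flags |= PF_FROZEN;	/* (1) publish "i am frozen" */
	wmb();				/* order (1) before (2) */
	clear_freeze_flag(current);	/* (2) clear TIF_FREEZE */

	/* reader -- freeze_task() running on another CPU: */
	if (!freezing(p)) {		/* observed (2): TIF_FREEZE clear */
		rmb();			/* order the two flag reads */
		if (frozen(p))		/* must now observe (1) as well */
			return false;	/* already frozen, nothing to do */
	}

if the reader can observe (2) without also observing (1), it concludes the
task still needs freezing even though it has already frozen itself -- which
is exactly the stuck-task symptom we see.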
so the current code is protecting against a UP system swapping freezer
threads in and out for processes, and the barriers are there to make sure
the updated flags value is posted by the time another swapped-in thread
gets to that point.

i guess the trouble for us is that you have one CPU posting writes to
task->flags (and doing so while holding the task's spinlock), but the other
CPU is simply reading those flags.  there are no SMP barriers between the
write side and the read side, nor is the reading CPU taking any lock that
would act as an implicit SMP barrier.  since the Blackfin SMP port lacks
hardware cache coherency, there is no signal telling us "we've got to sync
the caches before we can do this read".  the patch i posted above gives us
exactly that signal, and with it things work correctly.
-mike
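ps: for reference, this is roughly what the barriers look like on our end
(paraphrasing arch/blackfin/include/asm/system.h from memory -- check the
tree for the exact definitions):

	/* plain barriers: the core itself executes in order, so these
	 * are only compiler barriers -- they never touch the caches. */
	#define mb()	barrier()
	#define rmb()	barrier()
	#define wmb()	barrier()

	#ifdef CONFIG_SMP
	/* smp barriers: the cores have no hardware cache coherency, so
	 * these call into the software coherency helpers. */
	# define smp_rmb()	do { barrier(); smp_check_barrier(); } while (0)
	# define smp_wmb()	do { barrier(); smp_mark_barrier(); } while (0)
	# define smp_mb()	do { barrier(); smp_check_barrier(); smp_mark_barrier(); } while (0)
	#endif

so a plain wmb()/rmb() pair never synchronizes the caches, which is why the
reading core keeps seeing a stale task->flags.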