Hi guys,

This is about a bug reported by Duncan. His cpu/cpu*/cpufreq
directories get corrupted with 3.9-rc3, while they worked fine with 3.8:

https://bugzilla.kernel.org/show_bug.cgi?id=55411

On his AMD Bulldozer tri-cluster/6-core system, the affected and related
cpus are not set correctly after offlining cpus 1-5 and bringing them
back with:

for i in 1 2 3 4 5; do echo 0 > /sys/devices/system/cpu/cpu$i/online ; done
for i in 1 2 3 4 5; do echo 1 > /sys/devices/system/cpu/cpu$i/online ; done

Before running the two loops above, cpufreq-info gave:
https://bugzilla.kernel.org/attachment.cgi?id=95701

And after running them, it gave:
https://bugzilla.kernel.org/attachment.cgi?id=95711

Clearly it got corrupted: somehow cpu 3 showed up in the related cpus
field of cpu 5.

I suspect the following patches are behind this:

commit fcf8058296edbc3de43adf095824fc32b067b9f8
Author: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
Date:   Tue Jan 29 14:39:08 2013 +0000

    cpufreq: Simplify cpufreq_add_dev()

    Currently cpufreq_add_dev() firsts allocates policy, calls
    driver->init() and then checks if this CPU is already managed or not.
    And if it is already managed, its policy is freed.

    We can save all this if we somehow know that CPU is managed or not in
    advance. policy->related_cpus contains the list of all valid sibling
    CPUs of policy->cpu. We can check this to see if the current CPU is
    already managed.

    From now on, platforms don't really need to set related_cpus from
    their init() routines, as the same work is done by core too.

    If a platform driver needs to set the related_cpus mask with some
    additional CPUs, other than CPUs present in policy->cpus, they are
    free to do it, though, as we don't override anything.

    [rjw: Changelog]
    Signed-off-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
    Tested-by: Shawn Guo <shawn.guo@xxxxxxxxxx>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>

and

commit 643ae6e81dd65b333a13259852405fc9f764ac76
Author: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
Date:   Sat Jan 12 05:14:38 2013 +0000

    cpufreq: Manage only online cpus

    cpufreq core doesn't manage offline cpus and if driver->init() has
    returned mask including offline cpus, it may result in unwanted
    behavior by cpufreq core or governors.

    We need to get only online cpus in this mask. There are two places to
    fix this mask, cpufreq core and cpufreq driver. It makes sense to do
    this at common place and hence is done in core.

    Signed-off-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>

And this is the latest piece of documentation available:

    SMP systems normally have same clock source for a group of cpus. For
    these the .init() would be called only once for the first online cpu.
    Here the .init() routine must initialize policy->cpus with mask of
    all possible cpus (Online + Offline) that share the clock. Then the
    core would copy this mask onto policy->related_cpus and will reset
    policy->cpus to carry only online cpus.

I looked at the acpi-cpufreq driver's driver->init() code and found that
it is not yet aligned with this theory, which is probably what causes
these failures.

I don't know this driver well enough, or how it is used across all x86
systems, so I would like somebody with prior experience with it to check
how policy->cpus and policy->related_cpus must be set from
driver->init().
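For anyone who wants to compare the masks before and after a hotplug
cycle without eyeballing cpufreq-info output, something like the
following might help. It is only a sketch: the function names and the
SYSFS_CPU override are made up here for illustration (they are not part
of any existing tool), and it assumes the affected_cpus and related_cpus
files are present under each cpuN/cpufreq directory.

```shell
# Base directory of the per-CPU sysfs nodes. The override is hypothetical
# and exists only so the helpers can be tried against a fake tree.
SYSFS_CPU="${SYSFS_CPU:-/sys/devices/system/cpu}"

# Print one line per cpuN directory:
#   "cpuN affected=<list> related=<list>"  when cpufreq files are readable
#   "cpuN no-cpufreq"                      otherwise (e.g. cpu is offline)
snapshot_cpufreq_masks() {
    local dir affected related
    for dir in "$SYSFS_CPU"/cpu[0-9]*; do
        [ -d "$dir" ] || continue
        if [ -r "$dir/cpufreq/affected_cpus" ] &&
           [ -r "$dir/cpufreq/related_cpus" ]; then
            # read trims surrounding whitespace from the single-line file
            read -r affected < "$dir/cpufreq/affected_cpus"
            read -r related < "$dir/cpufreq/related_cpus"
            printf '%s affected=%s related=%s\n' \
                "${dir##*/}" "$affected" "$related"
        else
            printf '%s no-cpufreq\n' "${dir##*/}"
        fi
    done
}

# Offline and then online the cpus given as arguments (needs root on a
# real system). ">|" is used so the write also works with noclobber set.
cycle_cpus() {
    local i
    for i in "$@"; do echo 0 >| "$SYSFS_CPU/cpu$i/online"; done
    for i in "$@"; do echo 1 >| "$SYSFS_CPU/cpu$i/online"; done
}

# Possible usage on the affected machine:
#   snapshot_cpufreq_masks > before.txt
#   cycle_cpus 1 2 3 4 5
#   snapshot_cpufreq_masks > after.txt
#   diff before.txt after.txt
```

If the core handles the masks as the documentation above describes, the
diff should come out empty; any difference points at the kind of
corruption Duncan reported.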
--
viresh

---------- Forwarded message ----------
From: <bugzilla-daemon@xxxxxxxxxxxxxxxxxxx>
Date: 19 March 2013 13:19
Subject: [Bug 55411] sysfs per-cpu cpufreq subdirs/symlinks screwed up after s2ram
To: viresh.kumar@xxxxxxxxxx

https://bugzilla.kernel.org/show_bug.cgi?id=55411

--- Comment #9 from Duncan <1i5t5.duncan@xxxxxxx>  2013-03-19 07:49:53 ---
(In reply to comment #8)
> (In reply to comment #0)
>> After a s2ram/resume cycle (now bad):
>>
>> /sys/devices/system/cpu/cpu0/cpufreq/
>> /sys/devices/system/cpu/cpu1/cpufreq -> ../cpu0/cpufreq/
>> /sys/devices/system/cpu/cpu3/cpufreq/
>> /sys/devices/system/cpu/cpu5/cpufreq/
>
> Can you try this rather than s2r:
>
> for i in 1 2 3 4 5; do echo 0 > /sys/devices/system/cpu/cpu$i/online ; done
> for i in 1 2 3 4 5; do echo 1 > /sys/devices/system/cpu/cpu$i/online ; done
>
> and check the status if things are still corrupted for you?
> Above doesn't corrupt anything for me Atleast.

That's a nice easy test; no rebuild and reboot needed. =:^) Tho I had to
change the > to >| as I have bash noclobber set and the files obviously
already exist...

Uncorrupted before the test, corrupted after. So just cycling the cpus
off and then back online *DOES* corrupt cpufreq, thus a much simpler
reproducer! =:^) Exact same ls results as the above.

> And my system doesn't have S2R support for now.

My old system didn't support s2ram reliably; it would work occasionally
but mostly it didn't. But s2disk was workable for awhile, until the fact
that I was running mdraid and the disks didn't always return in the same
sdX slots due to hardware wakeup issues complicated things, so eventually
I didn't use that much either. The new system's great with s2ram, sans
this bug of course; s2disk didn't work at all at first, but last time I
tried it /almost/ worked so there has been improvement.
But I don't like to take unnecessary chances with filesystem log replay
and thankfully wall power's good enough around here that I can s2ram for
a day and come back and wiggle the mouse and all's fine (with a couple
pre-suspend syncs thrown into my script just in case), so I tend to use
it a LOT, even more than I'd use s2disk due to the speed. =:^)

But I'd love to have s2both working reliably; for all I know it's
actually working now; it was pretty close. But I prefer not to test the
reiserfs log replay (even with pre-suspend syncs I worry, tho as I said
reiserfs has actually been very good to me even thru faulty ram, a power
supply blowing up on me, a mobo dying, etc, since 2.6.16 or whenever it
was that it got ordered journaling by default) when it doesn't work, so
knowing s2disk didn't work well when I tested it and with s2ram working
SO well, I don't tend to test s2disk/s2both too often.

Meanwhile, thanks for the cpuinfo_cur_freq explanation. If that actually
real-time touches the hardware to get the data as you say, that does
explain the root privs. Maybe that bit of extra info could be added to
the documentation? I could propose some new wording and open a new bug
on cpu-freq/user-guide.txt for it if appropriate.