Re: Ad: [CPUFREQ] Cleanup governor struct, remove userspace specific set_speed functions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Monday 10 August 2009 20:50:47 Pallipadi, Venkatesh wrote:
> On Mon, 2009-08-10 at 08:48 -0700, Thomas Renninger wrote:
> > On Saturday 08 August 2009 15:55:12 Dominik Brodowski wrote:
> > > Hey,
> > > 
> > > how does this new patch square with 
> > > 
> > > commit 9e76988e9390a4ff4d171f690586d0c58186b47e
> > > Author: Venki Pallipadi <venkatesh.pallipadi@xxxxxxxxx>
> > > Date:   Fri Oct 26 10:18:21 2007 -0700
> > > 
> > > [CPUFREQ] Eliminate cpufreq_userspace scaling_setspeed deadlock
> > > 
> > Oops, I thought the scaling_setspeed always was in cpufreq.c and
> > my patch was intended as cleanup. I did not realize that it
> > explicitly
> > was moved there to workaround a deadlock and my patch is more or 
> > less a revert of that workaround.
> 
> Yes. Looking at the code, this deadlock should happen again with the
> new code.
> 
> > On the other hand side I wonder why this should not work (having
> > show/store_scaling_setspeed in cpufreq_userspace.c) as
> > ondemand/conservative do the same.
> > Hmm, but it seems to work with recent changes. I tried to run into
> > this with two test apps run in parallel (see at the end) and could
> > not run into a deadlock.
> > 
> 
> I can't think of any recent changes that will help with this particular
> deadlock. ondemand/conservative does not have this problem as they add
> their own sysfs directory and not a file under cpufreq directory and
> they do not go through cpufreq rwsem.
> 
> The problem with setspeed was, it goes through show/store in cpufreq.c
> which will try to hold cpufreq rwsem. And userspace stop flow also
> holds rwsem and cannot remove sysfs setspeed file on STOP path, as
> there is a blocking read/write of setspeed.
>
> IIRC, tests were similar to what you have below. We had two tasks and
> we typically hit the deadlock, with cpufreq debug messages turned on,
> in about 30 mins or so.
I gave them a try with recent cpufreq next tree and my "machine wide
governor" patch reverted as this one would make things run through
possible other paths.

I rather quickly run into:
WARNING: at fs/sysfs/dir.c:487 sysfs_add_one+0xcf/0xe6()
Hardware name: To Be Filled By O.E.M.
sysfs: cannot create duplicate filename '/devices/system/cpu/cpu0/cpufreq/scaling_setspeed'
Modules linked in: cpufreq_ondemand powernow_k8 freq_table thermal processor
Pid: 30045, comm: governor_change Tainted: G       AW  2.6.31-rc4 #16
Call Trace:
[<ffffffff811200e0>] ? sysfs_add_one+0xcf/0xe6
[<ffffffff810401c3>] warn_slowpath_common+0x77/0xa4
[<ffffffff8104023d>] warn_slowpath_fmt+0x3c/0x3e
[<ffffffff811200e0>] sysfs_add_one+0xcf/0xe6
[<ffffffff8111fa79>] sysfs_add_file_mode+0x57/0x8b
[<ffffffff8111fab9>] sysfs_add_file+0xc/0xe
[<ffffffff8111fb67>] sysfs_create_file+0x25/0x27
[<ffffffff8138346e>] cpufreq_governor_userspace+0x66/0x2fe
[<ffffffff813818f8>] __cpufreq_governor+0xa2/0xe0
[<ffffffff8138259d>] __cpufreq_set_policy+0x1a9/0x225
[<ffffffff813829ec>] store_scaling_governor+0x1f8/0x232

After reverting c294f894e4af611213257386e2f46f82c4554c1b
(I commented these two lines out):
	//				unlock_policy_rwsem_write(data->cpu);
	__cpufreq_governor(data, CPUFREQ_GOV_STOP);
	//				lock_policy_rwsem_write(data->cpu);

the test run successfully for quite some time.
That means commit c294f894e4af611213257386e2f46f82c4554c1b modifies
cpufreq core in a way that .governor(START) can be called twice without
.governor(STOP) being called in between on the same cpu -> bad.

Venki: Haven't you had patches to do this in a cleaner way so that
.governor(STOP) is also called with rwsem held?

It took really long, but I could run into a deadlock when removing
setspeed sysfs file which is probably what you meant. Backtrace of
the two tasks is attached. So the cleanup:
[CPUFREQ] Cleanup governor struct, remove userspace specific set_speed functions
git commit 5b2d1ede0acf0b8c3d26df969c06cd238b903562
has to be reverted again...


Thanks Dominik for catching this!

   Thomas

> # task 1
> while :
> do
> cat /sys/devices/system/cpu/cpu0/cpufreq/*
> done
> 
> # task 2
> while :
> do
> echo userspace > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> done
> 
> Thanks,
> Venki
> 
> 
> > I still like to do another test with it if before this gets
> > merged into a mainline kernel.
> > The global ondemand/conservative sysfs changes should be fine.
> > 
> > 
> > 
> > Test apps run in parallel without any deadlock occuring:
> > 
> > cat_cpufreq.sh
> > ==============
> > #!/bin/bash
> > 
> > for((x=0;x<100000;x++));do
> >         cat /sys/devices/system/cpu/cpu*/cpufreq/*
> >         cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_setspeed
> >         usleep 100
> > done
> > 
> > governor_change.sh
> > ==================
> > #!/bin/bash
> > 
> > set -x
> > for((x=0;x<100000;x++));do
> >     for y in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor;do
> >         echo userspace >$y
> >     done
> >     usleep 100
> >     for y in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor;do
> >         echo performance >$y
> >     done
> >     usleep 100
> > done
> 
> 


[77908.519972]   task                        PC stack   pid father
[77908.519972] governor_chan D 0000000000000002  5640  5889   5790 0x00000004
[77908.519972]  ffff8801f90a3978 0000000000000046 ffff8801f90a38d8 ffffffff81065f41
[77908.519972]  00000000001d2800 000000000000ca28 ffff8801fd99cdc0 ffffffff81788020
[77908.519972]  ffff8801fd99d150 000000008103a4f7 ffffffff8103a4b3 00000000001d2800
[77908.519972] Call Trace:
[77908.519972]  [<ffffffff81065f41>] ? trace_hardirqs_on+0xd/0xf
[77908.519972]  [<ffffffff8103a4b3>] ? finish_task_switch+0x3b/0xd6
[77908.519972]  [<ffffffff81524e32>] ? thread_return+0x3e/0xc3
[77908.519972]  [<ffffffff81033348>] ? cpuacct_charge+0x1c/0x96
[77908.519972]  [<ffffffff815251e5>] schedule_timeout+0x22/0xcc
[77908.519972]  [<ffffffff81524fff>] ? wait_for_common+0x2b/0x136
[77908.519972]  [<ffffffff81065f41>] ? trace_hardirqs_on+0xd/0xf
[77908.519972]  [<ffffffff81526954>] ? _spin_unlock_irq+0x2b/0x30
[77908.519972]  [<ffffffff815250a0>] wait_for_common+0xcc/0x136
[77908.519972]  [<ffffffff8103bad6>] ? default_wake_function+0x0/0xf
[77908.519972]  [<ffffffff811e4292>] ? __spin_lock_init+0x31/0x53
[77908.519972]  [<ffffffff81525194>] wait_for_completion+0x18/0x1a
[77908.519972]  [<ffffffff8112ec7a>] sysfs_addrm_finish+0x21c/0x27c
[77908.519972]  [<ffffffff8112e855>] ? sysfs_addrm_start+0x78/0xa5
[77908.519972]  [<ffffffff8112e855>] ? sysfs_addrm_start+0x78/0xa5
[77908.519972]  [<ffffffff8112d152>] sysfs_hash_and_remove+0x4e/0x65
[77908.519972]  [<ffffffff8112de88>] sysfs_remove_file+0x10/0x12
[77908.519972]  [<ffffffff8139a503>] cpufreq_governor_userspace+0x17b/0x30c
[77908.519972]  [<ffffffff8139882f>] __cpufreq_governor+0xa2/0xe0
[77908.519972]  [<ffffffff81398b9a>] __cpufreq_set_policy+0x17f/0x211
[77908.519972]  [<ffffffff81399207>] store_scaling_governor+0x1fc/0x23c
[77908.519972]  [<ffffffff81399b96>] ? handle_update+0x0/0x33
[77908.519972]  [<ffffffff81399c8f>] store+0x62/0x87
[77908.519972]  [<ffffffff8112da9d>] sysfs_write_file+0xe4/0x119
[77908.519972]  [<ffffffff810daa7e>] vfs_write+0xab/0x105
[77908.519972]  [<ffffffff810dab9c>] sys_write+0x47/0x6f
[77908.519972]  [<ffffffff8100baeb>] system_call_fastpath+0x16/0x1b
[77908.519972] cat           D ffff8801f6de7f48  5896 23761   5891 0x00000004
[77908.519972]  ffff8801f6de7dd8 0000000000000046 0000000000000000 0000000000000002
[77908.519972]  00000000001d2800 000000000000ca28 ffff8801f91a4dc0 ffff8801fd99cdc0
[77908.519972]  ffff8801f91a5150 00000000815264a5 ffffc900001cfb90 0000000000000046
[77908.519972] Call Trace:
[77908.519972]  [<ffffffff81526520>] __down_read+0x97/0xb3
[77908.519972]  [<ffffffff81525dde>] down_read+0x59/0x6d
[77908.519972]  [<ffffffff8139a1e2>] ? lock_policy_rwsem_read+0x48/0x78
[77908.519972]  [<ffffffff8139a1e2>] lock_policy_rwsem_read+0x48/0x78
[77908.519972]  [<ffffffff8139a29d>] show+0x3a/0x79
[77908.519972]  [<ffffffff8112db8c>] sysfs_read_file+0xba/0x13c
[77908.519972]  [<ffffffff810dac6c>] vfs_read+0xa8/0x102
[77908.519972]  [<ffffffff810dad8a>] sys_read+0x47/0x70
[77908.519972]  [<ffffffff8100baeb>] system_call_fastpath+0x16/0x1b

[Index of Archives]     [Linux Kernel Devel]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Forum]     [Linux SCSI]

  Powered by Linux