Re: [PATCH] cpufreq: Fix NULL reference crash while accessing policy->governor_data

On 27/01/16 08:40, Viresh Kumar wrote:
> On 26-01-16, 09:57, Juri Lelli wrote:
> > This patch fixes the crash I was seeing.
> > 
> > Tested-by: Juri Lelli <juri.lelli@xxxxxxx>
> 
> Thanks.
> 
> > However, it exposes another problem (running the concurrent lockdep test
> 
> It exposes it? How can this patch expose the crash below? AFAIR, you
> reported that you were getting the crash below on plain mainline on TC2,
> i.e. for drivers with the governor-per-policy flag set.
> 

Oh, simply because, without the NULL ref fix, I couldn't actually run
the test. Sorry if I was not clear.

> The reason is obvious: the governor's sysfs directory is present in
> cpus/cpuX/cpufreq/ instead of cpus/cpufreq/, which used to be the case
> without the flag. This forces the show()/store() present in
> cpufreq.c to be called, and those also take policy->rwsem.
> 
> > that you merged in your tests). After the test is finished there is
> > always at least one task spinning. Do you think it might be related to
> > the race we are already discussing in the thread about my cleanup
> > patches? This is what I see:
> 
> So this is what you reported earlier, right?
> 

Yep, same thing.

> > [   38.843648] other info that might help us debug this:
> > [   38.843648]
> > [   38.867627] Chain exists of:
> >   s_active#41 --> &policy->rwsem --> od_dbs_cdata.mutex
> > 
> > [   38.891693]  Possible unsafe locking scenario:
> > [   38.891693]
> 
> Let me elaborate on it a bit here:
> - CPU0 is calling governor's EXIT()
> - CPU1 is reading a governor file from sysfs
> 
> > [   38.909419]        CPU0                    CPU1
> > [   38.922978]        ----                    ----
> 
> The following needs to be added here:
> 
>                    EXIT-governor                read/write governor file
> 
>                                                 lock(s_active#41);
> 
> > [   38.936535]   lock(od_dbs_cdata.mutex);
> > [   38.948146]                                lock(&policy->rwsem);
> > [   38.966168]                                lock(od_dbs_cdata.mutex);
> > [   38.985219]   lock(s_active#41);
> > [   38.994923]
> > [   38.994923]  *** DEADLOCK ***
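
To make the inversion above easier to see, here is a minimal userspace
sketch (plain Python, not kernel code and not how lockdep is implemented).
It records the order in which each path takes its locks and looks for a
cycle in the resulting ordering graph. The lock names are copied from the
splat; the helper names (build_order_graph, find_cycle, exit_path,
sysfs_path) are made up for illustration, and the two sequences are only my
reading of the CPU0/CPU1 columns with the s_active#41 acquisition added on
the sysfs side.

```python
from collections import defaultdict


def build_order_graph(acquisition_sequences):
    """Record an edge A -> B whenever B is acquired while A is still held."""
    graph = defaultdict(set)
    for sequence in acquisition_sequences:
        for i, held in enumerate(sequence):
            for later in sequence[i + 1:]:
                graph[held].add(later)
    return graph


def find_cycle(graph):
    """Return one lock-order cycle as a list of lock names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = defaultdict(int)
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            if colour[nxt] == GREY:
                # Back edge to a lock already on the path: ordering cycle.
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in sorted(graph):
        if colour[start] == WHITE:
            found = visit(start)
            if found:
                return found
    return None


# Assumed acquisition orders, read off the scenario in the splat.
exit_path = ["od_dbs_cdata.mutex", "s_active#41"]                      # CPU0
sysfs_path = ["s_active#41", "&policy->rwsem", "od_dbs_cdata.mutex"]   # CPU1

cycle = find_cycle(build_order_graph([exit_path, sysfs_path]))
print(" --> ".join(cycle) if cycle else "no ordering cycle")
```

With only sysfs_path recorded the graph has no cycle; it is the EXIT path
taking s_active#41 while holding od_dbs_cdata.mutex that closes the loop.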
> 
> > Now, you already pointed me at a possible fix. I'm going to test that
> > (even if I have questions about that patch :)) and see if it makes this
> > go away. 
> 
> @Rafael: Juri is talking about this patch:
> 
> http://www.linux-arm.org/git?p=linux-jl.git;a=commit;h=d3eb02ed23732de2c8671377316a190c38b8fe93
> 

Right. Thanks for pointing Rafael to it.

> Juri, I thought it would fix it (when I wrote it), but it never
> did on x86 (while I dropped the rwsem-drop code around EXIT as well).
> 
> I never came back to it, and so I never sent it upstream.
> 

The kbuild robot hasn't reported anything bad yet. I'll run some more
tests on my x86 box anyway.

Best,

- Juri