Re: [PATCH] cpufreq: Set cpufreq_cpu_data to NULL before putting kobject

On 2015/1/30 10:13, Viresh Kumar wrote:
On 30 January 2015 at 07:40, ethan zhao <ethan.zhao@xxxxxxxxxx> wrote:
For a race between a PPC notification and the xen-bus thread, could you tell me
how to reproduce it by triggering the PPC notification and xen-bus events
manually? Do you really want me to write code into a test kernel to flood PPC
and xen-bus events at the same time? If we can analyze the code and understand
the issue clearly, we shouldn't wait for users to yell.
I thought you already had a test where you are hitting the issue you originally
reported. At least Santosh did confirm that he hits it 3 out of 5 times in his
kernel during boot.
As far as I know, a PPC notification only happens when power capping is needed, perhaps because the server is overheating. Once the cooling conditions recover, you couldn't reproduce it either!

My reasoning for why your observation doesn't fit here:

Copying from your earlier mail..

  Thread A: Workqueue: kacpi_notify

  acpi_processor_notify()
    acpi_processor_ppc_has_changed()
          cpufreq_update_policy()
            cpufreq_cpu_get()
              kobject_get()

This tries to increment the count, and the warning you mentioned
happens because of:

WARN_ON_ONCE(atomic_inc_return(&kref->refcount) < 2);

i.e. even after incrementing the count, it is < 2, which I believe means
it is 1. That means we have tried to do kobject_get() on a kobject
for which kobject_put() has already been done.
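
To make that arithmetic concrete, here is a minimal userspace sketch. It is only a model: struct model_kref and the model_kref_*() and demo_*() helpers are hypothetical names, not the kernel's kref API, and the only thing taken from this thread is the "< 2 after increment" check quoted above. It assumes the count starts at 1 when the object is created.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for a refcounted object; not the kernel type. */
struct model_kref {
    atomic_int refcount;
};

/* Assumption: a freshly created object holds one reference. */
static void model_kref_init(struct model_kref *k)
{
    atomic_init(&k->refcount, 1);
}

/*
 * Take a reference. Returns 1 if the check from the quoted WARN_ON_ONCE
 * would fire, i.e. the count is still below 2 after the increment.
 */
static int model_kref_get(struct model_kref *k)
{
    int new_count = atomic_fetch_add(&k->refcount, 1) + 1;
    return new_count < 2;
}

/* Drop a reference. Returns 1 if the count reached zero. */
static int model_kref_put(struct model_kref *k)
{
    int new_count = atomic_fetch_sub(&k->refcount, 1) - 1;
    return new_count == 0;
}

/* get on a live object: 1 -> 2, so the check stays quiet. */
static int demo_get_on_live_object(void)
{
    struct model_kref k;
    model_kref_init(&k);
    return model_kref_get(&k);
}

/* get after the last put: 1 -> 0 -> 1, so the check fires. */
static int demo_get_after_last_put(void)
{
    struct model_kref k;
    model_kref_init(&k);
    model_kref_put(&k);
    return model_kref_get(&k);
}
```

In this model the memory stays valid after the count hits zero, which is what lets the second demo run safely. In a real kernel the object may already have been released at that point, so the warning is a symptom of a use-after-put rather than the whole problem.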

  Thread B: xenbus_thread()

  xenbus_thread()
    msg->u.watch.handle->callback()
      handle_vcpu_hotplug_event()
        vcpu_hotplug()
          cpu_down()
            __cpu_notify(CPU_DOWN_PREPARE..)
              cpufreq_cpu_callback()
                __cpufreq_remove_dev_prepare()
                  update_policy_cpu()
                    kobject_move()


Okay, where is the race or kobject_put() here? We are just moving
the kobject, and that has nothing to do with the refcount of the kobject.

Why do you see it as a race?
 I mean that policy->cpu has been changed and that CPU is about to go down,
 yet Thread A continues to get and update the policy for it blindly. That is
 what I call a 'race', not the refcount itself.

 Thanks,
 Ethan




