Re: [PATCH v9 2/4] PM: Introduce devfreq: generic DVFS framework with device-specific OPPs

MyungJoo Ham <myungjoo.ham@xxxxxxxxxxx> · Thu, 1 Sep 2011 13:51:17 +0900

On Thu, Sep 1, 2011 at 5:05 AM, Turquette, Mike <mturquette@xxxxxx> wrote:
> On Wed, Aug 31, 2011 at 12:29 AM, MyungJoo Ham <myungjoo.ham@xxxxxxxxxxx> wrote:
> [snip]
>> +/**
>> + * get_devfreq() - find devfreq struct. a wrapped find_device_devfreq()
>> + *             with mutex protection. exported for governors
>> + * @dev:       device pointer used to lookup device devfreq.
>> + */
>> +struct devfreq *get_devfreq(struct device *dev)
>> +{
>> +       struct devfreq *ret;
>> +
>> +       mutex_lock(&devfreq_list_lock);
>> +       ret = find_device_devfreq(dev);
>> +       mutex_unlock(&devfreq_list_lock);
>
> You prevent changes to the devfreq list while searching (good) but
> after returning the pointer there is no protection from that item
> being removed from the list.  Generally "get" and "put" functions do
> more than just return a pointer: get functions often increment a
> refcount, or hold a lock.  The put function will decrement the
> refcount or release the lock.  Maybe you want something like the
> following:
>
> mutex_lock(&devfreq_list_lock);
> ret = find_device_devfreq(dev);
> mutex_lock(&devfreq->lock);
> mutex_unlock(&devfreq_list_lock);
>
> Then you need a corresponding put which does the mutex_unlock(&devfreq->lock).
>
> It looks like the only consumers of get_devfreq are the sysfs
> show/store interfaces, which immediately hold devfreq->lock, so the
> above proposal certainly makes fits the existing use case.
>
> Also CPUfreq's "cpufreq_get" function does a nice job of protecting
> the object from getting freed with a rw_semaphore.  It is a bit more
> "complicated" but makes for good reading.
>

Thank you. The possibility that someone may do something on struct
devfreq after mutex_unlock(&devfreq_list_lock) and before
mutex_lock(&devfreq->lock) bothered me and it appears that I need to
add devfreq_put() anyway. I hesitated it because users might forget
using devfreq_put() after calling devfreq_get(); however, it is just
same as forgetting mutex_unlock after mutex_lock. So I wouldn't mind
that much.

The next devfreq_get() will do
> mutex_lock(&devfreq_list_lock);
> ret = find_device_devfreq(dev);
> mutex_lock(&devfreq->lock);
> mutex_unlock(&devfreq_list_lock);

and devfreq_put() will do
> mutex_unlock(&devfreq->lock);

as you've suggested.

>> +static int devfreq_do(struct devfreq *devfreq)
>> +{
>> +       struct opp *opp;
>> +       unsigned long freq;
>> +       int err;
>> +
>
> Put the mutex_is_locked check here?  See more below.
>
[]
>> +static int devfreq_update(struct notifier_block *nb, unsigned long type,
>> +                         void *devp)
>> +{
>> +       struct devfreq *devfreq = container_of(nb, struct devfreq, nb);
>> +       int ret;
>> +
>> +       mutex_lock(&devfreq->lock);
>> +       ret = update_devfreq(devfreq);
>> +       mutex_unlock(&devfreq->lock);
>
> The whole devfreq_update/update_devfreq pairing is redundant.
> update_devfreq's purpose is to make sure the lock is held before going
> further, and the only caller of update_devfreq is devfreq_update which
> always holds the lock.
>
> This still doesn't stop a bad driver writer from just calling
> devfreq_do with an extern.  Perhaps the lock detection should be moved
> into devfreq_do and update_devfreq should go away?
>

- update_devfreq: extern for governors
- devfreq_update: notifier callback for OPP
- devfreq_do: internal function of devfreq.

Anyway, as you've mentioned, it seems I'd better rename devfreq_do as
update_devfreq and make it exported with mutex check.

[]
>> +static void devfreq_monitor(struct work_struct *work)
>> +{
>> +       static unsigned long last_polled_at;
>> +       struct devfreq *devfreq, *tmp;
>> +       int error;
>> +       unsigned long jiffies_passed;
>> +       unsigned long next_jiffies = ULONG_MAX, now = jiffies;
>> +
>> +       /* Initially last_polled_at = 0, polling every device at bootup */
>> +       jiffies_passed = now - last_polled_at;
>> +       last_polled_at = now;
>> +       if (jiffies_passed == 0)
>> +               jiffies_passed = 1;
>> +
>> +       mutex_lock(&devfreq_list_lock);
>
> Should not lock the list here.  If we lock the list for all major
> operations, it nullifies the performance benefit of having a mutex in
> struct devfreq.
>

Ok... then.. how about locking like this? :

mutex_lock(&devfreq_list_lock);
       list_for_each_entry_safe(devfreq, tmp, &devfreq_list, node) {
               mutex_lock(&devfreq->lock);
               mutex_unlock(&devfreq_list_lock);

               blahblah

               mutex_unlock(&devfreq->lock);
               mutex_lock(&devfreq_list_lock);
      }
mutex_unlock(&devfreq_list_lock);

Anyway, there is one more problem with allowing
unlocked-devfreq_list_lock in the loop.

list_for_each_entry_safe(devfreq, tmp, &devfreq_list, node) is safe
for the removal of devfreq from the list in the loop.
However, it is not safe against the removal of devfreq's next member
in the loop and while devfreq_list_lock is unlocked,
devfreq_remove_device  may remove that one; thus, breaking the
list_for_each_entry_safe loop.

Such break is prevent by adding one more mutex_lock/mutex_unlock to
the loop for tmp->lock. However, if we do mutex_unlock(&tmp->lock)
before mutex_lock(&devfreq_list_lock), we still have the same
breaking-the-loop issue and if we do it after
mutex_lock(&devfreq_list_lock), we have a deadlock issue (someone
might have locked devfreq_list_lock and waiting to lock tmp->lock).

Thus, we will need to block devfreq_remove_device at devfreq_monitor
whlie unlocking devfreq_list in the loop. Other operations (add /
list) on the list are fine for it.

So, the loop will be:

mutex_lock(&devfreq_list_lock);

       prohibit_devfreq_remove = true;

       list_for_each_entry_safe(devfreq, tmp, &devfreq_list, node) {
               mutex_lock(&devfreq->lock);
               mutex_unlock(&devfreq_list_lock);

               blahblah

               mutex_unlock(&devfreq->lock);
               mutex_lock(&devfreq_list_lock);
      }

       prohibit_devfreq_remove = false;

mutex_unlock(&devfreq_list_lock);

[]
>> +                       /* Remove a devfreq with an error. */
>> +                       if (error && error != -EAGAIN) {
>> +                               dev_err(devfreq->dev, "Due to devfreq_do error(%d), devfreq(%s) is removed from the device\n",
>> +                                       error, devfreq->governor->name);
>> +
>> +                               list_del(&devfreq->node);
>> +                               mutex_unlock(&devfreq->lock);
>> +                               kfree(devfreq);
>> +                               continue;
>
> Should this error handling also unregister the OPP notifier?  This
> code duplicates portions of devfreq_remove_device.  I propose instead
> here we do the following:
>
> mutex_unlock(&devfreq->lock);
> _devfreq_remove_lock(devfreq);
> /* this locks the list first,
>  * then locks devfreq->lock,
>  * then does the house cleaning
>  */
> continue;

Ah.. that was missing. Thanks!

I'll make a _devfreq_remove_device(struct device *dev, bool
call_by_monitor) and let devfreq_remove_device use it.

>
>> +                       }
>> +                       devfreq->next_polling = devfreq->polling_jiffies;
>> +
>> +                       /* No more polling required (polling_ms changed) */
>> +                       if (devfreq->next_polling == 0) {
>> +                               mutex_unlock(&devfreq->lock);
>> +                               continue;
>> +                       }
>> +               } else {
>> +                       devfreq->next_polling -= jiffies_passed;
>> +               }
>> +
>> +               next_jiffies = (next_jiffies > devfreq->next_polling) ?
>> +                               devfreq->next_polling : next_jiffies;
>> +
>> +               mutex_unlock(&devfreq->lock);
>> +       }
>> +
>> +       if (next_jiffies > 0 && next_jiffies < ULONG_MAX) {
>> +               polling = true;
>> +               queue_delayed_work(devfreq_wq, &devfreq_work, next_jiffies);
>> +       } else {
>> +               polling = false;
>> +       }
>> +
>> +       mutex_unlock(&devfreq_list_lock);
>
> Again, list should not be locked over this whole function.  It blocks
> other unrelated devfreq devices from scaling.
>
>> +}
>
> [snip]
>
>> +/**
>> + * devfreq_remove_device() - Remove devfreq feature from a device.
>> + * @device:    the device to remove devfreq feature.
>> + */
>> +int devfreq_remove_device(struct device *dev)
>
> Why does this take a struct device*?  Shouldn't it be a struct devfreq*?

It is because devfreq_add_device() does not return struct devfreq.
struct devfreq is not visible to device drivers. It is visible to
governors.

>
> If there is a case for removing a devfreq device with only struct
> device* as input, how about:
>
> int devfreq_remove_device(struct device *dev)
> {
>    struct devfreq *devfreq;
>    mutex_lock(&devfreq_list_lock);
>    devfreq = find_device_devfreq(dev);
>    mutex_unlock(&devfreq_list_lock);
>
>    return _devfreq_remove_device(struct devfreq *df);
>    /* _devfreq_remove_device does the real work and can also be
> called from devfreq_monitor */
> }

Sure, I'll do so and reduce the redundancy.

>
> Regards,
> Mike
>
[]

Thank you so much!

Cheers,
MyungJoo

-- 
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858
_______________________________________________
linux-pm mailing list
linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/linux-pm