Re: Linux-omap PM: Fix dead lock condition in resource_release(vdd1_opp)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Mike Chan <mike@xxxxxxxxxxx> writes:

> On Tue, Aug 18, 2009 at 8:04 AM, Kevin
> Hilman<khilman@xxxxxxxxxxxxxxxxxxx> wrote:
>> "Wang Limei-E12499" <E12499@xxxxxxxxxxxx> writes:
>>
>>> Seems like I did not submit the patch in the right way, before I setup
>>> my mail correctly, attach the patches in the mail.
>>
>> You're patches are still line-wrapped.
>>
>> I strongly recommend using git-format-patch and git-send-email to
>> submit patches. Chunqiu was able to do this.  Please consult him.
>>
>> Also, no need to CC linux-omap-owner.  linux-omap is all that is needed.
>>
>
> This patch has been reviewed and merged into our android-omap-2.6.29 tree
> http://android.git.kernel.org/?p=kernel/omap.git;a=commit;h=0b6a9b6514c7ccfa0c76e4defdaea3dcbc617633

Hmm, I don't see any other review/signoff that the authors on that
link.

> Kevin if you're having line wrap problems feel free to pull it from
> here, assuming everyone's feedback has been addressed

It's not me who has the line-wrap problem. I could unwrap pretty
easily myself, but it gets very old working around various email
client problems, so I choose to reject patches until they can be sent
in a usable form. 

I'm still waiting for this so it can get a full review on-list.

Thanks,

Kevin

>
>> Thanks,
>>
>> Kevin
>>
>>
>>> PATCH1:0001-Add-per-resource-mutex-for-OMAP-resource-framework.patch
>>>
>>> From b4e9cc01f9d1aaeec39cc1ee794e5efaec61c781 Mon Sep 17 00:00:00 2001
>>> From: Chunqiu Wang <cqwang@xxxxxxxxxxxx>
>>> Date: Fri, 14 Aug 2009 17:34:32 +0800
>>> Subject: [PATCH] Add per-resource mutex for OMAP resource framework
>>>
>>> Current OMAP resource fwk uses a global res_mutex
>>> for resource_request and resource_release calls
>>> for all the available resources.It may cause dead
>>> lock if resource_request/resource_release is called
>>> recursively.
>>>
>>> For current OMAP3 VDD1/VDD2 resource, the change_level
>>> implementation is mach-omap2/resource34xx.c/set_opp(),
>>> when using resource_release to remove vdd1 constraint,
>>> this function may call resource_release again to release
>>> Vdd2 constrait if target vdd1 level is less than OPP3.
>>> in this scenario, the global res_mutex down operation
>>> will be called again, this will cause the second
>>> down operation hang there.
>>>
>>> To fix the problem, per-resource mutex is added
>>> to avoid hangup when resource_request/resource_release
>>> is called recursively.
>>>
>>> Signed-off-by: Chunqiu Wang <cqwang@xxxxxxxxxxxx>
>>> ---
>>>  arch/arm/plat-omap/include/mach/resource.h |    2 ++
>>>  arch/arm/plat-omap/resource.c              |   27
>>> +++++++++++++++------------
>>>  2 files changed, 17 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/arch/arm/plat-omap/include/mach/resource.h
>>> b/arch/arm/plat-omap/include/mach/resource.h
>>> index f91d8ce..d482fb8 100644
>>> --- a/arch/arm/plat-omap/include/mach/resource.h
>>> +++ b/arch/arm/plat-omap/include/mach/resource.h
>>> @@ -46,6 +46,8 @@ struct shared_resource {
>>>       /* Shared resource operations */
>>>       struct shared_resource_ops *ops;
>>>       struct list_head node;
>>> +     /* Protect each resource */
>>> +     struct mutex resource_mutex;
>>>  };
>>>
>>>  struct shared_resource_ops {
>>> diff --git a/arch/arm/plat-omap/resource.c
>>> b/arch/arm/plat-omap/resource.c
>>> index ec31727..5eae4e8 100644
>>> --- a/arch/arm/plat-omap/resource.c
>>> +++ b/arch/arm/plat-omap/resource.c
>>> @@ -264,6 +264,7 @@ int resource_register(struct shared_resource *resp)
>>>               return -EEXIST;
>>>
>>>       INIT_LIST_HEAD(&resp->users_list);
>>> +     mutex_init(&resp->resource_mutex);
>>>
>>>       down(&res_mutex);
>>>       /* Add the resource to the resource list */
>>> @@ -326,14 +327,14 @@ int resource_request(const char *name, struct
>>> device *dev,
>>>       struct  users_list *user;
>>>       int     found = 0, ret = 0;
>>>
>>> -     down(&res_mutex);
>>> -     resp = _resource_lookup(name);
>>> +     resp = resource_lookup(name);
>>>       if (!resp) {
>>>               printk(KERN_ERR "resource_request: Invalid resource
>>> name\n");
>>>               ret = -EINVAL;
>>> -             goto res_unlock;
>>> +             goto ret;
>>>       }
>>>
>>> +     mutex_lock(&resp->resource_mutex);
>>>       /* Call the resource specific validate function */
>>>       if (resp->ops->validate_level) {
>>>               ret = resp->ops->validate_level(resp, level);
>>> @@ -362,7 +363,7 @@ int resource_request(const char *name, struct device
>>> *dev,
>>>       user->level = level;
>>>
>>>  res_unlock:
>>> -     up(&res_mutex);
>>> +     mutex_unlock(&resp->resource_mutex);
>>>       /*
>>>        * Recompute and set the current level for the resource.
>>>        * NOTE: update_resource level moved out of spin_lock, as it may
>>> call
>>> @@ -371,6 +372,7 @@ res_unlock:
>>>        */
>>>       if (!ret)
>>>               ret = update_resource_level(resp);
>>> +ret:
>>>       return ret;
>>>  }
>>>  EXPORT_SYMBOL(resource_request);
>>> @@ -393,14 +395,14 @@ int resource_release(const char *name, struct
>>> device *dev)
>>>       struct users_list *user;
>>>       int found = 0, ret = 0;
>>>
>>> -     down(&res_mutex);
>>> -     resp = _resource_lookup(name);
>>> +     resp = resource_lookup(name);
>>>       if (!resp) {
>>>               printk(KERN_ERR "resource_release: Invalid resource
>>> name\n");
>>>               ret = -EINVAL;
>>> -             goto res_unlock;
>>> +             goto ret;
>>>       }
>>>
>>> +     mutex_lock(&resp->resource_mutex);
>>>       list_for_each_entry(user, &resp->users_list, node) {
>>>               if (user->dev == dev) {
>>>                       found = 1;
>>> @@ -421,7 +423,9 @@ int resource_release(const char *name, struct device
>>> *dev)
>>>       /* Recompute and set the current level for the resource */
>>>       ret = update_resource_level(resp);
>>>  res_unlock:
>>> -     up(&res_mutex);
>>> +     mutex_unlock(&resp->resource_mutex);
>>> +
>>> +ret:
>>>       return ret;
>>>  }
>>>  EXPORT_SYMBOL(resource_release);
>>> @@ -438,15 +442,14 @@ int resource_get_level(const char *name)
>>>       struct shared_resource *resp;
>>>       u32 ret;
>>>
>>> -     down(&res_mutex);
>>> -     resp = _resource_lookup(name);
>>> +     resp = resource_lookup(name);
>>>       if (!resp) {
>>>               printk(KERN_ERR "resource_release: Invalid resource
>>> name\n");
>>> -             up(&res_mutex);
>>>               return -EINVAL;
>>>       }
>>> +     mutex_lock(&resp->resource_mutex);
>>>       ret = resp->curr_level;
>>> -     up(&res_mutex);
>>> +     mutex_unlock(&resp->resource_mutex);
>>>       return ret;
>>>  }
>>>  EXPORT_SYMBOL(resource_get_level);
>>> --
>>> 1.5.4.3
>>>
>>> PATCH2:0002-Move-the-resource-level-update-into-mutex_lock-block.patch
>>>
>>>
>>> From 9cc371b5d7f2e049fe72bc946dcb8ec8e1de826c Mon Sep 17 00:00:00 2001
>>> From: Chunqiu Wang <cqwang@xxxxxxxxxxxx>
>>> Date: Fri, 14 Aug 2009 17:43:13 +0800
>>> Subject: [PATCH] Move the resource level update into mutex_lock block.
>>>
>>> The update_resource_level is called outside of
>>> the mutex lock protection block due to an out of date
>>> spin lock mechanism, now mutex is used, so move
>>> the update_resource_level into mutex protection block.
>>>
>>> Signed-off-by: Chunqiu Wang <cqwang@xxxxxxxxxxxx>
>>> ---
>>>  arch/arm/plat-omap/resource.c |   11 +++--------
>>>  1 files changed, 3 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/arm/plat-omap/resource.c
>>> b/arch/arm/plat-omap/resource.c
>>> index 5eae4e8..e2a003a 100644
>>> --- a/arch/arm/plat-omap/resource.c
>>> +++ b/arch/arm/plat-omap/resource.c
>>> @@ -362,16 +362,11 @@ int resource_request(const char *name, struct
>>> device *dev,
>>>       }
>>>       user->level = level;
>>>
>>> +     /* Recompute and set the current level for the resource */
>>> +     ret = update_resource_level(resp);
>>> +
>>>  res_unlock:
>>>       mutex_unlock(&resp->resource_mutex);
>>> -     /*
>>> -      * Recompute and set the current level for the resource.
>>> -      * NOTE: update_resource level moved out of spin_lock, as it may
>>> call
>>> -      * pm_qos_add_requirement, which does a kzmalloc. This won't be
>>> allowed
>>> -      * in iterrupt context. The spin_lock still protects add/remove
>>> users.
>>> -      */
>>> -     if (!ret)
>>> -             ret = update_resource_level(resp);
>>>  ret:
>>>       return ret;
>>>  }
>>> --
>>> 1.5.4.3
>>>
>>>
>>> -----Original Message-----
>>> From: Wang Limei-E12499
>>> Sent: Monday, August 17, 2009 11:13 AM
>>> To: 'khilman@xxxxxxxxxxxxxxxxxxx'
>>> Cc: Wang Limei-E12499; Wang Sawsd-A24013
>>> Subject: RE: Linux-omap PM: Fix dead lock condition in
>>> resource_release(vdd1_opp)
>>>
>>>
>>> Kevin,
>>>
>>> Seems like I did not submit the patch in the recommended way,I will try
>>> to be better in the future.
>>>
>>> If you can review  the patch and feedback, I will apperciate it.
>>>
>>> Thanks,
>>> Limei
>>>
>>> -----Original Message-----
>>> From: Wang Limei-E12499
>>> Sent: Friday, August 14, 2009 5:44 PM
>>> To: Kevin Hilman
>>> Cc: Romit Dasgupta; linux-omap@xxxxxxxxxxxxxxx; Wang Sawsd-A24013; Wang
>>> Limei-E12499
>>> Subject: RE: Linux-omap PM: Fix dead lock condition in
>>> resource_release(vdd1_opp)
>>>
>>>
>>> Kevin,
>>>
>>> Thanks for reviewing the patch.
>>>
>>> Chunqiu and I revised the patch. Pls see the attachment.
>>>
>>>
>>> Thanks,
>>> Limei,chunqiu
>>>
>>> -----Original Message-----
>>> From: Kevin Hilman [mailto:khilman@xxxxxxxxxxxxxxxxxxx]
>>> Sent: Thursday, August 13, 2009 8:02 AM
>>> To: Wang Limei-E12499
>>> Cc: Romit Dasgupta; linux-omap@xxxxxxxxxxxxxxx; Wang Sawsd-A24013
>>> Subject: Re: Linux-omap PM: Fix dead lock condition in
>>> resource_release(vdd1_opp)
>>>
>>> "Wang Limei-E12499" <E12499@xxxxxxxxxxxx> writes:
>>>
>>>>
>>>> Kevin and Romit,
>>>>
>>>> I agreed with you, thanks Kevin and Romit for the comments!   Chunqiu
>>>> Wang coded resource-based mutex, below is the patch. Pls review and
>>>> let us know your feedback.
>>>>
>>>>
>>>> From 31f87ffb8eb1f854a9adb7fd96011d490f4655fa Mon Sep 17 00:00:00 2001
>>>> From: Chunqiu Wang <cqwang@xxxxxxxxxxxx>
>>>> Date: Wed, 12 Aug 2009 16:22:09 +0800
>>>> Subject: [PATCH] Fix resource framework mutex lock issue when
>>>> resource_get or resource_release are called nestedly.
>>>>
>>>
>>> Could use a shorter summary (subject) and a more detailed changelog.
>>>
>>> This patch is doing too many things in a single patch without enough
>>> explanation.
>>>
>>> Not only does it convert the global semaphore to a resource-specific
>>> semaphore, but it also changing the locking slightly by moving some
>>> things in/out of lock protection.  That should be described in the
>>> changelog as well.
>>>
>>> Even better would be a first patch that simply converts the semaphore to
>>> a resource-specific *mutex* (not resource-specific semaphore.)  IOW, use
>>> mutex API in <linux/mutex.h>:
>>>
>>>   struct mutex;
>>>   init_mutex()
>>>   mutex_lock()
>>>   mutex_unlock()
>>>   mutex_is_lockec()
>>>   ...
>>>
>>> Then, add a 2nd patch which does any reworking of the critical sections.
>>>
>>> Kevin
>>>
>>>
>>>> Signed-off-by: Chunqiu Wang <cqwang@xxxxxxxxxxxx>
>>>> ---
>>>>  arch/arm/plat-omap/include/mach/resource.h |    2 +
>>>>  arch/arm/plat-omap/resource.c              |   38
>>>> +++++++++++++--------------
>>>>  2 files changed, 20 insertions(+), 20 deletions(-)
>>>>
>>>> diff --git a/arch/arm/plat-omap/include/mach/resource.h
>>>> b/arch/arm/plat-omap/include/mach/resource.h
>>>> index f91d8ce..389cb67 100644
>>>> --- a/arch/arm/plat-omap/include/mach/resource.h
>>>> +++ b/arch/arm/plat-omap/include/mach/resource.h
>>>> @@ -22,6 +22,7 @@
>>>>
>>>>  #include <linux/list.h>
>>>>  #include <linux/mutex.h>
>>>> +#include <linux/semaphore.h>
>>>>  #include <linux/device.h>
>>>>  #include <mach/cpu.h>
>>>>
>>>> @@ -46,6 +47,7 @@ struct shared_resource {
>>>>      /* Shared resource operations */
>>>>      struct shared_resource_ops *ops;
>>>>      struct list_head node;
>>>> +    struct semaphore resource_mutex;
>>>>  };
>>>>
>>>>  struct shared_resource_ops {
>>>> diff --git a/arch/arm/plat-omap/resource.c
>>>> b/arch/arm/plat-omap/resource.c index ec31727..758a138 100644
>>>> --- a/arch/arm/plat-omap/resource.c
>>>> +++ b/arch/arm/plat-omap/resource.c
>>>> @@ -264,6 +264,7 @@ int resource_register(struct shared_resource
>>> *resp)
>>>>              return -EEXIST;
>>>>
>>>>      INIT_LIST_HEAD(&resp->users_list);
>>>> +    sema_init(&resp->resource_mutex, 1);
>>>>
>>>>      down(&res_mutex);
>>>>      /* Add the resource to the resource list */ @@ -326,14 +327,14
>>> @@
>>>> int resource_request(const char *name, struct device *dev,
>>>>      struct  users_list *user;
>>>>      int     found = 0, ret = 0;
>>>>
>>>> -    down(&res_mutex);
>>>> -    resp = _resource_lookup(name);
>>>> +    resp = resource_lookup(name);
>>>>      if (!resp) {
>>>>              printk(KERN_ERR "resource_request: Invalid resource
>>> name\n");
>>>>              ret = -EINVAL;
>>>> -            goto res_unlock;
>>>> +            goto ret;
>>>>      }
>>>>
>>>> +    down(&resp->resource_mutex);
>>>>      /* Call the resource specific validate function */
>>>>      if (resp->ops->validate_level) {
>>>>              ret = resp->ops->validate_level(resp, level); @@ -361,16
>>> +362,12 @@
>>>> int resource_request(const char *name, struct device *dev,
>>>>      }
>>>>      user->level = level;
>>>>
>>>> +    /* Recompute and set the current level for the resource */
>>>> +    ret = update_resource_level(resp);
>>>>  res_unlock:
>>>> -    up(&res_mutex);
>>>> -    /*
>>>> -     * Recompute and set the current level for the resource.
>>>> -     * NOTE: update_resource level moved out of spin_lock, as it may
>>>> call
>>>> -     * pm_qos_add_requirement, which does a kzmalloc. This won't be
>>>> allowed
>>>> -     * in iterrupt context. The spin_lock still protects add/remove
>>>> users.
>>>> -     */
>>>> -    if (!ret)
>>>> -            ret = update_resource_level(resp);
>>>> +    up(&resp->resource_mutex);
>>>> +
>>>> +ret:
>>>>      return ret;
>>>>  }
>>>>  EXPORT_SYMBOL(resource_request);
>>>> @@ -393,14 +390,14 @@ int resource_release(const char *name, struct
>>>> device *dev)
>>>>      struct users_list *user;
>>>>      int found = 0, ret = 0;
>>>>
>>>> -    down(&res_mutex);
>>>> -    resp = _resource_lookup(name);
>>>> +    resp = resource_lookup(name);
>>>>      if (!resp) {
>>>>              printk(KERN_ERR "resource_release: Invalid resource
>>> name\n");
>>>>              ret = -EINVAL;
>>>> -            goto res_unlock;
>>>> +            goto ret;
>>>>      }
>>>>
>>>> +    down(&resp->resource_mutex);
>>>>      list_for_each_entry(user, &resp->users_list, node) {
>>>>              if (user->dev == dev) {
>>>>                      found = 1;
>>>> @@ -421,7 +418,9 @@ int resource_release(const char *name, struct
>>>> device
>>>> *dev)
>>>>      /* Recompute and set the current level for the resource */
>>>>      ret = update_resource_level(resp);
>>>>  res_unlock:
>>>> -    up(&res_mutex);
>>>> +    up(&resp->resource_mutex);
>>>> +
>>>> +ret:
>>>>      return ret;
>>>>  }
>>>>  EXPORT_SYMBOL(resource_release);
>>>> @@ -438,15 +437,14 @@ int resource_get_level(const char *name)
>>>>      struct shared_resource *resp;
>>>>      u32 ret;
>>>>
>>>> -    down(&res_mutex);
>>>> -    resp = _resource_lookup(name);
>>>> +    resp = resource_lookup(name);
>>>>      if (!resp) {
>>>>              printk(KERN_ERR "resource_release: Invalid resource
>>> name\n");
>>>> -            up(&res_mutex);
>>>>              return -EINVAL;
>>>>      }
>>>> +    down(&resp->resource_mutex);
>>>>      ret = resp->curr_level;
>>>> -    up(&res_mutex);
>>>> +    up(&resp->resource_mutex);
>>>>      return ret;
>>>>  }
>>>>  EXPORT_SYMBOL(resource_get_level);
>>>> --
>>>> 1.5.4.3
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Kevin Hilman [mailto:khilman@xxxxxxxxxxxxxxxxxxx]
>>>> Sent: Monday, August 10, 2009 11:23 AM
>>>> To: Wang Limei-E12499
>>>> Cc: linux-omap@xxxxxxxxxxxxxxx
>>>> Subject: Re: Linux-omap PM: Fix dead lock condition in
>>>> resource_release(vdd1_opp)
>>>>
>>>> "Wang Limei-E12499" <E12499@xxxxxxxxxxxx> writes:
>>>>
>>>>> I am using linux-omap pm-2.6.29
>>>>> <http://git.kernel.org/?p=linux/kernel/git/khilman/linux-omap-pm.git;
>>>>> a =s hortlog;h=pm-2.6.29>  branch,found a dead lock condition in:
>>>>> arch/arm/plat-omap/resource.c->resource_release().
>>>>>
>>>>> The dead lock happens when using
>>>>> resource_request("vdd1_opp",&dev,...)
>>>>> and resource_release("vdd1_opp", &dev) to set and release vdd1 opp
>>>>> constraint. The  scenario is:
>>>>>
>>>>> plat-omap/resource.c/resource_release("vdd1_opp",
>>>>> &dev)==>resource.c/update_resource_level()=>mach-omap2/resource34xx.c
>>>>> / se t_opp().  In set_opp(),  if the target_level of vdd1 is less
>>>>> than OPP3,will release the constraint set on VDD2 by calling
>>>>> resource_release(), but it will never return because can not get the
>>>>> mutex which is holding  by the previous caller.
>>>>>
>>>>> int resource_release(const char *name, struct device *dev)
>>>>> {           .......
>>>>>     down(&res_mutex);
>>>>>     ........
>>>>>     /* Recompute and set the current level for the resource */
>>>>>     ret = update_resource_level(resp);
>>>>> res_unlock:
>>>>>     up(&res_mutex);
>>>>>     return ret;
>>>>> }
>>>>>
>>>>> int set_opp(struct shared_resource *resp, u32 target_level) {
>>>>>     .....
>>>>>  if (resp == vdd1_resp) {
>>>>>       if (target_level < 3)
>>>>>            resource_release("vdd2_opp", &vdd2_dev); }
>>>>>
>>>>> The patch to fix this issue is below, will you pls review it and let
>>>>> me know your feedback?
>>>>>
>>>>> From: Limei Wang <e12499@xxxxxxxxxxxx>
>>>>> Date: Fri, 7 Aug 2009 11:40:35 -0500
>>>>> Subject: [PATCH] OMAP PM: Fix dead lock bug in
>>>>> resourc_release(vdd1_opp).
>>>>>
>>>>>
>>>>> Signed-off-by: Limei Wang <e12499@xxxxxxxxxxxx>
>>>>> ---
>>>>>  arch/arm/plat-omap/resource.c |    6 ++++--
>>>>>  1 files changed, 4 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm/plat-omap/resource.c
>>>>> b/arch/arm/plat-omap/resource.c index ec31727..876fd12 100644
>>>>> --- a/arch/arm/plat-omap/resource.c
>>>>> +++ b/arch/arm/plat-omap/resource.c
>>>>> @@ -418,10 +418,12 @@ int resource_release(const char *name, struct
>>>>> device *dev)
>>>>>     list_del(&user->node);
>>>>>     free_user(user);
>>>>>
>>>>> -   /* Recompute and set the current level for the resource */
>>>>> -   ret = update_resource_level(resp);
>>>>>  res_unlock:
>>>>>     up(&res_mutex);
>>>>> +
>>>>> +   /* Recompute and set the current level for the resource */
>>>>> +   if (!ret)
>>>>> +       ret = update_resource_level(resp);
>>>>>     return ret;
>>>>>  }
>>>>>  EXPORT_SYMBOL(resource_release);
>>>>> --
>>>>> 1.5.6.3
>>>>
>>>> This is wrong for several reasons.
>>>>
>>>> First, you're not fixing the problem, you're just moving the call
>>>> outside of the lock, thus creating other locking problems.
>>>>
>>>> Second, the various error paths would break because
>>>> update_resource_level() would be called on the 'res_unlock' error path
>>>
>>>> where it is not currently being called.
>>>>
>>>> A per-resource mutex as suggest by Romit seems like the right approach
>>>
>>>> to fixing this problem.
>>>>
>>>> Kevin
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Arm (vger)]     [ARM Kernel]     [ARM MSM]     [Linux Tegra]     [Linux WPAN Networking]     [Linux Wireless Networking]     [Maemo Users]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux