Hi Mike, Actually chunqiu and I still have this problem, if you can push out the patch, that will be good. Thanks, Limei -----Original Message----- From: Mike Chan [mailto:mike@xxxxxxxxxxx] Sent: Thursday, September 03, 2009 12:45 PM To: Kevin Hilman Cc: Wang Limei-E12499; linux-omap@xxxxxxxxxxxxxxx; Wang Sawsd-A24013 Subject: Re: Linux-omap PM: Fix dead lock condition in resource_release(vdd1_opp) On Thu, Sep 3, 2009 at 7:01 AM, Kevin Hilman<khilman@xxxxxxxxxxxxxxxxxxx> wrote: > Mike Chan <mike@xxxxxxxxxxx> writes: > >> On Tue, Aug 18, 2009 at 8:04 AM, Kevin >> Hilman<khilman@xxxxxxxxxxxxxxxxxxx> wrote: >>> "Wang Limei-E12499" <E12499@xxxxxxxxxxxx> writes: >>> >>>> Seems like I did not submit the patch in the right way, before I >>>> setup my mail correctly, attach the patches in the mail. >>> >>> You're patches are still line-wrapped. >>> >>> I strongly recommend using git-format-patch and git-send-email to >>> submit patches. Chunqiu was able to do this. Please consult him. >>> >>> Also, no need to CC linux-omap-owner. linux-omap is all that is needed. >>> >> >> This patch has been reviewed and merged into our android-omap-2.6.29 >> tree >> http://android.git.kernel.org/?p=kernel/omap.git;a=commit;h=0b6a9b651 >> 4c7ccfa0c76e4defdaea3dcbc617633 > > Hmm, I don't see any other review/signoff that the authors on that > link. > Apologies, I was not aware of the reviewed-by policy but will keep that in mind for future patches we pull into our branch. In general any patches that have been applied to the android-omap-2.6.29 tree have gone under some review/testing. >> Kevin if you're having line wrap problems feel free to pull it from >> here, assuming everyone's feedback has been addressed > > It's not me who has the line-wrap problem. I could unwrap pretty > easily myself, but it gets very old working around various email > client problems, so I choose to reject patches until they can be sent > in a usable form. > > I'm still waiting for this so it can get a full review on-list. > Very understandable, if Chunqiu is still having problems I can push one out to l-o for review on behalf of Chinqiu. Its in our best interest to get this into mainline so we have less 1-off's to maintain. --Mike > Thanks, > > Kevin > >> >>> Thanks, >>> >>> Kevin >>> >>> >>>> PATCH1:0001-Add-per-resource-mutex-for-OMAP-resource-framework.patc >>>> h >>>> >>>> From b4e9cc01f9d1aaeec39cc1ee794e5efaec61c781 Mon Sep 17 00:00:00 >>>> 2001 >>>> From: Chunqiu Wang <cqwang@xxxxxxxxxxxx> >>>> Date: Fri, 14 Aug 2009 17:34:32 +0800 >>>> Subject: [PATCH] Add per-resource mutex for OMAP resource framework >>>> >>>> Current OMAP resource fwk uses a global res_mutex for >>>> resource_request and resource_release calls for all the available >>>> resources.It may cause dead lock if >>>> resource_request/resource_release is called recursively. >>>> >>>> For current OMAP3 VDD1/VDD2 resource, the change_level >>>> implementation is mach-omap2/resource34xx.c/set_opp(), >>>> when using resource_release to remove vdd1 constraint, this >>>> function may call resource_release again to release >>>> Vdd2 constrait if target vdd1 level is less than OPP3. >>>> in this scenario, the global res_mutex down operation will be >>>> called again, this will cause the second down operation hang there. >>>> >>>> To fix the problem, per-resource mutex is added to avoid hangup >>>> when resource_request/resource_release is called recursively. >>>> >>>> Signed-off-by: Chunqiu Wang <cqwang@xxxxxxxxxxxx> >>>> --- >>>> arch/arm/plat-omap/include/mach/resource.h | 2 ++ >>>> arch/arm/plat-omap/resource.c | 27 >>>> +++++++++++++++------------ >>>> 2 files changed, 17 insertions(+), 12 deletions(-) >>>> >>>> diff --git a/arch/arm/plat-omap/include/mach/resource.h >>>> b/arch/arm/plat-omap/include/mach/resource.h >>>> index f91d8ce..d482fb8 100644 >>>> --- a/arch/arm/plat-omap/include/mach/resource.h >>>> +++ b/arch/arm/plat-omap/include/mach/resource.h >>>> @@ -46,6 +46,8 @@ struct shared_resource { >>>> /* Shared resource operations */ >>>> struct shared_resource_ops *ops; >>>> struct list_head node; >>>> + /* Protect each resource */ >>>> + struct mutex resource_mutex; >>>> }; >>>> >>>> struct shared_resource_ops { >>>> diff --git a/arch/arm/plat-omap/resource.c >>>> b/arch/arm/plat-omap/resource.c index ec31727..5eae4e8 100644 >>>> --- a/arch/arm/plat-omap/resource.c >>>> +++ b/arch/arm/plat-omap/resource.c >>>> @@ -264,6 +264,7 @@ int resource_register(struct shared_resource >>>> *resp) >>>> return -EEXIST; >>>> >>>> INIT_LIST_HEAD(&resp->users_list); >>>> + mutex_init(&resp->resource_mutex); >>>> >>>> down(&res_mutex); >>>> /* Add the resource to the resource list */ @@ -326,14 >>>> +327,14 @@ int resource_request(const char *name, struct device >>>> *dev, >>>> struct users_list *user; >>>> int found = 0, ret = 0; >>>> >>>> - down(&res_mutex); >>>> - resp = _resource_lookup(name); >>>> + resp = resource_lookup(name); >>>> if (!resp) { >>>> printk(KERN_ERR "resource_request: Invalid resource >>>> name\n"); >>>> ret = -EINVAL; >>>> - goto res_unlock; >>>> + goto ret; >>>> } >>>> >>>> + mutex_lock(&resp->resource_mutex); >>>> /* Call the resource specific validate function */ >>>> if (resp->ops->validate_level) { >>>> ret = resp->ops->validate_level(resp, level); @@ >>>> -362,7 +363,7 @@ int resource_request(const char *name, struct >>>> device *dev, >>>> user->level = level; >>>> >>>> res_unlock: >>>> - up(&res_mutex); >>>> + mutex_unlock(&resp->resource_mutex); >>>> /* >>>> * Recompute and set the current level for the resource. >>>> * NOTE: update_resource level moved out of spin_lock, as it >>>> may call @@ -371,6 +372,7 @@ res_unlock: >>>> */ >>>> if (!ret) >>>> ret = update_resource_level(resp); >>>> +ret: >>>> return ret; >>>> } >>>> EXPORT_SYMBOL(resource_request); >>>> @@ -393,14 +395,14 @@ int resource_release(const char *name, struct >>>> device *dev) >>>> struct users_list *user; >>>> int found = 0, ret = 0; >>>> >>>> - down(&res_mutex); >>>> - resp = _resource_lookup(name); >>>> + resp = resource_lookup(name); >>>> if (!resp) { >>>> printk(KERN_ERR "resource_release: Invalid resource >>>> name\n"); >>>> ret = -EINVAL; >>>> - goto res_unlock; >>>> + goto ret; >>>> } >>>> >>>> + mutex_lock(&resp->resource_mutex); >>>> list_for_each_entry(user, &resp->users_list, node) { >>>> if (user->dev == dev) { >>>> found = 1; >>>> @@ -421,7 +423,9 @@ int resource_release(const char *name, struct >>>> device >>>> *dev) >>>> /* Recompute and set the current level for the resource */ >>>> ret = update_resource_level(resp); >>>> res_unlock: >>>> - up(&res_mutex); >>>> + mutex_unlock(&resp->resource_mutex); >>>> + >>>> +ret: >>>> return ret; >>>> } >>>> EXPORT_SYMBOL(resource_release); >>>> @@ -438,15 +442,14 @@ int resource_get_level(const char *name) >>>> struct shared_resource *resp; >>>> u32 ret; >>>> >>>> - down(&res_mutex); >>>> - resp = _resource_lookup(name); >>>> + resp = resource_lookup(name); >>>> if (!resp) { >>>> printk(KERN_ERR "resource_release: Invalid resource >>>> name\n"); >>>> - up(&res_mutex); >>>> return -EINVAL; >>>> } >>>> + mutex_lock(&resp->resource_mutex); >>>> ret = resp->curr_level; >>>> - up(&res_mutex); >>>> + mutex_unlock(&resp->resource_mutex); >>>> return ret; >>>> } >>>> EXPORT_SYMBOL(resource_get_level); >>>> -- >>>> 1.5.4.3 >>>> >>>> PATCH2:0002-Move-the-resource-level-update-into-mutex_lock-block.pa >>>> tch >>>> >>>> >>>> From 9cc371b5d7f2e049fe72bc946dcb8ec8e1de826c Mon Sep 17 00:00:00 >>>> 2001 >>>> From: Chunqiu Wang <cqwang@xxxxxxxxxxxx> >>>> Date: Fri, 14 Aug 2009 17:43:13 +0800 >>>> Subject: [PATCH] Move the resource level update into mutex_lock block. >>>> >>>> The update_resource_level is called outside of the mutex lock >>>> protection block due to an out of date spin lock mechanism, now >>>> mutex is used, so move the update_resource_level into mutex >>>> protection block. >>>> >>>> Signed-off-by: Chunqiu Wang <cqwang@xxxxxxxxxxxx> >>>> --- >>>> arch/arm/plat-omap/resource.c | 11 +++-------- >>>> 1 files changed, 3 insertions(+), 8 deletions(-) >>>> >>>> diff --git a/arch/arm/plat-omap/resource.c >>>> b/arch/arm/plat-omap/resource.c index 5eae4e8..e2a003a 100644 >>>> --- a/arch/arm/plat-omap/resource.c >>>> +++ b/arch/arm/plat-omap/resource.c >>>> @@ -362,16 +362,11 @@ int resource_request(const char *name, struct >>>> device *dev, >>>> } >>>> user->level = level; >>>> >>>> + /* Recompute and set the current level for the resource */ >>>> + ret = update_resource_level(resp); >>>> + >>>> res_unlock: >>>> mutex_unlock(&resp->resource_mutex); >>>> - /* >>>> - * Recompute and set the current level for the resource. >>>> - * NOTE: update_resource level moved out of spin_lock, as it >>>> may call >>>> - * pm_qos_add_requirement, which does a kzmalloc. This won't >>>> be allowed >>>> - * in iterrupt context. The spin_lock still protects >>>> add/remove users. >>>> - */ >>>> - if (!ret) >>>> - ret = update_resource_level(resp); >>>> ret: >>>> return ret; >>>> } >>>> -- >>>> 1.5.4.3 >>>> >>>> >>>> -----Original Message----- >>>> From: Wang Limei-E12499 >>>> Sent: Monday, August 17, 2009 11:13 AM >>>> To: 'khilman@xxxxxxxxxxxxxxxxxxx' >>>> Cc: Wang Limei-E12499; Wang Sawsd-A24013 >>>> Subject: RE: Linux-omap PM: Fix dead lock condition in >>>> resource_release(vdd1_opp) >>>> >>>> >>>> Kevin, >>>> >>>> Seems like I did not submit the patch in the recommended way,I will >>>> try to be better in the future. >>>> >>>> If you can review the patch and feedback, I will apperciate it. >>>> >>>> Thanks, >>>> Limei >>>> >>>> -----Original Message----- >>>> From: Wang Limei-E12499 >>>> Sent: Friday, August 14, 2009 5:44 PM >>>> To: Kevin Hilman >>>> Cc: Romit Dasgupta; linux-omap@xxxxxxxxxxxxxxx; Wang Sawsd-A24013; >>>> Wang >>>> Limei-E12499 >>>> Subject: RE: Linux-omap PM: Fix dead lock condition in >>>> resource_release(vdd1_opp) >>>> >>>> >>>> Kevin, >>>> >>>> Thanks for reviewing the patch. >>>> >>>> Chunqiu and I revised the patch. Pls see the attachment. >>>> >>>> >>>> Thanks, >>>> Limei,chunqiu >>>> >>>> -----Original Message----- >>>> From: Kevin Hilman [mailto:khilman@xxxxxxxxxxxxxxxxxxx] >>>> Sent: Thursday, August 13, 2009 8:02 AM >>>> To: Wang Limei-E12499 >>>> Cc: Romit Dasgupta; linux-omap@xxxxxxxxxxxxxxx; Wang Sawsd-A24013 >>>> Subject: Re: Linux-omap PM: Fix dead lock condition in >>>> resource_release(vdd1_opp) >>>> >>>> "Wang Limei-E12499" <E12499@xxxxxxxxxxxx> writes: >>>> >>>>> >>>>> Kevin and Romit, >>>>> >>>>> I agreed with you, thanks Kevin and Romit for the comments! >>>>> Chunqiu Wang coded resource-based mutex, below is the patch. Pls >>>>> review and let us know your feedback. >>>>> >>>>> >>>>> From 31f87ffb8eb1f854a9adb7fd96011d490f4655fa Mon Sep 17 00:00:00 >>>>> 2001 >>>>> From: Chunqiu Wang <cqwang@xxxxxxxxxxxx> >>>>> Date: Wed, 12 Aug 2009 16:22:09 +0800 >>>>> Subject: [PATCH] Fix resource framework mutex lock issue when >>>>> resource_get or resource_release are called nestedly. >>>>> >>>> >>>> Could use a shorter summary (subject) and a more detailed changelog. >>>> >>>> This patch is doing too many things in a single patch without >>>> enough explanation. >>>> >>>> Not only does it convert the global semaphore to a >>>> resource-specific semaphore, but it also changing the locking >>>> slightly by moving some things in/out of lock protection. That >>>> should be described in the changelog as well. >>>> >>>> Even better would be a first patch that simply converts the >>>> semaphore to a resource-specific *mutex* (not resource-specific >>>> semaphore.) IOW, use mutex API in <linux/mutex.h>: >>>> >>>> struct mutex; >>>> init_mutex() >>>> mutex_lock() >>>> mutex_unlock() >>>> mutex_is_lockec() >>>> ... >>>> >>>> Then, add a 2nd patch which does any reworking of the critical sections. >>>> >>>> Kevin >>>> >>>> >>>>> Signed-off-by: Chunqiu Wang <cqwang@xxxxxxxxxxxx> >>>>> --- >>>>> arch/arm/plat-omap/include/mach/resource.h | 2 + >>>>> arch/arm/plat-omap/resource.c | 38 >>>>> +++++++++++++-------------- >>>>> 2 files changed, 20 insertions(+), 20 deletions(-) >>>>> >>>>> diff --git a/arch/arm/plat-omap/include/mach/resource.h >>>>> b/arch/arm/plat-omap/include/mach/resource.h >>>>> index f91d8ce..389cb67 100644 >>>>> --- a/arch/arm/plat-omap/include/mach/resource.h >>>>> +++ b/arch/arm/plat-omap/include/mach/resource.h >>>>> @@ -22,6 +22,7 @@ >>>>> >>>>> #include <linux/list.h> >>>>> #include <linux/mutex.h> >>>>> +#include <linux/semaphore.h> >>>>> #include <linux/device.h> >>>>> #include <mach/cpu.h> >>>>> >>>>> @@ -46,6 +47,7 @@ struct shared_resource { >>>>> /* Shared resource operations */ >>>>> struct shared_resource_ops *ops; >>>>> struct list_head node; >>>>> + struct semaphore resource_mutex; >>>>> }; >>>>> >>>>> struct shared_resource_ops { >>>>> diff --git a/arch/arm/plat-omap/resource.c >>>>> b/arch/arm/plat-omap/resource.c index ec31727..758a138 100644 >>>>> --- a/arch/arm/plat-omap/resource.c >>>>> +++ b/arch/arm/plat-omap/resource.c >>>>> @@ -264,6 +264,7 @@ int resource_register(struct shared_resource >>>> *resp) >>>>> return -EEXIST; >>>>> >>>>> INIT_LIST_HEAD(&resp->users_list); >>>>> + sema_init(&resp->resource_mutex, 1); >>>>> >>>>> down(&res_mutex); >>>>> /* Add the resource to the resource list */ @@ -326,14 >>>>> +327,14 >>>> @@ >>>>> int resource_request(const char *name, struct device *dev, >>>>> struct users_list *user; >>>>> int found = 0, ret = 0; >>>>> >>>>> - down(&res_mutex); >>>>> - resp = _resource_lookup(name); >>>>> + resp = resource_lookup(name); >>>>> if (!resp) { >>>>> printk(KERN_ERR "resource_request: Invalid resource >>>> name\n"); >>>>> ret = -EINVAL; >>>>> - goto res_unlock; >>>>> + goto ret; >>>>> } >>>>> >>>>> + down(&resp->resource_mutex); >>>>> /* Call the resource specific validate function */ >>>>> if (resp->ops->validate_level) { >>>>> ret = resp->ops->validate_level(resp, level); @@ >>>>> -361,16 >>>> +362,12 @@ >>>>> int resource_request(const char *name, struct device *dev, >>>>> } >>>>> user->level = level; >>>>> >>>>> + /* Recompute and set the current level for the resource */ >>>>> + ret = update_resource_level(resp); >>>>> res_unlock: >>>>> - up(&res_mutex); >>>>> - /* >>>>> - * Recompute and set the current level for the resource. >>>>> - * NOTE: update_resource level moved out of spin_lock, as it >>>>> may call >>>>> - * pm_qos_add_requirement, which does a kzmalloc. This won't >>>>> be allowed >>>>> - * in iterrupt context. The spin_lock still protects >>>>> add/remove users. >>>>> - */ >>>>> - if (!ret) >>>>> - ret = update_resource_level(resp); >>>>> + up(&resp->resource_mutex); >>>>> + >>>>> +ret: >>>>> return ret; >>>>> } >>>>> EXPORT_SYMBOL(resource_request); >>>>> @@ -393,14 +390,14 @@ int resource_release(const char *name, >>>>> struct device *dev) >>>>> struct users_list *user; >>>>> int found = 0, ret = 0; >>>>> >>>>> - down(&res_mutex); >>>>> - resp = _resource_lookup(name); >>>>> + resp = resource_lookup(name); >>>>> if (!resp) { >>>>> printk(KERN_ERR "resource_release: Invalid resource >>>> name\n"); >>>>> ret = -EINVAL; >>>>> - goto res_unlock; >>>>> + goto ret; >>>>> } >>>>> >>>>> + down(&resp->resource_mutex); >>>>> list_for_each_entry(user, &resp->users_list, node) { >>>>> if (user->dev == dev) { >>>>> found = 1; >>>>> @@ -421,7 +418,9 @@ int resource_release(const char *name, struct >>>>> device >>>>> *dev) >>>>> /* Recompute and set the current level for the resource */ >>>>> ret = update_resource_level(resp); >>>>> res_unlock: >>>>> - up(&res_mutex); >>>>> + up(&resp->resource_mutex); >>>>> + >>>>> +ret: >>>>> return ret; >>>>> } >>>>> EXPORT_SYMBOL(resource_release); >>>>> @@ -438,15 +437,14 @@ int resource_get_level(const char *name) >>>>> struct shared_resource *resp; >>>>> u32 ret; >>>>> >>>>> - down(&res_mutex); >>>>> - resp = _resource_lookup(name); >>>>> + resp = resource_lookup(name); >>>>> if (!resp) { >>>>> printk(KERN_ERR "resource_release: Invalid resource >>>> name\n"); >>>>> - up(&res_mutex); >>>>> return -EINVAL; >>>>> } >>>>> + down(&resp->resource_mutex); >>>>> ret = resp->curr_level; >>>>> - up(&res_mutex); >>>>> + up(&resp->resource_mutex); >>>>> return ret; >>>>> } >>>>> EXPORT_SYMBOL(resource_get_level); >>>>> -- >>>>> 1.5.4.3 >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: Kevin Hilman [mailto:khilman@xxxxxxxxxxxxxxxxxxx] >>>>> Sent: Monday, August 10, 2009 11:23 AM >>>>> To: Wang Limei-E12499 >>>>> Cc: linux-omap@xxxxxxxxxxxxxxx >>>>> Subject: Re: Linux-omap PM: Fix dead lock condition in >>>>> resource_release(vdd1_opp) >>>>> >>>>> "Wang Limei-E12499" <E12499@xxxxxxxxxxxx> writes: >>>>> >>>>>> I am using linux-omap pm-2.6.29 >>>>>> <http://git.kernel.org/?p=linux/kernel/git/khilman/linux-omap-pm. >>>>>> git; a =s hortlog;h=pm-2.6.29> branch,found a dead lock >>>>>> condition in: >>>>>> arch/arm/plat-omap/resource.c->resource_release(). >>>>>> >>>>>> The dead lock happens when using >>>>>> resource_request("vdd1_opp",&dev,...) >>>>>> and resource_release("vdd1_opp", &dev) to set and release vdd1 >>>>>> opp constraint. The scenario is: >>>>>> >>>>>> plat-omap/resource.c/resource_release("vdd1_opp", >>>>>> &dev)==>resource.c/update_resource_level()=>mach-omap2/resource34 >>>>>> xx.c / se t_opp(). In set_opp(), if the target_level of vdd1 is >>>>>> less than OPP3,will release the constraint set on VDD2 by calling >>>>>> resource_release(), but it will never return because can not get >>>>>> the mutex which is holding by the previous caller. >>>>>> >>>>>> int resource_release(const char *name, struct device *dev) { >>>>>> ....... >>>>>> down(&res_mutex); >>>>>> ........ >>>>>> /* Recompute and set the current level for the resource */ >>>>>> ret = update_resource_level(resp); >>>>>> res_unlock: >>>>>> up(&res_mutex); >>>>>> return ret; >>>>>> } >>>>>> >>>>>> int set_opp(struct shared_resource *resp, u32 target_level) { >>>>>> ..... >>>>>> if (resp == vdd1_resp) { >>>>>> if (target_level < 3) >>>>>> resource_release("vdd2_opp", &vdd2_dev); } >>>>>> >>>>>> The patch to fix this issue is below, will you pls review it and >>>>>> let me know your feedback? >>>>>> >>>>>> From: Limei Wang <e12499@xxxxxxxxxxxx> >>>>>> Date: Fri, 7 Aug 2009 11:40:35 -0500 >>>>>> Subject: [PATCH] OMAP PM: Fix dead lock bug in >>>>>> resourc_release(vdd1_opp). >>>>>> >>>>>> >>>>>> Signed-off-by: Limei Wang <e12499@xxxxxxxxxxxx> >>>>>> --- >>>>>> arch/arm/plat-omap/resource.c | 6 ++++-- >>>>>> 1 files changed, 4 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/arch/arm/plat-omap/resource.c >>>>>> b/arch/arm/plat-omap/resource.c index ec31727..876fd12 100644 >>>>>> --- a/arch/arm/plat-omap/resource.c >>>>>> +++ b/arch/arm/plat-omap/resource.c >>>>>> @@ -418,10 +418,12 @@ int resource_release(const char *name, >>>>>> struct device *dev) >>>>>> list_del(&user->node); >>>>>> free_user(user); >>>>>> >>>>>> - /* Recompute and set the current level for the resource */ >>>>>> - ret = update_resource_level(resp); >>>>>> res_unlock: >>>>>> up(&res_mutex); >>>>>> + >>>>>> + /* Recompute and set the current level for the resource */ >>>>>> + if (!ret) >>>>>> + ret = update_resource_level(resp); >>>>>> return ret; >>>>>> } >>>>>> EXPORT_SYMBOL(resource_release); >>>>>> -- >>>>>> 1.5.6.3 >>>>> >>>>> This is wrong for several reasons. >>>>> >>>>> First, you're not fixing the problem, you're just moving the call >>>>> outside of the lock, thus creating other locking problems. >>>>> >>>>> Second, the various error paths would break because >>>>> update_resource_level() would be called on the 'res_unlock' error >>>>> path >>>> >>>>> where it is not currently being called. >>>>> >>>>> A per-resource mutex as suggest by Romit seems like the right >>>>> approach >>>> >>>>> to fixing this problem. >>>>> >>>>> Kevin >>> -- >>> To unsubscribe from this list: send the line "unsubscribe >>> linux-omap" in the body of a message to majordomo@xxxxxxxxxxxxxxx >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> > -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html