RE: [PATCH] ACPICA: Add mutex to avoid race condition of reference count of notify object

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

> From: linux-acpi-owner@xxxxxxxxxxxxxxx [mailto:linux-acpi-owner@xxxxxxxxxxxxxxx] On Behalf Of joeyli
> Sent: Monday, June 16, 2014 5:14 PM
> 
> Hi Lv Zheng,
> 
> Thanks for your response.
> 
> On Mon, Jun 16, 2014 at 05:10:29AM +0000, Zheng, Lv wrote:
> > Hi,
> >
> > > From: Lee, Chun-Yi [mailto:joeyli.kernel@xxxxxxxxx]
> > > Sent: Monday, June 16, 2014 10:12 AM
> > >
> > > This issue found on v3.0 kernel, unfortunately there was no chance
> > > to test latest kernel on issue mchine. This patch tested on v3.0 kernel
> > > then sent to linux-acpi for review and note, maybe latest kernel also need.
> > >
> > > The problem happened when acpi thermal driver evaluate _PSL, but acpi
> > > processor driver install notify handler at the same time. In the
> > > code path of evaluate _PSL, it updates reference count of processor
> > > and its notify objects. When the notify handler installation done
> > > after the reference count of processor updated, it causes the
> > > ref_count of processor doesn't sync with its notify object's
> > > ref_count value.
> > >
> > > Here is an debugging log when issue reproduced:
> > >
> > > [    3.481773] ACPI_TYPE_PROCESSOR set ACPI_DEVICE_NOTIFY, object_desc->common.reference_count: 3, notify_obj-
> > > >common.reference_count: 1
> > > [    3.481958] PROCESSOR device_hid: LNXCPU
> > > ...
> > > [    3.487427] ACPI_TYPE_PROCESSOR, action = 1
> > > [    3.487428] Update device_notify ref_count
> > > [    3.487429] REF_DECREMENT ACPI_TYPE_LOCAL_NOTIFY original_count: 0
> > > [    3.487431] ACPI Warning: Obj ffff8800b0f40b28, Reference Count is already zero, cannot decrement
> > > [    3.487433]  (20110413/utdelete-431)
> > > [    3.487434] REF_DECREMENT ACPI_TYPE_PROCESSOR original_count: 2
> > >
> > > Accroding log, found the reference_count of parent object
> > > (it's processor in this case) is 3, it doesn't match with notify_object's
> > > reference_count, value is 1. It triggered "Reference Count is already zero"
> > > warning, then happen object double free issue later.
> > >
> > > To avoid rece condition, this patch introded ACPI_MTX_NOTIFY_REF_COUNT
> > > mutex to keep the ref_count of notify object sync with its parent
> > > object. And, it also set the reference_count value of new notify object
> > > equals to its parent object's reference_count.
> > >
> > > Signed-off-by: Lee, Chun-Yi <jlee@xxxxxxxx>
> > > ---
> > >  drivers/acpi/acpica/aclocal.h  | 3 ++-
> > >  drivers/acpi/acpica/evxface.c  | 7 ++++++-
> > >  drivers/acpi/acpica/utdelete.c | 6 ++++++
> > >  3 files changed, 14 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/acpi/acpica/aclocal.h b/drivers/acpi/acpica/aclocal.h
> > > index c7f743c..e25a4af 100644
> > > --- a/drivers/acpi/acpica/aclocal.h
> > > +++ b/drivers/acpi/acpica/aclocal.h
> > > @@ -85,8 +85,9 @@ union acpi_parse_object;
> > >  #define ACPI_MTX_MEMORY                 5	/* Debug memory tracking lists */
> > >  #define ACPI_MTX_DEBUG_CMD_COMPLETE     6	/* AML debugger */
> > >  #define ACPI_MTX_DEBUG_CMD_READY        7	/* AML debugger */
> > > +#define ACPI_MTX_NOTIFY_REF_COUNT	8	/* Reference count of notify object */
> > >
> > > -#define ACPI_MAX_MUTEX                  7
> > > +#define ACPI_MAX_MUTEX                  8
> > >  #define ACPI_NUM_MUTEX                  ACPI_MAX_MUTEX+1
> > >
> > >  /* Lock structure for reader/writer interfaces */
> > > diff --git a/drivers/acpi/acpica/evxface.c b/drivers/acpi/acpica/evxface.c
> > > index e114140..213fe1a 100644
> > > --- a/drivers/acpi/acpica/evxface.c
> > > +++ b/drivers/acpi/acpica/evxface.c
> > > @@ -495,6 +495,10 @@ acpi_install_notify_handler(acpi_handle device,
> > >  						handler, context,
> > >  						NULL);
> > >
> > > +		acpi_ut_acquire_mutex(ACPI_MTX_NOTIFY_REF_COUNT);
> > > +
> > > +		notify_obj->common.reference_count = obj_desc->common.reference_count;
> > > +
> >
> > Should be converted to spin_lock here.
> >
> 
> Did you mean using acpi_gbl_reference_count_lock ?

Yes.

> 
> > >  		if (handler_type & ACPI_SYSTEM_NOTIFY) {
> > >  			obj_desc->common_notify.system_notify = notify_obj;
> > >  		}
> > > @@ -503,8 +507,9 @@ acpi_install_notify_handler(acpi_handle device,
> > >  			obj_desc->common_notify.device_notify = notify_obj;
> > >  		}
> > >
> > > -		if (handler_type == ACPI_ALL_NOTIFY) {
> > > +		acpi_ut_release_mutex(ACPI_MTX_NOTIFY_REF_COUNT);
> > >
> > > +		if (handler_type == ACPI_ALL_NOTIFY) {
> > >  			/* Extra ref if installed in both */
> > >
> > >  			acpi_ut_add_reference(notify_obj);
> > > diff --git a/drivers/acpi/acpica/utdelete.c b/drivers/acpi/acpica/utdelete.c
> > > index 31f5a78..7559813 100644
> > > --- a/drivers/acpi/acpica/utdelete.c
> > > +++ b/drivers/acpi/acpica/utdelete.c
> > > @@ -504,6 +504,7 @@ acpi_ut_update_object_reference(union acpi_operand_object *object, u16 action)
> > >
> > >  			/* Update the notify objects for these types (if present) */
> > >
> > > +			acpi_ut_acquire_mutex(ACPI_MTX_NOTIFY_REF_COUNT);
> 
> Can I acquire acpi_gbl_reference_count_lock here? Does is not cause recursive lock?
> Or direct acquire remove mutex here?

You can.
If you just want to hold the lock longer, you can introduce a lockless API to replace the one invoked below.

> 
> > >  			acpi_ut_update_ref_count(object->common_notify.
> > >  						 system_notify, action);
> >
> > If you take a look at acpi_ut_update_ref_count, there is already a spin_lock held around the code to update the reference count.
> >
> > Thanks
> > -Lv
> 
> hm.... I am not sure can using acpi_gbl_reference_count_lock.
> 
> When updating parent object's ref_count, it should atomic with its notify objects's ref_count updating.
> The situation I want to avoid is:
> 
>  a. acpi_ut_update_object_reference() update ref_count of system_notify/device_notify. (Thermal driver)
>     Assume the notify object didn't install to parent object yet. So, the ref_count doesn't increase.
> 
>  b. acpi_install_notify_handler() install system_notify/device_notify. (Processor driver)
>     Processor driver install notify handler of parent object in this window. The ref_count of notify object is 1.
> 
>  c. acpi_ut_update_object_reference()  call acpi_ut_update_ref_count() to update ref_count of parent object. (Thermal driver)
>     The ref_count of parent object update to 2.
>     		<=== this ref_count doesn't match with ref_count of notify object, it's 1.
> 
> Due to a. and c. steps should atomic, so I add ACPI_MTX_NOTIFY_REF_COUNT to bind them together, when
> the object type is processor, device, power or thermal.

I can see that the only problem is the sub-object's reference count updates.
It shouldn't be increased each time the parent object's reference count is increased, but should be increased only once when it is linked to the parent object (referenced by the parent).
That seems to be the root cause.
The current code might be a hackish result to work around some design defects.
So what you do in this patch might be dangerous, they can break ACPICA interpreter and you need to do it in the ACPICA source base.
There is a recursive unit test facility - ASLTS to help determine if your change doesn't introduce regressions.
I think you need to file a bug in ACPICA bugzilla and start a discussion there.

Thanks and best regards
-Lv

> 
> >
> > >  			acpi_ut_update_ref_count(object->common_notify.
> > > @@ -592,6 +593,11 @@ acpi_ut_update_object_reference(union acpi_operand_object *object, u16 action)
> > >  		 * main object to be deleted.
> > >  		 */
> > >  		acpi_ut_update_ref_count(object, action);
> > > +		if (object->common.type == ACPI_TYPE_PROCESSOR ||
> > > +		    object->common.type == ACPI_TYPE_DEVICE ||
> > > +		    object->common.type == ACPI_TYPE_POWER ||
> > > +		    object->common.type == ACPI_TYPE_THERMAL)
> > > +			acpi_ut_release_mutex(ACPI_MTX_NOTIFY_REF_COUNT);
> 
> here also. I think need remove this release?
> 
> > >  		object = NULL;
> > >
> > >  		/* Move on to the next object to be updated */
> > > --
> > > 1.8.4.5
> >
> 
> Please let me know if I miss understood.
> 
> 
> Thanks a lot!
> Joey Lee
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux