RE: [PATCH] ACPICA: avoid double free when object already has a zero reference count

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Mark Asselstine <mark.asselstine@xxxxxxxxxxxxx>
> Sent: Monday, November 2, 2020 10:51 AM
> To: Rafael J. Wysocki <rafael@xxxxxxxxxx>; Kaneda, Erik
> <erik.kaneda@xxxxxxxxx>
> Cc: ACPI Devel Maling List <linux-acpi@xxxxxxxxxxxxxxx>; Moore, Robert
> <robert.moore@xxxxxxxxx>
> Subject: Re: [PATCH] ACPICA: avoid double free when object already has a
> zero reference count
> 
> On Monday, November 2, 2020 1:11:22 P.M. EST Kaneda, Erik wrote:
> >
> >
> > > -----Original Message-----
> > > From: Rafael J. Wysocki <rafael@xxxxxxxxxx>
> > > Sent: Thursday, October 29, 2020 7:06 AM
> > > To: Mark Asselstine <mark.asselstine@xxxxxxxxxxxxx>
> > > Cc: ACPI Devel Maling List <linux-acpi@xxxxxxxxxxxxxxx>; Kaneda, Erik
> > > <erik.kaneda@xxxxxxxxx>; Moore, Robert <robert.moore@xxxxxxxxx>
> > > Subject: Re: [PATCH] ACPICA: avoid double free when object already has
> a
> > > zero reference count
> > >
> > >
> > >
> > > +Erik and Bob
> > >
> > >
> > >
> > > On Thu, Oct 29, 2020 at 3:05 AM Mark Asselstine
> > > <mark.asselstine@xxxxxxxxxxxxx> wrote:
> > >
> > > >
> > > >
> > > > The first trip into acpi_ut_update_ref_count() for an object where
> > > > 'object->common.reference_count' is 1 and we are performing a
> > > > REF_DECREMENT will result in 'new_count' being 0 and thus the object
> > > > is deleted via acpi_ut_delete_internal_obj().
> > > >
> > > >
> > > >
> > > > If for some reason we make a subsequent trip into
> > > > acpi_ut_update_ref_count() with the same object,
> > > > object->common.reference_count' will be 0 and performing a
> > > > REF_DECREMENT will produce a warning msg "Reference Count is
> already
> > > > zero, cannot decrement", 'new_count' will again be 0 and the already
> > > > deleted object will be attempted to be deleted again via
> > > > acpi_ut_delete_internal_obj().
> >
> >
> > Mark, Do you have an example of AML/ASL that you used to determine
> this
> > double free?
> 
> Unfortunately no. It is a rare occurance and a consequence of several actions
> taking place at the same time during boot, including a PCI rescan.
> Unfortunately due to circumstances I am sure you would rather not have to
> be
> concerned about, we have so far had to focus our efforts on an older kernel
> which also has the preempt-rt patchset applied. It is for this reason I also
> didn't include the dmesg and eventual kernel BUG_ON in my submission.
> 
> It is unclear at this time if the additional locking or other changes that
> have been merged since the kernel version we are on would prevent this
> from
> occuring on an up to date kernel.
> 
> I have reviewed the code in the latest linux kernel and as far as I can tell
> the deficiency is still present. If you do go into acpi_ut_update_ref_count()
> with an object with a reference count of 0 and an action of REF_DECREMENT
> the
> following may be called:
> 
> acpi_ut_update_ref_count
>   -> acpi_ut_delete_internal_obj
>     -> acpi_ut_delete_object_desc
>       -> acpi_os_release_object
>         -> kmem_cache_free
> 
> I completely understand if you have concerns about the change since I can't
> hand you a reproducer. I was hoping the merrits of the change would stand
> on
> their own as there is no reason to call acpi_ut_delete_internal_obj() if we
> have already done so, even if the rest of the call chain was well behaved. In
> our case the eventual slab corruption ended up affecting the 'Acpi-Operand'
> dedicated cache. We see the corruption happen shortly after we see the msg
> warning that the reference count is already at zero.
> 
> Please let me know if there is anything else I can provide to help out and
> thanks for your time reviewing this change.

Since this is such a sensitive part of the codebase, I would like to see if it's possible to come up with ASL that reproduces this on our userspace interpreter (acpiexec).

I can try reproduce this from your explanations but it would be helpful to also get an acpidump and dmesg so that we can understand which named objects are causing this issue. This gives us a clue on how to reproduce this in user space.

Could you provide an acpidump of the machine and 2 dmesg logs? I need one normal dmesg that shows this error and a custom kernel built with CONFIG_ACPI_DEBUG=y and booting with the following commandline parameters:

acpi.debug_level=0x80000 acpi.debug_layer=0xffffffff

Thanks,
Erik
> 
> Thanks,
> Mark
> 
> 
> > Thanks,
> > Erik
> >
> >
> > > >
> > > >
> > > > Since the object deletion doesn't NULL the object the calls to
> > > > acpi_ut_delete_internal_obj(), acpi_ut_delete_object_desc(),
> > > > acpi_os_release_object(), kmem_cache_free() will operate on the
> object
> > > > as if it hasn't been deleted. In many cases this can result in no
> > > > issues, but if you are using the slab and a new object has been
> > > > created with the same address this can be the cause slab corruption.
> > > >
> > > >
> > > >
> > > > Adding a check if we are decrementing to 0 for the first time and only
> > > > calling acpi_ut_delete_internal_obj() in this case will prevent
> > > > another attempt at deleting the object.
> > > >
> > > >
> > > >
> > > > Signed-off-by: Mark Asselstine <mark.asselstine@xxxxxxxxxxxxx>
> > > > ---
> > > >
> > > >  drivers/acpi/acpica/utdelete.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > >
> > > >
> > > > diff --git a/drivers/acpi/acpica/utdelete.c
> > > > b/drivers/acpi/acpica/utdelete.c
>  index 4c0d4e434196..c6b860fd9eb5
> > > > 100644
> > > > --- a/drivers/acpi/acpica/utdelete.c
> > > > +++ b/drivers/acpi/acpica/utdelete.c
> > > > @@ -421,9 +421,9 @@ acpi_ut_update_ref_count(union
> > >
> > > acpi_operand_object *object, u32 action)
> > >
> > > >                                       ACPI_GET_FUNCTION_NAME, object,
> > > >                                       object->common.type, new_count));
> > > >
> > > >
> > > >
> > > > -               /* Actually delete the object on a reference count of
> > > > zero */
>  +               /* If we haven't already, actually delete the
> > > > object on a reference>
> > > count of zero */
> > >
> > > >
> > > >
> > > > -               if (new_count == 0) {
> > > > +               if (new_count == 0 && original_count != 0) {
> > > >
> > > >                         acpi_ut_delete_internal_obj(object);
> > > >
> > > >                 }
> > > >                 message = "Decrement";
> > > >
> > > > --
> > > > 2.17.1
> > > >
> > > >
> 
> 
> 





[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux