Re: [PATCH] ACPICA: avoid double free when object already has a zero reference count

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tuesday, November 3, 2020 1:52:28 P.M. EST Mark Asselstine wrote:
> On Tuesday, November 3, 2020 1:28:14 P.M. EST Kaneda, Erik wrote:
> > > -----Original Message-----
> > > From: Mark Asselstine <mark.asselstine@xxxxxxxxxxxxx>
> > > Sent: Monday, November 2, 2020 10:51 AM
> > > To: Rafael J. Wysocki <rafael@xxxxxxxxxx>; Kaneda, Erik
> > > <erik.kaneda@xxxxxxxxx>
> > > Cc: ACPI Devel Maling List <linux-acpi@xxxxxxxxxxxxxxx>; Moore, Robert
> > > <robert.moore@xxxxxxxxx>
> > > Subject: Re: [PATCH] ACPICA: avoid double free when object already has a
> > > zero reference count
> > > 
> > > On Monday, November 2, 2020 1:11:22 P.M. EST Kaneda, Erik wrote:
> > > > > -----Original Message-----
> > > > > From: Rafael J. Wysocki <rafael@xxxxxxxxxx>
> > > > > Sent: Thursday, October 29, 2020 7:06 AM
> > > > > To: Mark Asselstine <mark.asselstine@xxxxxxxxxxxxx>
> > > > > Cc: ACPI Devel Maling List <linux-acpi@xxxxxxxxxxxxxxx>; Kaneda,
> > > > > Erik
> > > > > <erik.kaneda@xxxxxxxxx>; Moore, Robert <robert.moore@xxxxxxxxx>
> > > > > Subject: Re: [PATCH] ACPICA: avoid double free when object already
> > > > > has
> > > 
> > > a
> > > 
> > > > > zero reference count
> > > > > 
> > > > > 
> > > > > 
> > > > > +Erik and Bob
> > > > > 
> > > > > 
> > > > > 
> > > > > On Thu, Oct 29, 2020 at 3:05 AM Mark Asselstine
> > > > > 
> > > > > <mark.asselstine@xxxxxxxxxxxxx> wrote:
> > > > > > The first trip into acpi_ut_update_ref_count() for an object where
> > > > > > 'object->common.reference_count' is 1 and we are performing a
> > > > > > REF_DECREMENT will result in 'new_count' being 0 and thus the
> > > > > > object
> > > > > > is deleted via acpi_ut_delete_internal_obj().
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > If for some reason we make a subsequent trip into
> > > > > > acpi_ut_update_ref_count() with the same object,
> > > > > > object->common.reference_count' will be 0 and performing a
> > > > > > REF_DECREMENT will produce a warning msg "Reference Count is
> > > 
> > > already
> > > 
> > > > > > zero, cannot decrement", 'new_count' will again be 0 and the
> > > > > > already
> > > > > > deleted object will be attempted to be deleted again via
> > > > > > acpi_ut_delete_internal_obj().
> > > > 
> > > > Mark, Do you have an example of AML/ASL that you used to determine
> > > 
> > > this
> > > 
> > > > double free?
> > > 
> > > Unfortunately no. It is a rare occurance and a consequence of several
> > > actions taking place at the same time during boot, including a PCI
> > > rescan. Unfortunately due to circumstances I am sure you would rather
> > > not
> > > have to be
> > > concerned about, we have so far had to focus our efforts on an older
> > > kernel
> > > which also has the preempt-rt patchset applied. It is for this reason I
> > > also didn't include the dmesg and eventual kernel BUG_ON in my
> > > submission.
> > > 
> > > It is unclear at this time if the additional locking or other changes
> > > that
> > > have been merged since the kernel version we are on would prevent this
> > > from
> > > occuring on an up to date kernel.
> > > 
> > > I have reviewed the code in the latest linux kernel and as far as I can
> > > tell the deficiency is still present. If you do go into
> > > acpi_ut_update_ref_count() with an object with a reference count of 0
> > > and
> > > an action of REF_DECREMENT the
> > > following may be called:
> > > 
> > > acpi_ut_update_ref_count
> > > 
> > >   -> acpi_ut_delete_internal_obj
> > >   
> > >     -> acpi_ut_delete_object_desc
> > >     
> > >       -> acpi_os_release_object
> > >       
> > >         -> kmem_cache_free
> > > 
> > > I completely understand if you have concerns about the change since I
> > > can't
> > > hand you a reproducer. I was hoping the merrits of the change would
> > > stand
> > > on
> > > their own as there is no reason to call acpi_ut_delete_internal_obj() if
> > > we
> > > have already done so, even if the rest of the call chain was well
> > > behaved.
> > > In our case the eventual slab corruption ended up affecting the
> > > 'Acpi-Operand' dedicated cache. We see the corruption happen shortly
> > > after we see the msg warning that the reference count is already at
> > > zero.
> > > 
> > > Please let me know if there is anything else I can provide to help out
> > > and
> > > thanks for your time reviewing this change.
> > 
> > Since this is such a sensitive part of the codebase, I would like to see
> > if
> > it's possible to come up with ASL that reproduces this on our userspace
> > interpreter (acpiexec).
> > 
> > I can try reproduce this from your explanations but it would be helpful to
> > also get an acpidump and dmesg so that we can understand which named
> > objects are causing this issue. This gives us a clue on how to reproduce
> > this in user space.
> > 
> > Could you provide an acpidump of the machine and 2 dmesg logs? I need one
> > normal dmesg that shows this error and a custom kernel built with
> > CONFIG_ACPI_DEBUG=y and booting with the following commandline parameters:
> > 
> > acpi.debug_level=0x80000 acpi.debug_layer=0xffffffff
> 
> This should be possible. Give me a day or two to put something together.
> Again my intention, as when dealing with any maintainer, is to be sensitive
> of your time and given the kernel version skew and such I am not sure how
> well it will map to the latest kernel code. But if you can pull out some
> clues to help with this mapping I am not going to impede your offer of
> assistance.

I have supplied what I could directly to the Intel folks. Unfortunately, as I 
had stated previously our reproduction case uses an older kernel so can't 
easily map to the current state of code in mainline.

I go back to my original assertion that the same condition which is being 
caught by the warning should be used to prevent a second attempt at deleting 
an already deleted object. Attempting a second deletion has been observed to 
be troublesome as the underlying memory could now be in used by a valid object 
and slab corruption is highly likely.

I have also supplied a github pull request to the acpica repo as this issue 
could also manifest itself negatively in other OSes.

https://github.com/acpica/acpica/pull/652

Mark

> 
> Thanks,
> MarkA
> 
> > Thanks,
> > Erik
> > 
> > > Thanks,
> > > Mark
> > > 
> > > > Thanks,
> > > > Erik
> > > > 
> > > > > > Since the object deletion doesn't NULL the object the calls to
> > > > > > acpi_ut_delete_internal_obj(), acpi_ut_delete_object_desc(),
> > > > > > acpi_os_release_object(), kmem_cache_free() will operate on the
> > > 
> > > object
> > > 
> > > > > > as if it hasn't been deleted. In many cases this can result in no
> > > > > > issues, but if you are using the slab and a new object has been
> > > > > > created with the same address this can be the cause slab
> > > > > > corruption.
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > Adding a check if we are decrementing to 0 for the first time and
> > > > > > only
> > > > > > calling acpi_ut_delete_internal_obj() in this case will prevent
> > > > > > another attempt at deleting the object.
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > Signed-off-by: Mark Asselstine <mark.asselstine@xxxxxxxxxxxxx>
> > > > > > ---
> > > > > > 
> > > > > >  drivers/acpi/acpica/utdelete.c | 4 ++--
> > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/acpi/acpica/utdelete.c
> > > > > > b/drivers/acpi/acpica/utdelete.c
> > >  
> > >  index 4c0d4e434196..c6b860fd9eb5
> > >  
> > > > > > 100644
> > > > > > --- a/drivers/acpi/acpica/utdelete.c
> > > > > > +++ b/drivers/acpi/acpica/utdelete.c
> > > > > > @@ -421,9 +421,9 @@ acpi_ut_update_ref_count(union
> > > > > 
> > > > > acpi_operand_object *object, u32 action)
> > > > > 
> > > > > >                                       ACPI_GET_FUNCTION_NAME,
> > > > > >                                       object,
> > > > > >                                       object->common.type,
> > > > > >                                       new_count));
> > > > > > 
> > > > > > -               /* Actually delete the object on a reference count
> > > > > > of
> > > > > > zero */
> > >  
> > >  +               /* If we haven't already, actually delete the
> > >  
> > > > > > object on a reference>
> > > > > 
> > > > > count of zero */
> > > > > 
> > > > > > -               if (new_count == 0) {
> > > > > > +               if (new_count == 0 && original_count != 0) {
> > > > > > 
> > > > > >                         acpi_ut_delete_internal_obj(object);
> > > > > >                 
> > > > > >                 }
> > > > > >                 message = "Decrement";
> > > > > > 
> > > > > > --
> > > > > > 2.17.1







[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux