[PATCH] dmr/amdgpu: Fix wrongly unref of BO

felix.kuehling@xxxxxxx (Felix Kuehling) · Fri, 21 Apr 2017 11:43:22 -0400

On 17-04-21 03:11 AM, Christian KÃ¶nig wrote:
> Hi Alex,
>
>> No. For the current source code, I think the premap and no-op is not
>> working.
>>
> Indeed, we don't set bo->mem.bus.addr in amdgpu_ttm_io_mem_reserve()
> any more. Felix will probably want to fix that for the KFD branch.

I vaguely remember discussing this in the past: using mem->bus.addr to
keep memory permanently CPU-mapped and avoid redundant ioremap calls. As
I remember it, we weren't actually using this. It's only something we
considered at one point.

Regards,
  Felix

>
> Anyway, as I said not unmapping the page tables is harmless compared
> to not releasing the memory backing it.
>
> So please just do as I told you and change the interruptible
> reservation to a non-interruptible.
>
> Regards,
> Christian.
>
> Am 20.04.2017 um 23:56 schrieb Xie, AlexBin:
>>
>> Hi Christian,
>>
>>
>> I read amdgpu_ttm_io_mem_reserve() and amdgpu_ttm_io_mem_free() and
>> relevant codes from amdgpu_vram_scratch_init
>> and amdgpu_vram_scratch_fini. 
>>
>>
>> No. For the current source code, I think the premap and no-op is not
>> working.
>>
>>
>> I add printk to prove. You can see bo_kmap_type is an invalid
>> value. ioremap_wc is really called to map the IO range into vmalloc
>> space.
>>
>>
>> ...
>>
>> Apr 20 16:31:18 axie-System-Product-Name kernel: [  106.759623]
>> entering amdgpu_vram_scratch_init.
>> Apr 20 16:31:18 axie-System-Product-Name kernel: [  106.759631]
>> scratch ioremap_wc
>> Apr 20 16:31:18 axie-System-Product-Name kernel: [  106.759631]
>> bo_kmap_type = 129
>> Apr 20 16:31:18 axie-System-Product-Name kernel: [  106.759632] bus
>> address =           (null)
>> Apr 20 16:31:18 axie-System-Product-Name kernel: [  106.759632]
>> is_iomem = 1
>> Apr 20 16:31:18 axie-System-Product-Name kernel: [  106.759633]
>> leaving amdgpu_vram_scratch_init.
>> ...
>>
>> I don't have log on rmmod AMDGPU yet. There are quite some settings
>> to make that happen in my computer.
>> I think they are symemtric. Both mapping and unmapping are not no-op.
>>
>> Here is the printk patch for your reference.
>>
>> From 25f95239c23225008e4b59f30b9b5363fb321f94 Mon Sep 17 00:00:00 2001
>> From: Alex Xie <AlexBin.Xie at amd.com>
>> Date: Thu, 20 Apr 2017 17:48:49 -0400
>> Subject: [PATCH] A hack to trace premap and noop.
>>
>> Change-Id: I61fbbdbd82f62433e229b2e7e97be7a780ea8afa
>> Signed-off-by: Alex Xie <AlexBin.Xie at amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 22 ++++++++++++++++++++++
>>  drivers/gpu/drm/ttm/ttm_bo.c               |  1 +
>>  drivers/gpu/drm/ttm/ttm_bo_util.c          | 29
>> ++++++++++++++++++++++++++---
>>  include/drm/ttm/ttm_bo_api.h               |  1 +
>>  4 files changed, 50 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index fbb4afb..a537ea1 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -313,6 +313,7 @@ static void amdgpu_block_invalid_wreg(struct
>> amdgpu_device *adev,
>>  static int amdgpu_vram_scratch_init(struct amdgpu_device *adev)
>>  {
>>  int r;
>> +printk("entering amdgpu_vram_scratch_init.");
>>  
>>  if (adev->vram_scratch.robj == NULL) {
>>  r = amdgpu_bo_create(adev, AMDGPU_GPU_PAGE_SIZE,
>> @@ -340,16 +341,36 @@ static int amdgpu_vram_scratch_init(struct
>> amdgpu_device *adev)
>>  amdgpu_bo_unpin(adev->vram_scratch.robj);
>>  amdgpu_bo_unreserve(adev->vram_scratch.robj);
>>  
>> +/* Would like a printk to see if map / unmap is noop*/
>> +adev->vram_scratch.robj->tbo.mem.bus.printk = true;
>> +
>> +if (adev->vram_scratch.robj->kmap.bo_kmap_type == ttm_bo_map_premapped)
>> +printk("amdgpu_vram_scratch premapped!");
>> +
>> +printk("bo_kmap_type = %d", adev->vram_scratch.robj->kmap.bo_kmap_type);
>> +printk("bus address = %p", adev->vram_scratch.robj->tbo.mem.bus.addr);
>> +printk("is_iomem = %d", adev->vram_scratch.robj->tbo.mem.bus.is_iomem);
>> +printk("leaving amdgpu_vram_scratch_init.");
>> +
>>  return r;
>>  }
>>  
>>  static void amdgpu_vram_scratch_fini(struct amdgpu_device *adev)
>>  {
>>  int r;
>> +printk("entering amdgpu_vram_scratch_fini.");
>>  
>>  if (adev->vram_scratch.robj == NULL) {
>>  return;
>>  }
>> +
>> +if (adev->vram_scratch.robj->kmap.bo_kmap_type == ttm_bo_map_premapped)
>> +printk("amdgpu_vram_scratch premapped!");
>> +
>> +printk("bo_kmap_type = %d", adev->vram_scratch.robj->kmap.bo_kmap_type);
>> +printk("bus address = %p", adev->vram_scratch.robj->tbo.mem.bus.addr);
>> +printk("is_iomem = %d", adev->vram_scratch.robj->tbo.mem.bus.is_iomem);
>> +
>>  r = amdgpu_bo_reserve(adev->vram_scratch.robj, false);
>>  if (likely(r == 0)) {
>>  amdgpu_bo_kunmap(adev->vram_scratch.robj);
>> @@ -357,6 +378,7 @@ static void amdgpu_vram_scratch_fini(struct
>> amdgpu_device *adev)
>>  amdgpu_bo_unreserve(adev->vram_scratch.robj);
>>  }
>>  amdgpu_bo_unref(&adev->vram_scratch.robj);
>> +printk("leaving amdgpu_vram_scratch_fini.");
>>  }
>>  
>>  /**
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index 989b98b..9b05d54 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -1178,6 +1178,7 @@ int ttm_bo_init_reserved(struct ttm_bo_device
>> *bdev,
>>  bo->mem.page_alignment = page_alignment;
>>  bo->mem.bus.io_reserved_vm = false;
>>  bo->mem.bus.io_reserved_count = 0;
>> +bo->mem.bus.printk = false;
>>  bo->moving = NULL;
>>  bo->mem.placement = (TTM_PL_FLAG_SYSTEM | TTM_PL_FLAG_CACHED);
>>  bo->persistent_swap_storage = persistent_swap_storage;
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> index bf6e216..9d06952 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>> @@ -526,14 +526,24 @@ static int ttm_bo_ioremap(struct
>> ttm_buffer_object *bo,
>>  if (bo->mem.bus.addr) {
>>  map->bo_kmap_type = ttm_bo_map_premapped;
>>  map->virtual = (void *)(((u8 *)bo->mem.bus.addr) + offset);
>> +if (bo->mem.bus.printk)
>> +printk ("scratch premapping");
>> +
>>  } else {
>>  map->bo_kmap_type = ttm_bo_map_iomap;
>> -if (mem->placement & TTM_PL_FLAG_WC)
>> +if (mem->placement & TTM_PL_FLAG_WC) {
>>  map->virtual = ioremap_wc(bo->mem.bus.base + bo->mem.bus.offset +
>> offset,
>>   size);
>> -else
>> +if (bo->mem.bus.printk)
>> +printk ("scratch ioremap_wc");
>> +
>> +}
>> +else {
>>  map->virtual = ioremap_nocache(bo->mem.bus.base + bo->mem.bus.offset
>> + offset,
>>        size);
>> +if (bo->mem.bus.printk)
>> +printk ("scratch ioremap_nocache");
>> +}
>>  }
>>  return (!map->virtual) ? -ENOMEM : 0;
>>  }
>> @@ -618,21 +628,34 @@ void ttm_bo_kunmap(struct ttm_bo_kmap_obj *map)
>>  struct ttm_mem_type_manager *man =
>>  &bo->bdev->man[bo->mem.mem_type];
>>  
>> -if (!map->virtual)
>> +if (!map->virtual) {
>> +if (bo->mem.bus.printk)
>> +printk ("scratch unmap return earlier");
>>  return;
>> +}
>>  switch (map->bo_kmap_type) {
>>  case ttm_bo_map_iomap:
>> +if (bo->mem.bus.printk)
>> +printk ("scratch iounmap");
>>  iounmap(map->virtual);
>>  break;
>>  case ttm_bo_map_vmap:
>> +if (bo->mem.bus.printk)
>> +printk ("scratch vunmap");
>>  vunmap(map->virtual);
>>  break;
>>  case ttm_bo_map_kmap:
>> +if (bo->mem.bus.printk)
>> +printk ("scratch kunmap");
>>  kunmap(map->page);
>>  break;
>>  case ttm_bo_map_premapped:
>> +if (bo->mem.bus.printk)
>> +printk ("scratch unmap ttm_bo_map_premapped");
>>  break;
>>  default:
>> +if (bo->mem.bus.printk)
>> +printk ("scratch unmap bug");
>>  BUG();
>>  }
>>  (void) ttm_mem_io_lock(man, false);
>> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
>> index 2d0f63e..f74a554 100644
>> --- a/include/drm/ttm/ttm_bo_api.h
>> +++ b/include/drm/ttm/ttm_bo_api.h
>> @@ -70,6 +70,7 @@ struct ttm_bus_placement {
>>  boolis_iomem;
>>  boolio_reserved_vm;
>>  uint64_t        io_reserved_count;
>> +bool            printk;
>>  };
>>  
>>  
>> -- 
>> 2.7.4
>>
>>
>>
>>
>> Thanks,
>>
>> Alex Bin
>>
>> ------------------------------------------------------------------------
>> *From:* Christian KÃ¶nig <deathsimple at vodafone.de>
>> *Sent:* Thursday, April 20, 2017 4:43 AM
>> *To:* Xie, AlexBin; Zhou, David(ChunMing); amd-gfx at lists.freedesktop.org
>> *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO
>>  
>> Hi AlexBin,
>>
>>> Missing kunmap mapping in vmalloc will make kernel master page table
>>> incorrect.
>> That's what I tried to explain yesterday, but unfortunately didn't
>> had time to do so. There is not corruption of the kernel master page
>> table in this case!
>>
>> The call of ttm_bo_kunmap is completely optional, take a look at
>> amdgpu_ttm_io_mem_reserve() and amdgpu_ttm_io_mem_free().
>>
>> The aperture is kept mapped into the page tables for the whole time
>> the driver is loaded. So this is a complete no-op and only done for
>> consistency.
>>
>>> It is good that you agree that there is no real world bad example
>>> caused by my patch. I will not discuss whether it is an improvement
>>> or not now to save time for both of us.
>>>
>> Great at least we can now agree to completely drop this patch.
>>
>> Thanks,
>> Christian.
>>
>> Am 19.04.2017 um 21:30 schrieb Xie, AlexBin:
>>>
>>> Hi Christian,
>>>
>>>
>>> Missing kunmap mapping in vmalloc will make kernel master page table
>>> incorrect. I would not call such issue as completely harmless.
>>> Please note that AMD graphic driver can run in 32 bit system. In 32
>>> bit system, vmalloc address space is much smaller than size of most
>>> GPU VRAM.
>>>
>>>
>>> amdgpu_bo_free_kernel has same issue as amdgpu_vram_scratch_fini. 1.
>>> It calls amdgpu_bo_reserve interruptible too. 2. It misses kunmap
>>> when amdgpu_bo_reserve returns error too. As result, kernel master
>>> page table can become incorrect, or as you call it "completely
>>> harmless vmalloc space leaking". 
>>>
>>>
>>> Because amdgpu_bo_free_kernel is used in more places, such as psp
>>> command submission, there will be bigger chance to have other usage
>>> where signal is not blocked. This will become a real bug.
>>>
>>>
>>> I am thinking that we may fix the issue completely when TTM releases
>>> BO. But that is a bigger change.
>>>
>>>
>>> It is good that you agree that there is no real world bad example
>>> caused by my patch. I will not discuss whether it is an improvement
>>> or not now to save time for both of us.
>>>
>>>
>>> Thanks,
>>>
>>> Alex Bin Xie
>>>
>>>
>>> ------------------------------------------------------------------------
>>> *From:* Christian KÃ¶nig <deathsimple at vodafone.de>
>>> *Sent:* Wednesday, April 19, 2017 7:50 AM
>>> *To:* Xie, AlexBin; Zhou, David(ChunMing); amd-gfx at lists.freedesktop.org
>>> *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO
>>>  
>>>> Without correctly kunmap, page table is corrupted. Page entries
>>>> point to wrong memory locations. You might call it completely
>>>> harmless. But I think this is a severe bug. Leaking memory is
>>>> better than a corrupted page table. Think security.
>>> We are talking about the page tables for the vmalloc area in the
>>> kernel here, so no security problem. Leaking memory is much more
>>> problematic.
>>>
>>>> Would you provide any document and reference by saying" It is
>>>> impossible to receive a signal during module load/unload"? For
>>>> example, if the unload stuck in a lock, can CTRL+C stop the unload?
>>>>
>>> No, CTRL+C doesn't abort module load/unload. There have been patches
>>> to changes this a while ago, but IIRC it broke a whole bunch of
>>> driver relying on this.
>>>
>>>> What about there is some other return error? What about in future
>>>> somebody improve amdgpu_bo_reserve to return other errors,
>>>> then function amdgpu_vram_scratch_fini becomes buggy?
>>>>
>>> Yes, that is indeed an issue. For example -EDEADLK is possible as
>>> well. That's why I said we should use amdgpu_bo_free_kernel() instead.
>>>
>>>> While I am thinking whether there is a better way for the current
>>>> situation, would you give a real world example that my patch really
>>>> not working? Then we can address it.
>>>>
>>> I don't think there is because the driver can't receive a signal
>>> during load/unload, but the problem is rather that the patch doesn't
>>> improve the situation at all.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 19.04.2017 um 13:37 schrieb Xie, AlexBin:
>>>>
>>>> Hi Christian,
>>>>
>>>>
>>>> Without correctly kunmap, page table is corrupted. Page entries
>>>> point to wrong memory locations. You might call it completely
>>>> harmless. But I think this is a severe bug. Leaking memory is
>>>> better than a corrupted page table. Think security.
>>>>
>>>>
>>>> Would you provide any document and reference by saying" It is
>>>> impossible to receive a signal during module load/unload"? For
>>>> example, if the unload stuck in a lock, can CTRL+C stop the unload?
>>>>
>>>>
>>>> If "It is impossible to receive a signal during module
>>>> load/unload", interruptible waiting is fine too, because
>>>> function amdgpu_bo_reserve will return successfully.
>>>>
>>>>
>>>> What about there is some other return error? What about in future
>>>> somebody improve amdgpu_bo_reserve to return other errors,
>>>> then function amdgpu_vram_scratch_fini becomes buggy?
>>>>
>>>>
>>>> While I am thinking whether there is a better way for the current
>>>> situation, would you give a real world example that my patch really
>>>> not working? Then we can address it.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Alex Bin
>>>>
>>>>
>>>> ------------------------------------------------------------------------
>>>> *From:* Christian KÃ¶nig <deathsimple at vodafone.de>
>>>> *Sent:* Wednesday, April 19, 2017 2:35 AM
>>>> *To:* Xie, AlexBin; Zhou, David(ChunMing);
>>>> amd-gfx at lists.freedesktop.org
>>>> *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO
>>>>  
>>>> Hi AlexBin,
>>>>
>>>> the answer is ttm_bo_kunmap isn't called at all and yes in the case
>>>> of an iomap we leak the address space reserved.
>>>>
>>>> But that is completely harmless on a 64bit system compared to
>>>> leaking the memory backing the address space.
>>>>
>>>> Using amdgpu_bo_free_kernel() instead of openly coding it here is
>>>> probably a good idea.
>>>>
>>>> Additional to that it's probably a good idea to set the no_intr
>>>> flag when reserving kernel BOs. It is impossible to receive a
>>>> signal during module load/unload, but it's probably better to
>>>> document that in the code as well.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 18.04.2017 um 20:54 schrieb Xie, AlexBin:
>>>>> Hi Christian,
>>>>>
>>>>> Have you found how/where/when? When you said "mapping will just be
>>>>> released a bit later on", you must know the answer. 
>>>>>
>>>>> It is difficult to prove something does not exist. Anyway, I will
>>>>> give it a try to prove such "later on" does not exist.
>>>>>
>>>>> Function ttm_bo_kunmap is the only function to unmap. To prove
>>>>> this, search ttm_bo_map_iomap, only ttm_bo_kunmap use this enum to
>>>>> correctly kunmap.
>>>>>
>>>>> Function ttm_bo_kunmap is not called by ttm itself. This is a hint
>>>>> that all TTM delay delete mechanism or unref mechanism will NOT
>>>>> kunmap BO later on.
>>>>>
>>>>> Function ttm_bo_kunmap is called by AMDGPU
>>>>> function amdgpu_bo_kunmap and amdgpu_gem_prime_vunmap.
>>>>>
>>>>> Search AMDGPU for amdgpu_bo_kunmap. All matches do not kunmap for
>>>>> scratch VRAM BO.  amdgpu_bo_free_kernel is a suspect but the
>>>>> answer is still NO.
>>>>>
>>>>> So all possibilities are searched. Did I miss anything?
>>>>>
>>>>> Thanks,
>>>>> Alex Bin Xie
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>> *From:* Xie, AlexBin
>>>>> *Sent:* Tuesday, April 18, 2017 2:04:33 PM
>>>>> *To:* Christian KÃ¶nig; Zhou, David(ChunMing);
>>>>> amd-gfx at lists.freedesktop.org
>>>>> *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO
>>>>>  
>>>>>
>>>>> Hi Christian,
>>>>>
>>>>>
>>>>> Would you point out where/when will kunmap happen for this BO when
>>>>> release? It must be somewhere in some function calls.
>>>>>
>>>>>
>>>>> I checked before I asked for review. But I did not see such
>>>>> obvious kunmap function call.
>>>>>
>>>>>
>>>>> If so, there should be a comment in
>>>>> function amdgpu_vram_scratch_fini to avoid future confusion.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Alex Bin Xie
>>>>> ------------------------------------------------------------------------
>>>>> *From:* Christian KÃ¶nig <deathsimple at vodafone.de>
>>>>> *Sent:* Tuesday, April 18, 2017 1:46 PM
>>>>> *To:* Xie, AlexBin; Zhou, David(ChunMing);
>>>>> amd-gfx at lists.freedesktop.org
>>>>> *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO
>>>>>  
>>>>> Hi AlexBin,
>>>>>
>>>>> No, David is right. This is a very common coding pattern in the
>>>>> kernel module.
>>>>>
>>>>> Freeing up a BO while there still exists a kernel mapping is
>>>>> perfectly ok, the mapping will just be released a bit later on.
>>>>>
>>>>> So this code is actually perfectly ok and just an optimization,
>>>>> but your patch breaks it and creates a memory leak.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> Am 18.04.2017 um 17:17 schrieb Xie, AlexBin:
>>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>>
>>>>>> When amdgpu_bo_reserve return errors, we cannot release the BO.
>>>>>> This is not a memory leak. General speaking, memory leak
>>>>>> is unnoticed and unintentional.
>>>>>>
>>>>>>
>>>>>> The caller of function amdgpu_vram_scratch_fini ignores the
>>>>>> return error value...
>>>>>>
>>>>>>
>>>>>> The "memory leak" is not caused by my patch. It is caused because
>>>>>> reserving BO fails.
>>>>>>
>>>>>>
>>>>>> This patch only aim to make function amdgpu_vram_scratch_fini
>>>>>> behave correctly.
>>>>>>
>>>>>>
>>>>>> To follow up, we can add a warning message when amdgpu_bo_reserve
>>>>>> error happens in a different patch.
>>>>>>
>>>>>>
>>>>>> If function call amdgpu_bo_reserve is changed to uninterruptible,
>>>>>> this changes driver behaviour. Without a substantial issue, I
>>>>>> would be cautious for such a change.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Alex Bin Xie
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>> *From:* Zhou, David(ChunMing)
>>>>>> *Sent:* Monday, April 17, 2017 10:38 PM
>>>>>> *To:* Xie, AlexBin; amd-gfx at lists.freedesktop.org
>>>>>> *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO
>>>>>>  
>>>>>>
>>>>>>
>>>>>> On 2017å¹´04æ??17æ?¥ 22:54, Xie, AlexBin wrote:
>>>>>>>
>>>>>>> Hi David,
>>>>>>>
>>>>>>>
>>>>>>> Thanks for the comments. However, please have look at
>>>>>>> amdgpu_bo_reserve definition.
>>>>>>>
>>>>>>> static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, bool
>>>>>>> no_intr)
>>>>>>>
>>>>>> Ah, this is a wired wrapper for ttm_bo_reserve.
>>>>>>
>>>>>>>
>>>>>>> When we call this function like the following:
>>>>>>>
>>>>>>>          r = amdgpu_bo_reserve(adev->vram_scratch.robj, false);
>>>>>>> The false means interruptible.
>>>>>>>
>>>>>>>
>>>>>>> On the other hand,  when amdgpu_bo_reserve function return
>>>>>>> error, why do we unref BO without kunmap and unpin the BO? This
>>>>>>> is wrong implementation when amdgpu_bo_reserve return any error.
>>>>>>>
>>>>>> Yeah, I see your mean, it's in driver un-loading, How about
>>>>>> changing to no interruptible? Your patch will make a memleak if
>>>>>> bo_reserve fails, but it seems not matter. I have no strong
>>>>>> preference.
>>>>>>
>>>>>> Regards,
>>>>>> David Zhou
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Alex Bin Xie
>>>>>>>
>>>>>>> ------------------------------------------------------------------------
>>>>>>> *From:* Zhou, David(ChunMing)
>>>>>>> *Sent:* Friday, April 14, 2017 12:00 AM
>>>>>>> *To:* Xie, AlexBin; amd-gfx at lists.freedesktop.org
>>>>>>> *Subject:* Re: [PATCH] dmr/amdgpu: Fix wrongly unref of BO
>>>>>>>  
>>>>>>>
>>>>>>>
>>>>>>> On 2017å¹´04æ??14æ?¥ 05:34, Alex Xie wrote:
>>>>>>> > According to comment of amdgpu_bo_reserve, amdgpu_bo_reserve
>>>>>>> > can return with -ERESTARTSYS. When this function was interrupted
>>>>>>> > by a signal, BO should not be unref. Otherwise the BO might be
>>>>>>> > released while is kmapped and pinned, or BO MIGHT be deref
>>>>>>> > multiple times, etc.
>>>>>>>          r = amdgpu_bo_reserve(adev->vram_scratch.robj, false);
>>>>>>> we have specified interruptible to false, so -ERESTARTSYS isn't
>>>>>>> possible
>>>>>>> here.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David Zhou
>>>>>>> >
>>>>>>> > Change-Id: If76071a768950a0d3ad9d5da7fcae04881807621
>>>>>>> > Signed-off-by: Alex Xie <AlexBin.Xie at amd.com>
>>>>>>> > ---
>>>>>>> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>>>>>>> >   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>> >
>>>>>>> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> > index 53996e3..1dcc2d1 100644
>>>>>>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> > @@ -355,8 +355,8 @@ static void
>>>>>>> amdgpu_vram_scratch_fini(struct amdgpu_device *adev)
>>>>>>> >                amdgpu_bo_kunmap(adev->vram_scratch.robj);
>>>>>>> >                amdgpu_bo_unpin(adev->vram_scratch.robj);
>>>>>>> >                amdgpu_bo_unreserve(adev->vram_scratch.robj);
>>>>>>> > +             amdgpu_bo_unref(&adev->vram_scratch.robj);
>>>>>>> >        }
>>>>>>> > -     amdgpu_bo_unref(&adev->vram_scratch.robj);
>>>>>>> >   }
>>>>>>> >  
>>>>>>> >   /**
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> amd-gfx mailing list
>>>>>> amd-gfx at lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx at lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>
>>>
>>
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx