I would go a step further and put the BUG_ON(in_interrupt()) into amdgpu_virt_kiq_rreg(). And BTW the problem is even worse, you also can't lock a mutex inside an atomic section and we could have plenty of that as well in the driver code. Regards, Christian. Am 12.01.2017 um 04:21 schrieb Liu, Monk: > > Xiangliang > > please BUG() when register access occured in RUNTIME and IRQ context, > e.g.: > > for register read: > if (amdgpu_sriov_runtime(adev)) { > if (in_interrupt()) > BUG(); > else > return amdgpu_virt_kiq_rreg(adev, reg, v); > } > > and also for register write, with above addressed, Reviewed-by: Monk > Liu <monk.liu at amd.com> > > ------------------------------------------------------------------------ > *å??件人:* Liu, Monk > *å??é??æ?¶é?´:* 2017å¹´1æ??12æ?¥ 11:19:40 > *æ?¶ä»¶äºº:* Christian König; Yu, Xiangliang; amd-gfx at lists.freedesktop.org > *主é¢?:* ç?å¤?: [V3 04/11] drm/amdgpu/virt: use kiq to access registers > > Xiangliang > > > please BUG() when register access occured in RUNTIME and IRQ context, > e.g.: > > > if (amdgpu_sriov_runtime(adev)) { > > > } > return amdgpu_virt_kiq_wreg(adev, reg, v); > > > with above addressed, Reviewed-by: Monk Liu <monk.liu at amd.com> > > ------------------------------------------------------------------------ > *å??件人:* amd-gfx <amd-gfx-bounces at lists.freedesktop.org> 代表 Liu, Monk > <Monk.Liu at amd.com> > *å??é??æ?¶é?´:* 2017å¹´1æ??11æ?¥ 22:54:52 > *æ?¶ä»¶äºº:* Christian König; Yu, Xiangliang; amd-gfx at lists.freedesktop.org > *主é¢?:* RE: [V3 04/11] drm/amdgpu/virt: use kiq to access registers > We should BUG and not access register at all if found during RUNTIME > && IRQ_context > By far. > > BR Monk > -----Original Message----- > From: Christian König [mailto:deathsimple at vodafone.de] > Sent: Wednesday, January 11, 2017 10:44 PM > To: Liu, Monk <Monk.Liu at amd.com>; Yu, Xiangliang > <Xiangliang.Yu at amd.com>; amd-gfx at lists.freedesktop.org > Subject: Re: [V3 04/11] drm/amdgpu/virt: use kiq to access registers > > > do you think that's okay ? > Fine with me, but I'm not sure if the hypervisor doesn't gets a hickup > when a client writes to some registers from interrupt context. > > Christian. > > Am 11.01.2017 um 15:38 schrieb Liu, Monk: > > Yeah, make sense > > > > I think before we have a full version of KIQ reg accessing without > context switch, we should totally disable KIQ reg read/write in IRQ > context, do you think that's okay ? > > > > BR Monk > > > > -----Original Message----- > > From: Christian König [mailto:deathsimple at vodafone.de] > > Sent: Wednesday, January 11, 2017 10:16 PM > > To: Liu, Monk <Monk.Liu at amd.com>; Yu, Xiangliang > > <Xiangliang.Yu at amd.com>; amd-gfx at lists.freedesktop.org > > Subject: Re: [V3 04/11] drm/amdgpu/virt: use kiq to access registers > > > >> Because if we are in interrupt , we are forbid to do schedule, and > use kiq to read register will invoke fence_wait() ... > > That won't work. Locking a mutext like it is done in the write path > can cause scheduling as well. > > > > If we need to push writes through the KIQ in interrupt context we > need to change the code to use a spinlock with disabled interrupts > instead of a mutex. > > > > Otherwise the whole thing can easily deadlock when an interrupt > happens to come while the lock is taken. > > > > Please also test the patchset with lockdep enabled, that should > yield a bunch of error messages when you try to do something like this. > > > > Regards, > > Christian. > > > > Am 11.01.2017 um 14:51 schrieb Liu, Monk: > >> Because if we are in interrupt , we are forbid to do schedule, and > use kiq to read register will invoke fence_wait() ... > >> > >> If you think that's odds, we can use BUG_ON() to stop driver > running if we need read registers in SRIOV case. > >> > >> And to fully support register reading, we need to implement a new > >> method to use KIQ access register without any schedule() chance, but > >> that's overhead for current stage, and we don't observe registers > >> reading in IRQ context now on 3D apps. (kfd observed such case, and > >> @Shaoyun dispatches a kernel thread/tasklet in IRQ to do the register > >> dumping, instead of directly reading them via KIQ ) > >> > >> BR Monk > >> > >> -----Original Message----- > >> From: Christian König [mailto:deathsimple at vodafone.de] > >> Sent: Wednesday, January 11, 2017 9:38 PM > >> To: Yu, Xiangliang <Xiangliang.Yu at amd.com>; > >> amd-gfx at lists.freedesktop.org > >> Cc: Liu, Monk <Monk.Liu at amd.com> > >> Subject: Re: [V3 04/11] drm/amdgpu/virt: use kiq to access registers > >> > >> Am 11.01.2017 um 14:18 schrieb Xiangliang Yu: > >>> For virtualization, it is must for driver to use KIQ to access > >>> registers when it is out of GPU full access mode. > >>> > >>> Signed-off-by: Xiangliang Yu <Xiangliang.Yu at amd.com> > >>> Signed-off-by: Monk Liu <Monk.Liu at amd.com> > >>> --- > >>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- > >>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++ > >>> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 82 > ++++++++++++++++++++++++++++++ > >>> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 5 ++ > >>> drivers/gpu/drm/amd/amdgpu/vi.c | 3 ++ > >>> 5 files changed, 97 insertions(+), 1 deletion(-) > >>> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c > >>> > >>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile > >>> b/drivers/gpu/drm/amd/amdgpu/Makefile > >>> index 4185b03..0b8e470 100644 > >>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile > >>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile > >>> @@ -30,7 +30,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ > >>> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ > >>> amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \ > >>> amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ > >>> - amdgpu_gtt_mgr.o amdgpu_vram_mgr.o > >>> + amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o > >>> > >>> # add asic specific block > >>> amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o > >>> kv_dpm.o \ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >>> index f82919d..9a2fd3e 100644 > >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >>> @@ -95,6 +95,9 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device > *adev, uint32_t reg, > >>> { > >>> uint32_t ret; > >>> > >>> + if (amdgpu_sriov_runtime(adev) && !in_interrupt()) > >>> + return amdgpu_virt_kiq_rreg(adev, reg); > >>> + > >> Mhm, why is the in_interrupt() necessary here? > >> > >>> if ((reg * 4) < adev->rmmio_size && !always_indirect) > >>> ret = readl(((void __iomem *)adev->rmmio) + (reg * 4)); > >>> else { > >>> @@ -114,6 +117,9 @@ void amdgpu_mm_wreg(struct amdgpu_device > *adev, uint32_t reg, uint32_t v, > >>> { > >>> trace_amdgpu_mm_wreg(adev->pdev->device, reg, v); > >>> > >>> + if (amdgpu_sriov_runtime(adev)) > >>> + return amdgpu_virt_kiq_wreg(adev, reg, v); > >>> + > >> And why it isn't necessary here? What happens when we write a > register from an interrupt routine? > >> > >> Christian. > >> > >>> if ((reg * 4) < adev->rmmio_size && !always_indirect) > >>> writel(v, ((void __iomem *)adev->rmmio) + (reg * 4)); > >>> else { > >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c > >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c > >>> new file mode 100644 > >>> index 0000000..7861b6b > >>> --- /dev/null > >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c > >>> @@ -0,0 +1,82 @@ > >>> +/* > >>> + * Copyright 2017 Advanced Micro Devices, Inc. > >>> + * > >>> + * Permission is hereby granted, free of charge, to any person > >>> +obtaining a > >>> + * copy of this software and associated documentation files (the > >>> +"Software"), > >>> + * to deal in the Software without restriction, including without > >>> +limitation > >>> + * the rights to use, copy, modify, merge, publish, distribute, > >>> +sublicense, > >>> + * and/or sell copies of the Software, and to permit persons to > >>> +whom the > >>> + * Software is furnished to do so, subject to the following > conditions: > >>> + * > >>> + * The above copyright notice and this permission notice shall be > >>> +included in > >>> + * all copies or substantial portions of the Software. > >>> + * > >>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > >>> +EXPRESS OR > >>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > >>> +MERCHANTABILITY, > >>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO > >>> +EVENT SHALL > >>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, > >>> +DAMAGES OR > >>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR > >>> +OTHERWISE, > >>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE > >>> +USE OR > >>> + * OTHER DEALINGS IN THE SOFTWARE. > >>> + */ > >>> + > >>> +#include "amdgpu.h" > >>> +#include "amdgpu_virt.h" > >>> + > >>> +void amdgpu_virt_init_setting(struct amdgpu_device *adev) { > >>> + mutex_init(&adev->virt.lock); > >>> +} > >>> + > >>> +uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t > >>> +reg) { > >>> + signed long r; > >>> + uint32_t val; > >>> + struct fence *f; > >>> + struct amdgpu_kiq *kiq = &adev->gfx.kiq; > >>> + struct amdgpu_ring *ring = &kiq->ring; > >>> + > >>> + BUG_ON(!ring->funcs->emit_rreg); > >>> + > >>> + mutex_lock(&adev->virt.lock); > >>> + amdgpu_ring_alloc(ring, 32); > >>> + amdgpu_ring_emit_hdp_flush(ring); > >>> + amdgpu_ring_emit_rreg(ring, reg); > >>> + amdgpu_ring_emit_hdp_invalidate(ring); > >>> + amdgpu_fence_emit(ring, &f); > >>> + amdgpu_ring_commit(ring); > >>> + mutex_unlock(&adev->virt.lock); > >>> + > >>> + r = fence_wait(f, false); > >>> + if (r) > >>> + DRM_ERROR("wait for kiq fence error: %ld.\n", r); > >>> + fence_put(f); > >>> + > >>> + val = adev->wb.wb[adev->virt.reg_val_offs]; > >>> + > >>> + return val; > >>> +} > >>> + > >>> +void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, > >>> +uint32_t v) { > >>> + signed long r; > >>> + struct fence *f; > >>> + struct amdgpu_kiq *kiq = &adev->gfx.kiq; > >>> + struct amdgpu_ring *ring = &kiq->ring; > >>> + > >>> + BUG_ON(!ring->funcs->emit_wreg); > >>> + > >>> + mutex_lock(&adev->virt.lock); > >>> + amdgpu_ring_alloc(ring, 32); > >>> + amdgpu_ring_emit_hdp_flush(ring); > >>> + amdgpu_ring_emit_wreg(ring, reg, v); > >>> + amdgpu_ring_emit_hdp_invalidate(ring); > >>> + amdgpu_fence_emit(ring, &f); > >>> + amdgpu_ring_commit(ring); > >>> + mutex_unlock(&adev->virt.lock); > >>> + > >>> + r = fence_wait(f, false); > >>> + if (r) > >>> + DRM_ERROR("wait for kiq fence error: %ld.\n", r); > >>> + fence_put(f); > >>> +} > >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h > >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h > >>> index 3d38a9f..2869980 100644 > >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h > >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h > >>> @@ -33,6 +33,7 @@ > >>> struct amdgpu_virt { > >>> uint32_t caps; > >>> uint32_t reg_val_offs; > >>> + struct mutex lock; > >>> }; > >>> > >>> #define amdgpu_sriov_enabled(adev) \ @@ -59,4 +60,8 @@ static > >>> inline bool is_virtual_machine(void) > >>> #endif > >>> } > >>> > >>> +void amdgpu_virt_init_setting(struct amdgpu_device *adev); uint32_t > >>> +amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg); > >>> +void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, > >>> +uint32_t v); > >>> + > >>> #endif > >>> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c > >>> b/drivers/gpu/drm/amd/amdgpu/vi.c index 7350a8f..dc0d4fa 100644 > >>> --- a/drivers/gpu/drm/amd/amdgpu/vi.c > >>> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c > >>> @@ -892,6 +892,9 @@ static int vi_common_early_init(void *handle) > >>> (amdgpu_ip_block_mask & (1 << AMD_IP_BLOCK_TYPE_SMC))) > >>> smc_enabled = true; > >>> > >>> + if (amdgpu_sriov_vf(adev)) > >>> + amdgpu_virt_init_setting(adev); > >>> + > >>> adev->rev_id = vi_get_rev_id(adev); > >>> adev->external_rev_id = 0xFF; > >>> switch (adev->asic_type) { > >> _______________________________________________ > >> amd-gfx mailing list > >> amd-gfx at lists.freedesktop.org > >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > > > _______________________________________________ > > amd-gfx mailing list > > amd-gfx at lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > > _______________________________________________ > amd-gfx mailing list > amd-gfx at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > > _______________________________________________ > amd-gfx mailing list > amd-gfx at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx -------------- next part -------------- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20170112/bfa855b4/attachment-0001.html>