25.10.2019 14:48, Thierry Reding пишет: > On Thu, Oct 24, 2019 at 09:46:58PM +0300, Dmitry Osipenko wrote: >> 24.10.2019 20:28, Thierry Reding пишет: >>> On Thu, Oct 24, 2019 at 07:31:19PM +0300, Dmitry Osipenko wrote: >>>> 24.10.2019 19:21, Dmitry Osipenko пишет: >>>>> 24.10.2019 19:09, Dmitry Osipenko пишет: >>>>>> 24.10.2019 18:57, Dmitry Osipenko пишет: >>>>>>> 24.10.2019 18:56, Thierry Reding пишет: >>>>>>>> On Thu, Oct 24, 2019 at 06:47:23PM +0300, Dmitry Osipenko wrote: >>>>>>>>> 24.10.2019 16:50, Thierry Reding пишет: >>>>>>>>>> On Thu, Oct 24, 2019 at 04:28:41PM +0300, Dmitry Osipenko wrote: >>>>>>>>>>> 24.10.2019 14:58, Thierry Reding пишет: >>>>>>>>>>>> On Sun, Jun 23, 2019 at 08:37:42PM +0300, Dmitry Osipenko wrote: >>>>>>>>>>>>> This should should fire up on the DRM's driver module re-loader because >>>>>>>>>>>>> there won't be enough available domains on older Tegra SoCs. >>>>>>>>>>>>> >>>>>>>>>>>>> Cc: stable <stable@xxxxxxxxxxxxxxx> >>>>>>>>>>>>> Fixes: 0c407de5ed1a ("drm/tegra: Refactor IOMMU attach/detach") >>>>>>>>>>>>> Signed-off-by: Dmitry Osipenko <digetx@xxxxxxxxx> >>>>>>>>>>>>> --- >>>>>>>>>>>>> drivers/gpu/drm/tegra/dc.c | 4 ++-- >>>>>>>>>>>>> drivers/gpu/drm/tegra/drm.c | 9 ++++++--- >>>>>>>>>>>>> drivers/gpu/drm/tegra/drm.h | 3 ++- >>>>>>>>>>>>> drivers/gpu/drm/tegra/gr2d.c | 4 ++-- >>>>>>>>>>>>> drivers/gpu/drm/tegra/gr3d.c | 4 ++-- >>>>>>>>>>>>> 5 files changed, 14 insertions(+), 10 deletions(-) >>>>>>>>>>>> >>>>>>>>>>>> I think I understand what this is trying to do, but the commit message >>>>>>>>>>>> does not help at all. So what's really going on here is that we need to >>>>>>>>>>>> detach the device from the group regardless of whether we're sharing the >>>>>>>>>>>> group or not, just like we attach groups to the shared domain whether >>>>>>>>>>>> they share the same group or not. >>>>>>>>>>> >>>>>>>>>>> Yes, the commit's message could be improved. >>>>>>>>>>> >>>>>>>>>>>> But in that case, I wonder if it's even worth splitting groups the way >>>>>>>>>>>> we are right now. Wouldn't it be better to just put all the devices into >>>>>>>>>>>> the same group and be done with it? >>>>>>>>>>>> >>>>>>>>>>>> The current code gives me headaches every time I read it, so if we can >>>>>>>>>>>> just make it so that all the devices under the DRM device share the same >>>>>>>>>>>> group, this would become a lot easier to deal with. I'm not really >>>>>>>>>>>> convinced that it makes much sense to keep them on separate domains, >>>>>>>>>>>> especially given the constraints on the number of domains available on >>>>>>>>>>>> earlier Tegra devices. >>>>>>>>>>>> >>>>>>>>>>>> Note that sharing a group will also make it much easier for these to use >>>>>>>>>>>> the DMA API if it is backed by an IOMMU. >>>>>>>>>>> >>>>>>>>>>> Probably I'm blanking on everything about IOMMU now.. could you please >>>>>>>>>>> remind me what "IOMMU group" is? >>>>>>>>>>> >>>>>>>>>>> Isn't it that each IOMMU group relates to the HW ID (SWGROUP)? But then >>>>>>>>>>> each display controller has its own SWGROUP.. and thus that sharing just >>>>>>>>>>> doesn't make any sense, hm. >>>>>>>>>> >>>>>>>>>> IOMMU groups are not directly related to SWGROUPs. But by default the >>>>>>>>>> IOMMU framework will share a domain between members of the same IOMMU >>>>>>>>>> group. >>>>>>>>> >>>>>>>>> Ah, I re-figured out that again. The memory controller drivers are >>>>>>>>> defining a single "IOMMU group" for both of the display controllers. >>>>>>>>> >>>>>>>>>> Seems like that's really what we want here, so that when we do >>>>>>>>>> use the DMA API, all the devices part of the DRM device get attached to >>>>>>>>>> the same IOMMU domain, yet if we don't want to use the DMA API we only >>>>>>>>>> need to detach the one group from the backing. >>>>>>>>> >>>>>>>>> Yes, it should be okay to put all DRM devices into the same group, like >>>>>>>>> it is done now for the displays. It also should resolve problem with the >>>>>>>>> domains shortage on T30 since now there are maximum 3 domains in use: >>>>>>>>> host1x, drm and vde. >>>>>>>>> >>>>>>>>> I actually just checked that the original problem still exists >>>>>>>>> and this change solves it as well: >>>>>>>>> >>>>>>>>> --- >>>>>>>>> diff --git a/drivers/memory/tegra/tegra30.c b/drivers/memory/tegra/tegra30.c >>>>>>>>> index 5a0f6e0a1643..e71096498436 100644 >>>>>>>>> --- a/drivers/memory/tegra/tegra30.c >>>>>>>>> +++ b/drivers/memory/tegra/tegra30.c >>>>>>>>> @@ -1021,6 +1021,9 @@ static const struct tegra_smmu_swgroup >>>>>>>>> tegra30_swgroups[] = { >>>>>>>>> static const unsigned int tegra30_group_display[] = { >>>>>>>>> TEGRA_SWGROUP_DC, >>>>>>>>> TEGRA_SWGROUP_DCB, >>>>>>>>> + TEGRA_SWGROUP_G2, >>>>>>>>> + TEGRA_SWGROUP_NV, >>>>>>>>> + TEGRA_SWGROUP_NV2, >>>>>>>>> }; >>>>>>>>> >>>>>>>>> static const struct tegra_smmu_group_soc tegra30_groups[] = { >>>>>>>>> --- >>>>>>>>> >>>>>>>>> Please let me know whether you're going to make a patch or if I should >>>>>>>>> do it. >>>>>>>> >>>>>>>> I've been testing with a similar change and couldn't find any >>>>>>>> regressions. I've also made the same modifications for Tegra114 and >>>>>>>> Tegra124. >>>>>>>> >>>>>>>> Are you saying that none of these patches are needed anymore? Or do we >>>>>>>> still need a patch to fix detaching? I'm thinking that maybe we can >>>>>>>> drastrically simplify the detachment now by dropping the shared >>>>>>>> parameter altogether. >>>>>>>> >>>>>>>> Let me draft a patch and send out the whole set for testing. >>>>>>> >>>>>>> Seems it's still not ideal because I noticed this in KMSG: >>>>>>> >>>>>>> [ 0.703185] Failed to attached device 54200000.dc to IOMMU_mapping >>>>>>> [ 0.710404] Failed to attached device 54240000.dc to IOMMU_mapping >>>>>>> [ 0.719347] Failed to attached device 54140000.gr2d to IOMMU_mapping >>>>>>> [ 0.719569] Failed to attached device 54180000.gr3d to IOMMU_mapping >>>>>>> >>>>>>> which comes from the implicit IOMMU backing. >>>>>> >>>>>> And the error comes from here: >>>>>> >>>>>> https://elixir.bootlin.com/linux/v5.4-rc2/source/drivers/iommu/iommu.c#L1655 >>>>> >>>>> So the detaching still should be needed, but at the moment the ARM32 >>>>> DMA-mapping code is simply not suitable for the case of having multiple >>>>> devices in the same group. I'm wondering whether there are any real >>>>> users for the implicit IOMMU backing on ARM32 at all :/ >>>>> >>>> >>>> Apparently the "Failed to attached device 54200000.dc" was always in the >>>> log (I rarely testing the default multi-platform config), it's just the >>>> message is a pr_warn that I wasn't paying attention because it is >>>> colored like pr_info in dmesg :) >>> >>> Yeah, so the above isn't a complete solution. In order to actually use >>> the DMA API backed by an IOMMU, some additional patches are needed. I >>> have all of those in a local tree and I've already sent out a couple of >>> them. It's taking a while because they all need to be applied in small >>> iterations to make sure things don't break midway. >> >> I'd like to have an immediate interim solution. > > To clarify: when I said "isn't a complete solution", what I meant is > that it's not a complete solution to make the implicit IOMMU backing > work with the DMA API. That's what I've got a patch set ready for. Okay. > But you said earlier that this change (i.e. putting all DRM devices into > the same IOMMU group) fixes the issue that you were seeing, right? So > that would be an immediate, interim solution, wouldn't it? Yes, it would.