Re: [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 13 Feb 2024 at 12:20, Robin Murphy <robin.murphy@xxxxxxx> wrote:
>
> On 2024-02-13 7:51 am, Dmitry Baryshkov wrote:
> > On Sat, 10 Feb 2024 at 00:23, Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >>
> >> On Fri, Feb 09, 2024 at 10:05:38PM +0200, Dmitry Baryshkov wrote:
> >>> On Tue, 17 Oct 2023 Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >>>> Now that the BLOCKED and IDENTITY behaviors are managed with their own
> >>>> domains change to the domain_alloc_paging() op.
> >>>>
> >>>> The check for using_legacy_binding is now redundant,
> >>>> arm_smmu_def_domain_type() always returns IOMMU_DOMAIN_IDENTITY for this
> >>>> mode, so the core code will never attempt to create a DMA domain in the
> >>>> first place.
> >>>>
> >>>> Since commit a4fdd9762272 ("iommu: Use flush queue capability") the core
> >>>> code only passes in IDENTITY/BLOCKED/UNMANAGED/DMA domain types. It will
> >>>> not pass in IDENTITY or BLOCKED if the global statics exist, so the test
> >>>> for DMA is also redundant now too.
> >>>>
> >>>> Call arm_smmu_init_domain_context() early if a dev is available.
> >>>>
> >>>> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> >>>> ---
> >>>>   drivers/iommu/arm/arm-smmu/arm-smmu.c | 21 +++++++++++++++------
> >>>>   1 file changed, 15 insertions(+), 6 deletions(-)
> >>>
> >>> For some reason this patch breaks booting of the APQ8096 Dragonboard820c
> >>> (qcom/apq8096-db820c.dts). Dispbling display subsystem (mdss) and venus
> >>> devices makes the board boot in most of the cases. Most frequently the
> >>> last parts of the log loog in a following way:
> >>
> >> It is surprising we tested this patch on some tegra systems with this
> >> iommu and didn't hit anything..
> >>
> >> The only real functional thing this changes is to move the domain
> >> initialization up in time, potentially a lot in time in some
> >> cases. That function does alot of things including touching HW so
> >> possibly there is some surprising interaction with something else.
> >
> > I should not be debugging strange platforms at 1 a.m. I forgot that
> > there was another patch to revert. So after reverting the MPM patch,
> > I'm getting the following results:
> >
> >>
> >> So, I would expect this to not WARN_ON and to make it work the same as
> >> before the patch:
> >
> > No warnings, the platform now boots up to the point of actually
> > bringing up the venus device:
> >
> >
> > [   11.906514] ath10k_pci 0000:01:00.0: qca6174 hw3.2 target
> > 0x05030000 chip_id 0x00340aff sub 0000:0000
> > [   11.907119] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 0
> > tracing 0 dfs 0 testmode 0
> > [   11.915881] ath10k_pci 0000:01:00.0: firmware ver
> > WLAN.RM.4.4.1-00288- api 6 features wowlan,ignore-otp,mfp crc32
> > bf907c7c
> > [   11.979972] Console: switching to colour frame buffer device 320x90
> > [   11.990756] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:1
> > crc32 d2863f91
> > [   12.060834] msm_mdp 901000.display-controller: [drm] fb0: msmdrmfb
> > frame buffer device
> > [   12.096203] qcom-pcie 608000.pcie: Phy link never came up
> > [   12.103785] qcom-pcie 608000.pcie: PCI host bridge to bus 0001:00
> > [   12.103970] qcom-venus c00000.video-codec: Adding to iommu group 3
> >
> > Format: Log Type - Time(microsec) - Message - Optional Info
> > Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
> > S - QC_IMAGE_VERSION_STRING=BOOT.XF.1.0-00301
> > S - IMAGE_VARIANT_STRING=M8996LAB
> > S - OEM_IMAGE_VERSION_STRING=crm-ubuntu68
> > S - Boot Interface: UFS
> >
> >>
> >> Then I'd ask you to remove the comment and do:
> >>
> >> @@ -878,7 +878,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
> >>          if (dev) {
> >>                  struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
> >>
> >> +               WARN_ON(true);
> >>                  if (arm_smmu_init_domain_context(smmu_domain, cfg->smmu, dev)) {
> >> +                       printk("Allocation failure in arm_smmu_domain_alloc_paging()\n");
> >>                          kfree(smmu_domain);
> >>                          return NULL;
> >>                  }
> >>
> >>
> >> And then we may get a clue from the backtraces it generates. I only
> >> saw one iommu group reported in your log so I'd expect one trace?
> >
> > I added dev_info + mdelays() around the arm_smmu_init_domain_context()
> > and I can see that it crashes within that function.
>
> Yeah, this is totally broken. We can't just call the unmodified
> arm_smmu_init_domain_context() at domain allocation because half of what
> it's doing belongs to the attach operation. We should not be allocating
> context banks, IRQs, etc. for a not-yet-attached domain, and we
> certainly shouldn't be touching hardware there outside of RPM.

Should I send a revert?

>
> Thanks,
> Robin.



-- 
With best wishes
Dmitry




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [Linux for Sparc]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux