On Fri, 2022-01-14 at 15:30 -0600, Stephen Boyd wrote: > Quoting Yong Wu (2022-01-14 01:06:31) > > On Wed, 2022-01-12 at 20:25 -0800, Stephen Boyd wrote: > > > > > > > > [ 2.654526] ------------[ cut here ]------------ > > > > [ 2.655558] refcount_t: addition on 0; use-after-free. > > > > > > > > After this patch, the aggregate_driver flow looks ok. But our > > > > driver > > > > still aborts like this: > > > > > > > > [ 2.721316] Unable to handle kernel NULL pointer dereference > > > > at > > > > virtual address 0000000000000000 > > > > ... > > > > [ 2.731658] pc : > > > > mtk_smi_larb_config_port_gen2_general+0xa4/0x138 > > > > [ 2.732434] lr : mtk_smi_larb_resume+0x54/0x98 > > > > ... > > > > [ 2.742457] Call trace: > > > > [ 2.742768] mtk_smi_larb_config_port_gen2_general+0xa4/0x13 > > > > 8 > > > > [ 2.743496] pm_generic_runtime_resume+0x2c/0x48 > > > > [ 2.744090] __genpd_runtime_resume+0x30/0xa8 > > > > [ 2.744648] genpd_runtime_resume+0x94/0x2c8 > > > > [ 2.745191] __rpm_callback+0x44/0x150 > > > > [ 2.745669] rpm_callback+0x6c/0x78 > > > > [ 2.746114] rpm_resume+0x314/0x558 > > > > [ 2.746559] __pm_runtime_resume+0x3c/0x88 > > > > [ 2.747080] pm_runtime_get_suppliers+0x7c/0x110 > > > > [ 2.747668] __driver_probe_device+0x4c/0xe8 > > > > [ 2.748212] driver_probe_device+0x44/0x130 > > > > [ 2.748745] __device_attach_driver+0x98/0xd0 > > > > [ 2.749300] bus_for_each_drv+0x68/0xd0 > > > > [ 2.749787] __device_attach+0xec/0x148 > > > > [ 2.750277] device_attach+0x14/0x20 > > > > [ 2.750733] bus_rescan_devices_helper+0x50/0x90 > > > > [ 2.751319] bus_for_each_dev+0x7c/0xd8 > > > > [ 2.751806] bus_rescan_devices+0x20/0x30 > > > > [ 2.752315] __component_add+0x7c/0xa0 > > > > [ 2.752795] component_add+0x14/0x20 > > > > [ 2.753253] mtk_smi_larb_probe+0xe0/0x120 > > > > > > > > This is because the device runtime_resume is called before the > > > > bind > > > > operation(In our case this detailed function is > > > > mtk_smi_larb_bind). > > > > The issue doesn't happen without this patchset. I'm not sure > > > > the > > > > right > > > > sequence. If we should fix in mediatek driver, the patch could > > > > be: > > > > > > Oh, the runtime PM is moved around with these patches. The > > > aggregate > > > device is runtime PM enabled before the probe is called, > > > > In our case, the component device may probe before the aggregate > > device. thus the component device runtime PM has already been > > enabled > > when aggregate device probe. > > This is always the case. The component device always probes before > the > aggregate device because the component device registers with the > component framework. But the component device can decide to enable > runtime PM during driver probe or during component bind. > > > > > > and there are > > > supplier links made to each component, so each component is > > > runtime > > > resumed before the aggregate probe function is called. > > > > Yes. This is the current flow. > > Got it. > > > > > > It means that all > > > the component drivers need to have their resources ready to power > > > on > > > before their component_bind() callback is made. > > > > Sorry, I don't understand here well. In this case, The component > > drivers prepare the resource for power on in the component_bind > > since > > the resource comes from the aggregate driver. Thus, we expect the > > component_bind run before the runtime resume callback. > > What resource comes from the aggregate device that the component > device manages in runtime PM? Here the aggregate device is the iommu device. It knows who enabled the iommu from the dtsi, then transfers this information to the component(smi_larb) devices. smi_larb will configure the registers to enable iommu for these masters in the runtime resume. > > > > Another solution is moving the component's pm_runtime_enable into > > the > > component_bind(It's mtk_smi_larb_bind here), then the runtime > > callback > > is called after component_bind in which the resource for power on > > is > > ready. > > This sounds more correct to me. I'm not an expert on runtime PM > though > as I always have to read the code to remember how it works. if the > device isn't ready for runtime PM until the component bind function > is > called then runtime PM shouldn't be enabled on the component device. Anyway, We should update the SMI driver for this case. I prepare a patch into this patchset or I send it independently? which way is better? The patch could be: diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c index b883dcc0bbfa..88854c2270a1 100644 --- a/drivers/memory/mtk-smi.c +++ b/drivers/memory/mtk-smi.c @@ -175,6 +175,8 @@ mtk_smi_larb_bind(struct device *dev, struct device *master, void *data) larb->larbid = i; larb->mmu = &larb_mmu[i].mmu; larb->bank = larb_mmu[i].bank; + + pm_runtime_enable(dev); return 0; } } @@ -450,7 +452,6 @@ static int mtk_smi_larb_probe(struct platform_device *pdev) if (ret < 0) return ret; - pm_runtime_enable(dev); platform_set_drvdata(pdev, larb); ret = component_add(dev, &mtk_smi_larb_component_ops); if (ret) Thanks.