Hi again, On 2025-02-06 at 06:57:46 +0100, Frank Oltmanns <frank@xxxxxxxxxxxx> wrote: > On 2025-02-05 at 20:54:53 -0600, Bjorn Andersson <andersson@xxxxxxxxxx> wrote: >> On Wed, Feb 05, 2025 at 10:57:11PM +0100, Frank Oltmanns wrote: >>> On xiaomi-beryllium and oneplus-enchilada audio does not work reliably >>> with the in-kernel pd-mapper. Deferring the probe solves these issues. >>> Specifically, audio only works reliably with the in-kernel pd-mapper, if >>> the probe succeeds when remoteproc3 triggers the first successful probe. >>> I.e., probes from remoteproc0, 1, and 2 need to be deferred until >>> remoteproc3 has been probed. >>> >>> Introduce a device specific quirk that lists the first auxdev for which >>> the probe must be executed. Until then, defer probes from other auxdevs. >>> >>> Fixes: 1ebcde047c54 ("soc: qcom: add pd-mapper implementation") >>> Cc: stable@xxxxxxxxxxxxxxx >>> Signed-off-by: Frank Oltmanns <frank@xxxxxxxxxxxx> >>> --- >>> The in-kernel pd-mapper has been causing audio issues on sdm845 >>> devices (specifically, xiaomi-beryllium and oneplus-enchilada). I >>> observed that Stephan’s approach [1] - which defers module probing by >>> blocklisting the module and triggering a later probe - works reliably. >>> >>> Inspired by this, I experimented with delaying the probe within the >>> module itself by returning -EPROBE_DEFER in qcom_pdm_probe() until a >>> certain time (13.9 seconds after boot, based on ktime_get()) had >>> elapsed. This method also restored audio functionality. >>> >>> Further logging of auxdev->id in qcom_pdm_probe() led to an interesting >>> discovery: audio only works reliably with the in-kernel pd-mapper when >>> the first successful probe is triggered by remoteproc3. In other words, >>> probes from remoteproc0, 1, and 2 must be deferred until remoteproc3 has >>> been probed. >>> >> >> The remoteproc numbering is assigned at the time of registering each >> remoteproc driver, and does not necessarily relate to the order in which >> they are launched. That said, it sounds like what you're saying is that >> is that audio only works if we launch the pd-mapper after the >> remoteprocs has started? > > Almost, but not quite. You are right, that remoteproc3 in my setup is > always the last one that probes the pd-mapper. > > However, when experimenting with different timings I saw that the > pd-mapper really do has to respond to the probe from remoteproc3 (I'm > not sure I'm using the right terminology here, but I hope my intent > comes across). If the pd-mapper responds to remoteproc3's probe with > -EPROBE_DEFER there will again be subsequent probes from the other > remoteprocs. If we act on those probes, there is a chance that audio > (mic in my case) does not work. So, my conclusion was that remoteproc3's > probe has to be answered first before responding to the other probes. > > Further note that in my experiments remoteproc1 was always the first to > do the probing, followed by either remoteproc0 or remoteproc2. > remoteproc3 was always the last. > >> Can you please confirm which remoteproc is which in your numbering? (In >> particular, which remoteproc instance is the audio DSP?) > > remoteproc0: adsp > remoteproc1: cdsp > remoteproc2: slpi > remoteproc3: 4080000.remoteproc I'm sorry but there's one additional thing that I really should have mentioned earlier: My issue is specifically with *call* audio. Call audio is only available using out-of-tree patches. The ones I'm currently using are here: https://gitlab.com/sdm845-mainline/linux/-/commits/sdm845-6.13-rc2-r2?ref_type=tags Best regards, Frank > > (I took them from the kernel messages "remoteproc remoteproc<X>: <xyz> > is available".) > >>> To address this, I propose introducing a quirk table (which currently >>> only contains sdm845) to defer probing until the correct auxiliary >>> device (remoteproc3) initiates the probe. >>> >>> I look forward to your feedback. >>> >> >> I don't think the proposed workaround is our path forward, but I very >> much appreciate your initiative and the insights it provides! > > Thank you! I was hoping that somebody with more experience in the QCOM > universe can draw further conclusions from this. > >> Seems to >> me that we have a race-condition in the pdr helper. > > If you need further experimenting or can give me rough guidance on where > to look next, I'll be glad to help. > > Thanks again, > Frank > >> >> Regards, >> Bjorn >> >>> Thanks, >>> Frank >>> >>> [1]: https://lore.kernel.org/linux-arm-msm/Zwj3jDhc9fRoCCn6@xxxxxxxxxx/ >>> --- >>> drivers/soc/qcom/qcom_pd_mapper.c | 43 +++++++++++++++++++++++++++++++++++++++ >>> 1 file changed, 43 insertions(+) >>> >>> diff --git a/drivers/soc/qcom/qcom_pd_mapper.c b/drivers/soc/qcom/qcom_pd_mapper.c >>> index 154ca5beb47160cc404a46a27840818fe3187420..34b26df665a888ac4872f56e948e73b561ae3b6b 100644 >>> --- a/drivers/soc/qcom/qcom_pd_mapper.c >>> +++ b/drivers/soc/qcom/qcom_pd_mapper.c >>> @@ -46,6 +46,11 @@ struct qcom_pdm_data { >>> struct list_head services; >>> }; >>> >>> +struct qcom_pdm_probe_first_dev_quirk { >>> + const char *name; >>> + u32 id; >>> +}; >>> + >>> static DEFINE_MUTEX(qcom_pdm_mutex); /* protects __qcom_pdm_data */ >>> static struct qcom_pdm_data *__qcom_pdm_data; >>> >>> @@ -526,6 +531,11 @@ static const struct qcom_pdm_domain_data *x1e80100_domains[] = { >>> NULL, >>> }; >>> >>> +static const struct qcom_pdm_probe_first_dev_quirk first_dev_remoteproc3 = { >>> + .id = 3, >>> + .name = "pd-mapper" >>> +}; >>> + >>> static const struct of_device_id qcom_pdm_domains[] __maybe_unused = { >>> { .compatible = "qcom,apq8016", .data = NULL, }, >>> { .compatible = "qcom,apq8064", .data = NULL, }, >>> @@ -566,6 +576,10 @@ static const struct of_device_id qcom_pdm_domains[] __maybe_unused = { >>> {}, >>> }; >>> >>> +static const struct of_device_id qcom_pdm_defer[] __maybe_unused = { >>> + { .compatible = "qcom,sdm845", .data = &first_dev_remoteproc3, }, >>> + {}, >>> +}; >>> static void qcom_pdm_stop(struct qcom_pdm_data *data) >>> { >>> qcom_pdm_free_domains(data); >>> @@ -637,6 +651,25 @@ static struct qcom_pdm_data *qcom_pdm_start(void) >>> return ERR_PTR(ret); >>> } >>> >>> +static bool qcom_pdm_ready(struct auxiliary_device *auxdev) >>> +{ >>> + const struct of_device_id *match; >>> + struct device_node *root; >>> + struct qcom_pdm_probe_first_dev_quirk *first_dev; >>> + >>> + root = of_find_node_by_path("/"); >>> + if (!root) >>> + return true; >>> + >>> + match = of_match_node(qcom_pdm_defer, root); >>> + of_node_put(root); >>> + if (!match) >>> + return true; >>> + >>> + first_dev = (struct qcom_pdm_probe_first_dev_quirk *) match->data; >>> + return (auxdev->id == first_dev->id) && !strcmp(auxdev->name, first_dev->name); >>> +} >>> + >>> static int qcom_pdm_probe(struct auxiliary_device *auxdev, >>> const struct auxiliary_device_id *id) >>> >>> @@ -647,6 +680,15 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev, >>> mutex_lock(&qcom_pdm_mutex); >>> >>> if (!__qcom_pdm_data) { >>> + if (!qcom_pdm_ready(auxdev)) { >>> + pr_debug("%s: Deferring probe for device %s (id: %u)\n", >>> + __func__, auxdev->name, auxdev->id); >>> + ret = -EPROBE_DEFER; >>> + goto probe_stop; >>> + } >>> + pr_debug("%s: Probing for device %s (id: %u), starting pdm\n", >>> + __func__, auxdev->name, auxdev->id); >>> + >>> data = qcom_pdm_start(); >>> >>> if (IS_ERR(data)) >>> @@ -659,6 +701,7 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev, >>> >>> auxiliary_set_drvdata(auxdev, __qcom_pdm_data); >>> >>> +probe_stop: >>> mutex_unlock(&qcom_pdm_mutex); >>> >>> return ret; >>> >>> --- >>> base-commit: 7f048b202333b967782a98aa21bb3354dc379bbf >>> change-id: 20250205-qcom_pdm_defer-3dc1271d74d9 >>> >>> Best regards, >>> -- >>> Frank Oltmanns <frank@xxxxxxxxxxxx> >>>