On Mon, 15 May 2023 16:49:48 +0200,
Amadeusz Sławiński wrote:
>
> On 5/15/2023 3:02 PM, Takashi Iwai wrote:
> > On Mon, 15 May 2023 13:19:29 +0200,
> > Amadeusz Sławiński wrote:
> >>
> >> On 5/12/2023 2:24 PM, Takashi Iwai wrote:
> >>> On Fri, 12 May 2023 14:00:54 +0200,
> >>> Amadeusz Sławiński wrote:
> >>>>
> >>>> On 5/12/2023 1:33 PM, Takashi Iwai wrote:
> >>>>> On Fri, 12 May 2023 13:23:49 +0200,
> >>>>> Takashi Iwai wrote:
> >>>>>>
> >>>>>> On Thu, 11 May 2023 19:20:17 +0200,
> >>>>>> Amadeusz Sławiński wrote:
> >>>>>>>
> >>>>>>> On 5/11/2023 5:58 PM, Takashi Iwai wrote:
> >>>>>>>> On Thu, 11 May 2023 17:31:37 +0200,
> >>>>>>>> Amadeusz Sławiński wrote:
> >>>>>>>>>
> >>>>>>>>> On 5/10/2023 2:21 PM, Takashi Iwai wrote:
> >>>>>>>>>> On Tue, 09 May 2023 12:10:06 +0200,
> >>>>>>>>>> Amadeusz Sławiński wrote:
> >>>>>>>>> Then capture stream starts and seems to assume that
> >>>>>>>>> registers were already set, so it doesn't write them to hw.
> >>>>>>>>
> >>>>>>>> ... it seems this didn't happen, and that's the inconsistency.
> >>>>>>>>
> >>>>>>>> So the further question is:
> >>>>>>>> At the point just before you start recording, is the codec in runtime
> >>>>>>>> suspended? Or it's running?
> >>>>>>>>
> >>>>>>>> If it's runtime-suspended, snd_hda_regmap_sync() must be called from
> >>>>>>>> alc269_resume() via runtime-resume, and this must write out the
> >>>>>>>> cached values. Then the bug can be along with that line.
> >>>>>>>>
> >>>>>>>> Or if it's running, it means that the previous check of
> >>>>>>>> snd_hdac_keep_power_up() was bogus (or racy).
> >>>>>>>>
> >>>>>>>
> >>>>>>> Well, it is in... let's call it semi powered state. When snd_hda_intel
> >>>>>>> driver is loaded with power_save=X option it sets timeout to X seconds
> >>>>>>> and problem only happens when I start the stream before those X
> >>>>>>> seconds pass and it runs first runtime suspend. After it suspends it
> >>>>>>> then uses standard pm_runtime_resume and works correctly. That's why
> >>>>>>> the pm_runtime_force_suspend(&codec->core.dev); mentioned in first
> >>>>>>> email in thread "fixes" the problem, as it forces it to be instantly
> >>>>>>> suspended instead of waiting for timeout and then later normal
> >>>>>>> resume-play/record-suspend flow can be followed.
> >>>>>>
> >>>>>> Hm, then maybe it's a bad idea to rely on the usage count there.
> >>>>>> Even if the usage is 0, the device can be still active, and the update
> >>>>>> can be missed.
> >>>>>>
> >>>>>> How about the patch like below?
> >>>>>
> >>>>> Scratch that, it returns a wrong value.
> >>>>> A simpler version like below works instead?
> >>>>>
> >>>>
> >>>> Yes it was broken, arecord didn't even start capturing ;)
> >>>>
> >>>>>
> >>>>> Takashi
> >>>>>
> >>>>> --- a/sound/hda/hdac_device.c
> >>>>> +++ b/sound/hda/hdac_device.c
> >>>>> @@ -611,10 +611,9 @@ EXPORT_SYMBOL_GPL(snd_hdac_power_up_pm);
> >>>>>  int snd_hdac_keep_power_up(struct hdac_device *codec)
> >>>>>  {
> >>>>>  	if (!atomic_inc_not_zero(&codec->in_pm)) {
> >>>>> -		int ret = pm_runtime_get_if_in_use(&codec->dev);
> >>>>> -		if (!ret)
> >>>>> +		if (!pm_runtime_active(&codec->dev))
> >>>>>  			return -1;
> >>>>> -		if (ret < 0)
> >>>>> +		if (pm_runtime_get_sync(&codec->dev) < 0)
> >>>>>  			return 0;
> >>>>>  	}
> >>>>>  	return 1;
> >>>>
> >>>>
> >>>> This one seems to work, as in I'm able to record before first suspend
> >>>> hits. However device stays in D0 when no stream is running...
> >>>> # cat /sys/devices/pci0000\:00/0000\:00\:0e.0/power_state
> >>>> D0
> >>>
> >>> OK, one step forward. The previous change was bad in anyway, as we
> >>> shouldn't sync there at all.
> >>>
> >>> So, the problem becomes clearer now: it's in the lazy update mechanism
> >>> that misses the case that has to be written.
> >>>
> >>> Scratch the previous one again, and could you try the following one
> >>> instead?
> >>>
> >>>
> >>> Takashi
> >>>
> >>> --- a/sound/hda/hdac_regmap.c
> >>> +++ b/sound/hda/hdac_regmap.c
> >>> @@ -293,8 +293,17 @@ static int hda_reg_write(void *context, unsigned int reg, unsigned int val)
> >>>  	if (verb != AC_VERB_SET_POWER_STATE) {
> >>>  		pm_lock = codec_pm_lock(codec);
> >>> -		if (pm_lock < 0)
> >>> -			return codec->lazy_cache ? 0 : -EAGAIN;
> >>> +		if (pm_lock < 0) {
> >>> +			/* skip the actual write if it's in lazy-update mode
> >>> +			 * and only if the device is actually suspended;
> >>> +			 * the usage count can be zero at transition phase
> >>> +			 * (either suspending/resuming or auto-suspend sleep)
> >>> +			 */
> >>> +			if (codec->lazy_cache &&
> >>> +			    pm_runtime_suspended(&codec->dev))
> >>> +				return 0;
> >>> +			return -EAGAIN;
> >>> +		}
> >>>  	}
> >>>  	if (is_stereo_amp_verb(reg)) {
> >>>
> >>
> >> With this one we are back to same behavior as without it. When capture
> >> is started before first suspend it records silence. After waiting for
> >> timeout and suspend it records correctly.
> >
> > Hm, interesting. Does it mean that the pm_runtime_get_if_in_use() (in
> > snd_hdac_keep_power_up()) returns a non-zero value?
> > Or is pm_runtime_suspended() returns really true there?
> >
> >
>
> So I've tested with vanilla kernel, where pm_runtime_get_if_in_use
> returns -22 until loaded and then 13 times "0" until arecord.
>
> With above patch it returns 13 times "0" and then one more time "1".
>
> pm_runtime_suspended() returns 0 (needed to modify patch a bit)
>
> Patch:
>
> diff --git a/sound/hda/hdac_device.c b/sound/hda/hdac_device.c
> index 035b720bf602..62880952e398 100644
> --- a/sound/hda/hdac_device.c
> +++ b/sound/hda/hdac_device.c
> @@ -612,6 +612,7 @@ int snd_hdac_keep_power_up(struct hdac_device *codec)
>  {
>  	if (!atomic_inc_not_zero(&codec->in_pm)) {
>  		int ret = pm_runtime_get_if_in_use(&codec->dev);
> +		pr_err("DEBUG:%s:%d %s ret=%d\n", __FILE__, __LINE__, __func__, ret);
>  		if (!ret)
>  			return -1;
>  		if (ret < 0)
> diff --git a/sound/hda/hdac_regmap.c b/sound/hda/hdac_regmap.c
> index fe3587547cfe..d6cf3fa2d4e7 100644
> --- a/sound/hda/hdac_regmap.c
> +++ b/sound/hda/hdac_regmap.c
> @@ -293,8 +293,19 @@ static int hda_reg_write(void *context, unsigned int reg, unsigned int val)
>
>  	if (verb != AC_VERB_SET_POWER_STATE) {
>  		pm_lock = codec_pm_lock(codec);
> -		if (pm_lock < 0)
> -			return codec->lazy_cache ? 0 : -EAGAIN;
> +		if (pm_lock < 0) {
> +			bool x;
> +			/* skip the actual write if it's in lazy-update mode
> +			 * and only if the device is actually suspended;
> +			 * the usage count can be zero at transition phase
> +			 * (either suspending/resuming or auto-suspend sleep)
> +			 */
> +			x = pm_runtime_suspended(&codec->dev);
> +			pr_err("DEBUG: %s:%d x = %d\n", __FILE__, __LINE__, x);
> +			if (codec->lazy_cache && x)
> +				return 0;
> +			return -EAGAIN;
> +		}
>  	}
>
>  	if (is_stereo_amp_verb(reg)) {
>
>
> Part of vanilla dmesg (contains only first chunk):
(snip)
> Part of fully patched dmesg:
(snip)
> [ 79.556191] DEBUG:sound/hda/hdac_device.c:615 snd_hdac_keep_power_up ret=0
> [ 79.556234] DEBUG:sound/hda/hdac_device.c:615 snd_hdac_keep_power_up ret=0

If ret==0 here, snd_hdac_keep_power_up() should return -1, and that's
the return value of codec_pm_lock(). So it must print out the value of
"x" (the pm_runtime_suspended() result), but I don't see it.
What's the missing piece...?

> I think there are two problems:
>
> 1. After probe codec is powered down
> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/sound/pci/hda/hda_codec.c#n833),
> even though according to power management it is still running

I guess it's in the auto-suspend state: the device hasn't been
suspended yet and is still active, while the usage count is 0.
That's fine, and I thought my second patch handled it. That is, if
the usage count is 0 and the device is not suspended, it should return
-EAGAIN and make the caller retry with the full power up. The code
path is the CALL_RAW_FUNC() macro in hdac_regmap.c: with an -EAGAIN
return value, it calls snd_hdac_power_up_pm() and then calls the
function again.

> 2. When stream is started before first suspend, resume function
> doesn't run and it is a function which syncs cached registers. By
> resume function I mean
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/sound/pci/hda/hda_codec.c#n2899
> which calls snd_hda_regmap_sync() or through in case of the platform I
> test it on codec->patch_ops.resume(codec) -> alc269_resume, which also
> calls snd_hda_regmap_sync().

That's also expected, per se. Since it hasn't been suspended, it
assumes that the values were already written and no resume is needed.


Takashi
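
For illustration, the flow the second patch is expected to produce can be
modelled in plain userspace C roughly as below. This is only a sketch, not
kernel code: struct fake_codec, the FAKE_RPM_* states and the model_*()
helpers are all made up here, and the "power up" step is reduced to bumping
a counter. It shows only the intended decision: skip the write when the
device is really suspended in lazy-update mode, otherwise return -EAGAIN so
that the caller powers up and writes again.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the runtime PM status; not kernel types. */
enum fake_rpm_state { FAKE_RPM_ACTIVE, FAKE_RPM_SUSPENDED };

struct fake_codec {
	enum fake_rpm_state rpm;	/* modelled runtime PM status */
	int usage_count;		/* modelled runtime PM usage counter */
	bool lazy_cache;		/* lazy-update mode enabled */
	unsigned int cache;		/* cached register value */
	unsigned int hw;		/* value that reached the "hardware" */
	bool hw_written;		/* whether any write reached the "hardware" */
};

/* Models codec_pm_lock(): -1 when the usage count is 0, otherwise take a
 * reference and return 1. */
static int model_pm_lock(struct fake_codec *c)
{
	if (c->usage_count == 0)
		return -1;
	c->usage_count++;
	return 1;
}

static void model_pm_unlock(struct fake_codec *c, int lock)
{
	if (lock == 1) {
		assert(c->usage_count > 0);
		c->usage_count--;
	}
}

/* Models the patched check in hda_reg_write(). */
static int model_reg_write(struct fake_codec *c, unsigned int val)
{
	int lock = model_pm_lock(c);

	if (lock < 0) {
		/* really suspended + lazy mode: leave it to the resume sync */
		if (c->lazy_cache && c->rpm == FAKE_RPM_SUSPENDED)
			return 0;
		/* usage count 0 but not suspended: ask the caller to retry */
		return -EAGAIN;
	}
	c->hw = val;
	c->hw_written = true;
	model_pm_unlock(c, lock);
	return 0;
}

/* Models the retry on -EAGAIN: power up, write again, power down.
 * Updating the cache here is an assumption of this model. */
int model_write_with_retry(struct fake_codec *c, unsigned int val)
{
	int err;

	c->cache = val;
	err = model_reg_write(c, val);
	if (err == -EAGAIN) {
		c->usage_count++;		/* "power up" */
		c->rpm = FAKE_RPM_ACTIVE;
		err = model_reg_write(c, val);
		c->usage_count--;		/* "power down" */
	}
	return err;
}
```

In this model a codec that is active with a usage count of 0 (the
auto-suspend window) gets the value written on the retry, while a suspended
codec in lazy mode only updates the cache. It says nothing about why the
real test still records silence.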