Re: pcm_meter.c issue at s16_update

Pavel Hofman <pavel.hofman@xxxxxxxxxxx> · Sun, 9 Aug 2020 23:05:21 +0200

On 09. 08. 20 at 22:29, Jaroslav Kysela wrote:
On 09. 08. 20 at 9:05, Pavel Hofman wrote:
On 03. 08. 20 at 12:48, Pavel Hofman wrote:


On 03. 08. 20 at 9:22, Jaroslav Kysela wrote:
On 03. 08. 20 at 8:17, Takashi Iwai wrote:
On Sun, 02 Aug 2020 19:50:44 +0200,

Optionally the second case could be handled just like the first case by
resetting s16->old, assuming the boundary wrap occurs very infrequently.

The following patch is tested to work OK, no CPU peaks and no meter
output glitches when the size < 0 condition occurs:

diff --git a/src/pcm/pcm_meter.c b/src/pcm/pcm_meter.c
index 20b41876..48df5945 100644
--- a/src/pcm/pcm_meter.c
+++ b/src/pcm/pcm_meter.c
@@ -1098,8 +1098,15 @@ static void s16_update(snd_pcm_scope_t *scope)
          snd_pcm_sframes_t size;
          snd_pcm_uframes_t offset;
          size = meter->now - s16->old;
-       if (size < 0)
-               size += spcm->boundary;
+       if (size < 0) {
+               /**
+                * Application pointer adjusted for delay (meter->now) has dropped compared
+                * to the previous update cycle. Either spcm->boundary wraparound, pcm rewinding,
+                * or pcm restart without s16->old properly reset.
+                * In any case the safest solution is skipping this conversion cycle.
+                */
+               size = 0;
+       }
          offset = s16->old % meter->buf_size;
          while (size > 0) {
                  snd_pcm_uframes_t frames = size;



Please will you accept this (workaround) bugfix? If so, I would send a
proper patch.

It looks OK, at least this must be safe.
So yes, I'll happily apply if you submit a proper patch.

It would probably be better to compare against the boundary / 2 value to
detect the boundary wrap correctly, instead of dropping all negative size
values:

    if (size < 0) {
       if (size < -(spcm->boundary / 2))
          size += spcm->boundary;
       else
          size = 0;
    }

Is there a reliable way to detect the boundary wraparound, ideally using
some dedicated API? I could not find any; IMO the wraparound does not create
any notification. The check is OK for a rewind, since half of the boundary
is usually a very large number too. I am not sure what would happen at
reset when the application pointer was already past half of the boundary -
see below.

Yes, that's a good point. In that case, the s16->old value is not properly
synced during the reset operation; otherwise the boundary / 2 threshold
(change limit) is sufficient to detect the boundary wrap.
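
To make the threshold concrete, here is a minimal standalone sketch of the boundary / 2 check. The helper name frames_to_convert and the use of plain long / unsigned long in place of snd_pcm_sframes_t / snd_pcm_uframes_t are assumptions for illustration only; this is not alsa-lib code, and it assumes the boundary fits in a long.

```c
/*
 * Hypothetical helper, not part of alsa-lib: returns how many frames an
 * update cycle should convert, given the current and previous positions.
 * Plain long / unsigned long stand in for snd_pcm_sframes_t /
 * snd_pcm_uframes_t. Assumes boundary <= LONG_MAX.
 */
static long frames_to_convert(unsigned long now, unsigned long old,
                              unsigned long boundary)
{
    long size = (long)now - (long)old;

    if (size < 0) {
        if (size < -(long)(boundary / 2))
            size += (long)boundary;  /* large drop: treat as boundary wrap */
        else
            size = 0;                /* small drop: rewind or stale old, skip */
    }
    return size;
}
```

With an assumed boundary of 1UL << 30, a drop from 22751 to 839 yields 0 (the cycle is skipped), while a drop from boundary - 10 to 5 yields 15 (a genuine wrap).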

The "hidden" pcm restart referred to in the comment should not occur;
otherwise it's another bug somewhere.

I do not know the exact moments when plugin API methods are called. The
fact is that Takashi's suggestion to call the s16 reset explicitly in
snd_pcm_meter_reset produced this order:

snd_pcm_meter_reset -> s16->reset
s16_update: meter->now 22751, s16->old 22751, size 0
s16_update: meter->now 839, s16->old 22751, size -21912

I.e. AFTER resetting meter/s16 the variable meter->now was still at the
original large 22751 (with s16->old equal to its value due to
s16->reset). The value of meter->now was reset to 839 (= app pointer -
delay) only in the next call of s16_update, when s16->old still held the
previous value => size < 0 => huge size after size += spcm->boundary =>
high CPU load. From this I conclude that the reset is buggy. Maybe the
reset code should re-calculate meter->now = appl. pointer - delay before
aligning s16->old = meter->now.

Nevertheless all this (except for the boundary wraparound) would result
in the same size = 0, thus skipping samples from the last cycle, just
like what the proposed patch does.
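
The arithmetic behind that trace can be sketched as follows. Both helper names and the boundary value used in the example are hypothetical; only the now/old values (839 and 22751) come from the log above, and plain long / unsigned long stand in for the ALSA frame types.

```c
/* Hypothetical stand-in for the existing s16_update logic: every drop in
 * "now" is assumed to be a boundary wrap. */
static long size_current(unsigned long now, unsigned long old,
                         unsigned long boundary)
{
    long size = (long)now - (long)old;

    if (size < 0)
        size += (long)boundary;
    return size;
}

/* Hypothetical stand-in for the proposed workaround: a drop in "now"
 * skips the conversion cycle. */
static long size_patched(unsigned long now, unsigned long old)
{
    long size = (long)now - (long)old;

    if (size < 0)
        size = 0;
    return size;
}
```

For now = 839, old = 22751 and an assumed boundary of 1UL << 30, size_current() returns 1073719912 frames to convert, whereas size_patched() returns 0.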



Please can we reach a decision and close the problem so that affected
use cases do not have to be patched with the next alsa-lib version?

I think that this problem should be fixed for reset and rewind separately. The
meter->reset should be set in snd_pcm_meter_reset() inside the running_mutex
lock to correctly serialize the update operations in
snd_pcm_meter_thread(). And perhaps we can follow this logic for the rewind.

I mean, we should ensure to call the s16->reset at the proper time to avoid
broken old/now combinations inside the scope "clients".

Your proposed solution is just a workaround.

I am well aware of that. The main cause of the problem is that the
existing code assumes that any drop in the meter->now value is caused by
the pcm->boundary wraparound. The existing size += spcm->boundary code is
correct only for that particular case; for all other cases it is
grossly wrong, locking the thread for many tens of seconds and jamming
the CPU. If there were a callback or some other way to signal the boundary
wraparound, that "dangerous" code would be called only for that special
case (which is extremely rare in usual setups).

I do not know all the cases in which meter->now can drop. Reset, rewind,
any other (xrun)? If a single case is omitted, the same problem persists.

No problem with resetting where appropriate, but I would still suggest
not keeping size += spcm->boundary in s16_update as it is now.

Regards,

Pavel.