On Thu, 19 May 2016, at 08:02 PM, Pierre Ossman wrote: > On Wed, 18 May 2016 15:40:55 +0200 > Pierre Ossman <ossman at cendio.se> wrote: > > > I suspect the bug is in the memblockq code somewhere. It seems like it > > wedges itself somehow. The "missing" state variable is something I'm > > looking at with suspicion. > > > > This wasn't the problem. The trace revealed that we do not handle > changes to minreq properly. The scenario is: > > 1. Large latency, large buffer, large target fill, large minimum > request. Silence in queue (i.e. buffer is full). > > 2. Buffer drains slightly, making it fall below target fill. It is > however still below the minimum request, so nothing is sent to the > client. > > 3. The client requests a reduced latency, buffer is reduced, target > fill is reduced, minimum request is reduced. The buffer now greatly > exceeds target fill as it was almost up to the previous target fill > level. This means that the server will not be asking the client for > more data for a while. > > 4. Some time later we've drained most of the excess and are almost > back down to the target fill level. However the data requested in 2 is > sufficiently large that we never fall back down below target fill. > Hence we never start requesting for more data. And we already decided > in 2 not to send a request for the first portion. > > I've attached a patch that moves the minreq handling in to memblockq, > which should avoid this and any similar issues as there is now a single > place where things are throttled for minreq. > > This will affect all callers of pa_memblockq_pop_missing() and perturb > things a bit. I can't see it being fundamentally worse or better > though, just different. Besides fixing the bugs of course. Has this been tested with different prebuf values in buffer_attr? That seems to be one change that might affect other callers. Ideally we should also just add a test to the check-daemon suite to cover this and maybe a long-running case for the bug you found. > > Why do we even have that? Can't we derive that from the other > > variables? Having redundant state is just asking for things to get out > > of sync and for bugs to appear. > > > > Patch added for this as well. The memblockq code makes my head spin, > and killing of "missing" and santising "requested" makes it a bit > easier to comprehend. > > Please review as I'd really like this fix to start trickling out to the > distributions sooner rather than later. I've not been able to look through this, will try to do that a bit later. Cheers, Arun