Making module-zeroconf-publish non-blocking

On Wed, 2013-05-15 at 14:06 +0300, Tanu Kaskinen wrote:
> On Wed, 2013-05-15 at 16:02 +0530, Arun Raghavan wrote:
> > On Wed, 2013-05-15 at 13:05 +0300, Tanu Kaskinen wrote:
> > > On Wed, 2013-05-15 at 10:13 +0530, Arun Raghavan wrote:
> > > > Hello,
> > > > This one is really intrusive (at least to m-z-publish), but deals with a
> > > > long-standing bug which is also a 4.0 blocker[1] where PulseAudio sometimes 
> > > > takes 20s for daemon startup to complete.
> > > 
> > > There's no evidence that module-zeroconf-publish has anything to do with
> > > 58758.
> > 
> > I can reproduce intermittent ~20s startup delays that are clearly in
> > module-zeroconf-publish, which is why I introduced this as a fix for it.
> 
> Sure, but the logs in 58758 do not have any trace of
> module-zeroconf-publish. That makes me pretty sure that the bug that
> you're seeing and 58758 are different bugs.

That's true. :/

> > It's easy enough to reproduce: just run this a few times and see when
> > m-z-publish gets loaded. avahi-daemon needs to not be running.
> > 
> >     pulseaudio -vvvv 2>&1 | grep zeroconf
> > 
> > > If the 20s delay is caused by some D-Bus message timing out, have you
> > > investigated what message is timing out and why? I don't think it's
> > 
> > It's a ping message to the Avahi daemon (which is not running), and the
> > timeout takes this long. The relevant code is:
> > 
> > http://git.0pointer.de/?p=avahi.git;a=blob;f=avahi-client/client.c#l564
> > 
> > > expected behaviour, so fixing this in PulseAudio is working around a bug
> > > elsewhere. I support any patches that remove blocking IO from the main
> > 
> > The default timeout is 25s. So a long block is not unexpected.
> 
> I don't think you can draw such a conclusion. You could say "so a long
> block is not *necessarily* unexpected". A ping operation should not take
> multiple seconds to process, but it may of course be that Avahi is busy
> with other stuff. If the design of Avahi is such that periods of
> unresponsiveness may happen during normal operation, then a long block
> during pinging can be sort of expected.

In general, I don't think we can make any bets about how long a D-Bus
service will take to reply. So not making blocking D-Bus calls in the
mainloop thread does make sense.
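
To illustrate the general idea (this is only a rough sketch, not the
actual patch): the blocking call moves to a worker thread and the main
loop polls for the result, so it stays free to do other work. Everything
named here (blocking_ping, run_mainloop, the delay and tick values) is
hypothetical and stands in for the real D-Bus and mainloop code.

```python
import queue
import threading
import time


def blocking_ping(delay):
    """Hypothetical stand-in for a blocking D-Bus call such as Ping()."""
    time.sleep(delay)
    return "pong"


def run_mainloop(delay=0.2, tick=0.01):
    """Run a toy main loop that never blocks on the ping itself.

    Returns the reply and the number of loop iterations that ran
    while the worker thread was still waiting for it.
    """
    results = queue.Queue()

    # The worker thread makes the blocking call and posts the reply.
    worker = threading.Thread(
        target=lambda: results.put(blocking_ping(delay)),
        daemon=True,
    )
    worker.start()

    ticks = 0
    while True:
        try:
            reply = results.get_nowait()  # non-blocking check for the reply
            break
        except queue.Empty:
            ticks += 1        # the main loop is free to service other work
            time.sleep(tick)  # stand-in for one iteration of real work

    worker.join()
    return reply, ticks


if __name__ == "__main__":
    reply, ticks = run_mainloop()
    print(reply, "after", ticks, "main loop iterations")
```

The point is only that the loop keeps iterating while the reply is
outstanding. A real implementation would wake the mainloop through its
own IO or defer events rather than polling a queue.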

Specifically, as you noted, the delay is because of activation, not
because Avahi is taking time to respond.

> > However,
> > I don't see the timeout take this long every time. I'm trying to find
> > out why this is.

Okay, so I know the source of the problem now. avahi_client_new() runs a
Ping() on the Avahi D-Bus service, expecting that to cause it to
autospawn if it isn't running. Avahi autospawn is currently broken due
to its activation file not having a User=? line.

The D-Bus activation logic marks an activation pending, then tries to
activate, fails because of this, and immediately returns. If the same
call is made again soon thereafter, it will block (since it thinks there
is an activation pending) until the pending activation times out (25s
from when the first activation was triggered).
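
A toy model of that behaviour, as I understand it, is below. It is not
the D-Bus daemon's actual code: ToyBus and ping() are made-up names, the
clock is passed in by hand, and I'm assuming that a call made after the
pending activation has expired simply starts a fresh (failing)
activation.

```python
ACTIVATION_TIMEOUT = 25.0  # seconds; the default timeout mentioned above


class ToyBus:
    """Hypothetical model of the pending-activation bookkeeping."""

    def __init__(self):
        self.pending_since = None  # time the pending activation started

    def ping(self, now):
        """Return how long a call made at time `now` would block, in seconds."""
        if self.pending_since is not None:
            deadline = self.pending_since + ACTIVATION_TIMEOUT
            if now < deadline:
                # An activation is still marked pending, so the caller
                # waits until that activation times out.
                return deadline - now
            # Assumption: the stale pending entry is dropped once it expires.
            self.pending_since = None

        # Nothing pending: mark an activation pending, try to spawn the
        # service, fail straight away (broken activation file), and return.
        self.pending_since = now
        return 0.0


if __name__ == "__main__":
    bus = ToyBus()
    for t in (0.0, 5.0, 30.0):
        print("call at t=%.0fs blocks for %.0fs" % (t, bus.ping(t)))
```

In this model a second call 5s after the first blocks for the remaining
20s, which matches the intermittent ~20s delays: whether you hit the
delay depends on whether an earlier failed activation is still pending.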

So that explains all the weird behaviour there.

> > > thread, though, so I'm fine with having the fix anyway, but pushing it
> > > to master at this point doesn't seem justified (not a regression, and
> > > the bug impact seems very limited).
> > 
> > This bug's been open downstream for 6 months now, and is biting a number
> > of users, so I'd like to fix it (which is also why I'd marked
> > it as a 4.0 blocker in the first place).
> 
> What bug has been open for 6 months? If you mean 58758, which has been

Downstream. https://bugs.gentoo.org/show_bug.cgi?id=441624

> open for a bit less than 5 months, I seriously doubt that fixing
> module-zeroconf-publish will get rid of the delays that the reporter is
> seeing. Or more accurately, the delays that Matthew Cope is seeing
> (there are several people commenting on the bug, but Matthew is the only
> one who has provided logs).

Since the issue is really down to the Avahi activation file, I agree.

-- Arun


