[PATCH] protocol-native: Fix srbchannel double-freeing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2014-09-18 11:40, Tanu Kaskinen wrote:
> On Thu, 2014-09-18 at 09:21 +0200, David Henningsson wrote:
>>
>> On 2014-09-18 09:08, Tanu Kaskinen wrote:
>>> On Tue, 2014-09-16 at 16:41 +0200, David Henningsson wrote:
>>>>
>>>> On 2014-09-16 13:22, Tanu Kaskinen wrote:
>>>>> On Tue, 2014-09-16 at 08:34 +0200, David Henningsson wrote:
>>>>>>
>>>>>> On 2014-09-14 21:06, Tanu Kaskinen wrote:
>>>>>>> I was able to trigger a crash with some unfinished code that caused
>>>>>>> a protocol error. The error was handled while
>>>>>>> pa_pstream_set_srbchannel() had not yet finished, so
>>>>>>> pa_native_connection.srbpending and pa_pstream.srbpending were both
>>>>>>> non-NULL. During the error handling, native_connection_unlink() freed
>>>>>>> the srbchannel, and then pa_pstream_unlink() tried to free it for the
>>>>>>> second time, causing an assertion error in mainloop_io_free().
>>>>>>
>>>>>> Ok, thanks for finding!
>>>>>>
>>>>>>> ---
>>>>>>>      src/pulsecore/protocol-native.c | 14 +++++++++++++-
>>>>>>>      1 file changed, 13 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/src/pulsecore/protocol-native.c b/src/pulsecore/protocol-native.c
>>>>>>> index 6ec65d6..012c41a 100644
>>>>>>> --- a/src/pulsecore/protocol-native.c
>>>>>>> +++ b/src/pulsecore/protocol-native.c
>>>>>>> @@ -2639,13 +2639,25 @@ static void setup_srbchannel(pa_native_connection *c) {
>>>>>>>
>>>>>>>      static void command_enable_srbchannel(pa_pdispatch *pd, uint32_t command, uint32_t tag, pa_tagstruct *t, void *userdata) {
>>>>>>>          pa_native_connection *c = PA_NATIVE_CONNECTION(userdata);
>>>>>>> +    pa_srbchannel *tmp;
>>>>>>>
>>>>>>>          if (tag != (uint32_t) (size_t) c->srbpending)
>>>>>>>              protocol_error(c);
>>>>>>>
>>>>>>>          pa_log_debug("Client enabled srbchannel.");
>>>>>>> -    pa_pstream_set_srbchannel(c->pstream, c->srbpending);
>>>>>>> +
>>>>>>> +    /* We must set c->sbpending to NULL before calling
>>>>>>> +     * pa_pstream_set_srbchannel(), because otherwise there is a time window
>>>>>>> +     * when both pa_native_connection and pa_pstream think they own the
>>>>>>> +     * srbchannel. pa_pstream_set_srbchannel() may read and dispatch arbitrary
>>>>>>> +     * commands from the client, and if a protocol error occurs, the connection
>>>>>>> +     * is torn down, and if c->sbpending has not yet been set to NULL here, the
>>>>>>> +     * srbchannel will be double-freed. XXX: Can we somehow avoid reading and
>>>>>>> +     * dispatching commands in pa_pstream_set_srbchannel()? Processing commands
>>>>>>> +     * recursively seems prone to bugs. */
>>>>>>> +    tmp = c->srbpending;
>>>>>>>          c->srbpending = NULL;
>>>>>>> +    pa_pstream_set_srbchannel(c->pstream, c->srbpending);
>>>>>>
>>>>>> You probably mean "tmp" here instead of "c->srbpending"...
>>>>>
>>>>> *facepalm*
>>>>>
>>>>>> Anyhow, I think it would be better to try to resolve your concern by
>>>>>> implementing a deferred event. Because, as you say, processing commands
>>>>>> recursive seems prone to bugs. I e, pa_srbchannel_set_callback would not
>>>>>> call srbchannel_rwloop, but instead set up a deferred event which would
>>>>>> just call srbchannel_rwloop.
>>>>>>
>>>>>> Does that make sense?
>>>>>
>>>>> Hard to say, because it's not clear to me why srbchannel_rwloop() is
>>>>> called in the first place. If it's moved to a defer event, there will be
>>>>> "stuff" happening between setting the callback and dispatching the defer
>>>>> event. It's not clear to me what that "stuff" can include. Can it
>>>>> include something that should not be done before srbchannel_rwloop() has
>>>>> been called?
>>>>
>>>> srbchannel_rwloop is called to handle this potential scenario:
>>>>
>>>>     1) peer A sends srbchannel switch command to peer B
>>>>     2) peer A writes something to the srbchannel
>>>>     3) peer B receives srbchannel command and sets up the srbchannel
>>>>     4) If srbchannel_rwloop is not called by peer B at this point, the
>>>> message written in 2) will not be picked up by peer B in a timely fashion.
>>>
>>> Why is it not picked up in a timely fashion? You have an fd that will
>>> normally signal POLLIN when there is data to be read. Is there some
>>> reason that doesn't work if there is data to be read already at
>>> initialization time?
>>
>> As an optimisation, the pa_fdsem layer makes sure that nothing is
>> written to the fd, if nothing is waiting on the other side. And the io
>> event waits for things to be written to the fd.
>
> Ok. I think the patch is good. It would be nice to have a comment
> explaining what you explained above, though.

Thanks, pushed now (after adding a comment).

-- 
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic


[Index of Archives]     [Linux Audio Users]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux