Re: [PATCH net] sctp: fix race on protocol/netns initialization

Vlad Yasevich <vyasevich@xxxxxxxxx> · Thu, 10 Sep 2015 15:14:41 -0400

On 09/10/2015 02:35 PM, Marcelo Ricardo Leitner wrote:
> On Thu, Sep 10, 2015 at 01:24:54PM -0300, Marcelo Ricardo Leitner wrote:
>> On Thu, Sep 10, 2015 at 11:50:06AM -0400, Vlad Yasevich wrote:
>>> On 09/10/2015 10:22 AM, Marcelo Ricardo Leitner wrote:
>>>> Em 10-09-2015 10:24, Vlad Yasevich escreveu:
>> ...
>>>>> Then you can order sctp_net_init() such that it happens first, then protosw registration
>>>>> happens, then control socket initialization happens, then inet protocol registration
>>>>> happens.
>>>>>
>>>>> This way, we are always guaranteed that by the time user calls socket(), protocol
>>>>> defaults are fully initialized.
>>>>
>>>> Okay, that works for module loading stage, but then how would we handle new netns's? We
>>>> have to create the control socket per netns and AFAICT sctp_net_init() is the only hook
>>>> called when a new netns is being created.
>>>>
>>>> Then if we move it a workqueue that is scheduled by sctp_net_init(), we loose the ability
>>>> to handle its errors by propagating through sctp_net_init() return value, not good.
>>>
>>> Here is kind of what I had in mind.  It's incomplete and completely untested (not even
>>> compiled), but good enough to describe the idea:
>> ...
>>
>> Ohh, ok now I get it, thanks. If having two pernet_subsys for a given
>> module is fine, that works for me. It's clearer and has no moment of
>> temporary failure.
>>
>> I can finish this patch if everybody agrees with it.
>>
>>>>>> I used the list pointer because that's null as that memory is entirely zeroed when alloced
>>>>>> and, after initialization, it's never null again. Works like a lock/condition without
>>>>>> using an extra field.
>>>>>>
>>>>>
>>>>> I understand this a well.  What I don't particularly like is that we are re-using
>>>>> a list without really stating why it's now done this way.  Additionally, it's not really
>>>>> the last that happens so it's seems kind of hacky...  If we need to add new
>>>>> per-net initializers, we now need to make sure that the code is put in the right
>>>>> place.  I'd just really like to have a cleaner solution...
>>>>
>>>> Ok, got you. We could add a dedicated flag/bit for that then, if reusing the list is not
>>>> clear enough. Or, as we are discussing on the other part of thread, we could make it block
>>>> and wait for the initialization, probably using some wait_queue. I'm still thinking on
>>>> something this way, likely something more below than sctp then.
>>>>
>>>
>>> I think if we don the above, the second process calling socket() will either find the
>>> the protosw or will try to load the module also.  I think either is ok after
>>> request_module returns we'll look at the protosw and will find find things.
>>
>> Seems so, yes. Nice.
> 
> I was testing with it, something is not good. I finished your patch and
> testing with a flooder like:
>  # for j in {1..5}; do for i in {1234..1280}; do \
>        sctp_darn -H 192.168.122.147 -P $j$i -l & done & done
> 
... snip...
> 
> It seems that request_module will not serialize it as we wanted and we
> would be putting unexpected pressure on it, yet it fixes the original
> issue.

So, wouldn't the same issue exist when running the above with DCCP sockets?

> Maybe we can place a semaphore at inet_create(), protecting the
> request_module()s so only one socket can do it at a time and, after it
> is released, whoever was blocked on it re-checks if the module isn't
> already loaded before attempting again. It makes the loading of
> different modules slower, though, but I'm not sure if that's really a
> problem. Not many modules are loaded at the same time like that. What do
> you think? 

I think this is a different issue.  The fact that we keep trying to probe
the same module is silly.  May be a per proto semaphore so that SCTP doesn't
block DCCP for example.

-vlad

> 
>   Marcelo
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html