Re: [PATCH RFC RFC] rds: Use NETDEV_UNREGISTER in rds_tcp_dev_event() (then kill NETDEV_UNREGISTER_FINAL)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 16.03.2018 20:29, Sowmini Varadhan wrote:
> 
> I had taken some of this offline, but it occurs to me
> that some of these notes should be saved to the netdev archives, 
> in case this question pops up again in a few years.
> 
> When I run your patch, I get a repeatable panic by doing
>   modprobe rds-tcp
>   ip netns create blue
> the panic is because we are finding a null trn in rds_tcp_init_net.

I did the second version and sent you. Have you tried it?

> I think there's something very disturbed about calling
> register_pernet_operations() twice, once via 
> register_pernet_device() and again via register_pernet_subsys().

Calling netdevice handler for every event is more disturbing,
as number of events is several times bigger, than one more
pernet exit method.

> I suspect this has everything to do with the panic but I have
> not had time to debug every little detail here.

You speak in the way I forced you to spend a lot of time debugging
my patches. But I sent you only one with a warn it's not tested.
(but twice -- since the second time I used --patience diff option
to make idea more visible for you). It's strange to hear this from
you.

> In general, rds_tcp is not a network device, it is a kernel
> module.  That is the fundamental problem here. 
> 
> To repeat the comments form net_namespace.h:
>  * Network interfaces need to be removed from a dying netns _before_
>  * subsys notifiers can be called, as most of the network code cleanup
>  * (which is done from subsys notifiers) runs with the assumption that
>  * dev_remove_pack has been called so no new packets will arrive during
>  * and after the cleanup functions have been called.  dev_remove_pack
>  * is not per namespace so instead the guarantee of no more packets
>  * arriving in a network namespace is provided by ensuring that all
>  * network devices and all sockets have left the network namespace
>  * before the cleanup methods are called.
> 
> when the "blue" netns starts up, it creates at least one kernel listen
> socket on *.16385. This socket, and any other child/client sockets 
> created must be cleaned up before the cleanup_net can happen.
> 
> This is why I chose to call regster_pernet_subsys. Again, as per
> comments in net_namespace.h:
> 
>  * Use these carefully.  If you implement a network device and it
>  * needs per network namespace operations use device pernet operations,
>  * otherwise use pernet subsys operations.

This is not a strict rule, and currently I'm working on pernet_operations
synchronization. This commentary is for generic code, while your rds
is differ from the rest of ordinary pernet_operations.

> On (03/16/18 18:51), Kirill Tkhai wrote:
>>> Let's find another approach. Could you tell what problem we have in 
>>> case of rds_tcp_dev_ops is declared as pernet_device?
> 
> As above, rds-tcp is not a network device.

It's not an agrument. It's not a rule, what you read in the comment.
We need to use rules of reasonableness, and the generic rules do not
fit to your driver. So, it's need to search a solution. Be the only
user NETDEV_UNREGISTER_FINAL in the kernel does not look a good one.

>> One more question. Which time we take a reference of loopback device?
>> Is it possible before we created a net completely?
> 
> We dont take a reference on the loopback device. 
> We make sure none of the kernel sockets does a get_net() so
> that we dont block the cleanup_net, and then, when all
> the network interfaces have been taken down (loopback is
> the last one) we know there are no more packets coming in
> and out, so it is safe to dismantle all kernel sockets 
> created by rds-tcp.

So that, if RDS packets does not act on netdev refcount,
which circular dependence did you mean in your second
message (quote below)?

>As I recall, when I wrote the initial patchset, my problem
>was that in order to let the module unload make progress,
>all these sockets had to be cleaned up. But to clean up these
>sockets, net_device cleanup had to complete (should not have
>any new incoming connections to the listen endpoint on a 
>non-loopback socket) so I ended up with a circular dependancy.

Kirill
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux