On 02/14/2011 05:22 PM, Stefan Berger wrote:
On 02/14/2011 03:30 PM, Stefan Berger wrote:
On 02/14/2011 02:51 PM, Daniel P. Berrange wrote:
This approach feels like a nasty hack to me and potentially still leaves
us with a problem in netcf, which is also using netlink sockets. I think
we need to get a clearer picture of what the root cause is before going
for this kind of patch.
Correct, I am 'fixing' this in the wrong place. The issue is in the
call sequence
    nl_handle = nl_handle_alloc()
    nl_connect(nl_handle, NETLINK_ROUTE)
where the second call fails even though its only input comes from the
first one. These are obviously two libnl calls. Something is either not
using libnl or not using it correctly.
Thanks for pointing out netcf. I looked at libnetcf code and found
this sequence here:
[...]
int netlink_init(struct netcf *ncf) {
    ncf->driver->nl_sock = nl_handle_alloc();
    if (ncf->driver->nl_sock == NULL)
        goto error;
    if (nl_connect(ncf->driver->nl_sock, NETLINK_ROUTE) < 0) {
        goto error;
    }
This seems to be doing the same as I do. Maybe there is yet
'something else' that's using netlink sockets.
What's also strange is that the first 'virsh start' still works, but
the subsequent 'virsh destroy' then does not.
One definite problem in libnl is that the 'port allocation'
(generate_local_port()) is not thread-safe. I think it's the library's
responsibility to lock, rather than libvirt introducing a lock that we
need to grab before calling into netcf and again in macvtap. Unless
libnl fixes this, I believe there will be no other way than retrying.
One thread will eventually bind and exclude a concurrent thread from
binding.
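A rough sketch of what I mean by retrying, in plain C. The callback
types and the fake_* stubs are hypothetical and only stand in for the
libnl calls (nl_handle_alloc(), nl_connect() and the matching destroy
call); this is not tested against libnl itself.

```c
#include <stddef.h>

/* Hypothetical callback types standing in for the libnl calls. */
typedef void *(*alloc_fn)(void);
typedef int (*connect_fn)(void *handle);
typedef void (*free_fn)(void *handle);

/* Allocate and connect, retrying with a fresh handle on failure.
 * Returns the connected handle, or NULL after max_tries attempts. */
static void *connect_with_retry(alloc_fn do_alloc, connect_fn do_connect,
                                free_fn do_free, int max_tries)
{
    int i;

    for (i = 0; i < max_tries; i++) {
        void *h = do_alloc();
        if (h == NULL)
            return NULL;        /* allocation failure is not retried */
        if (do_connect(h) >= 0)
            return h;           /* bound successfully */
        do_free(h);             /* drop the handle and try again */
    }
    return NULL;
}

/* Stubs for illustration only: the first two connect attempts fail. */
static int fake_token;
static int fake_attempts;
static int fake_frees;

static void *fake_alloc(void) { return &fake_token; }
static int fake_connect(void *h) { (void)h; return ++fake_attempts > 2 ? 0 : -1; }
static void fake_free(void *h) { (void)h; fake_frees++; }
```

Calling connect_with_retry(fake_alloc, fake_connect, fake_free, 5) with
these stubs succeeds on the third attempt after freeing two handles. A
bounded retry count keeps a persistent failure from looping forever.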
It's late but this doesn't look right even now in libnl
(libnl-debuginfo-1.1-12.fc14.x86_64):
port allocation (socket.c; line 134):
    used_ports_map[i] |= (1UL << n);
- that's going to set a bit
port deallocation (socket.c; line 156):
    used_ports_map[nr / 32] &= ~((nr % 32) + 1);
- that's going to produce garbage: it masks with the number
  (nr % 32) + 1 instead of with the single bit for nr; no wonder
  things don't work
    used_ports_map[nr / 32] &= ~(1 << (nr % 32));
or
    used_ports_map[nr / 32] &= ~(1 << (nr & 0x1f));
- would probably be much better
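To make the difference concrete, here is a small standalone sketch. The
array ports_map and the helper names are hypothetical stand-ins, not
libnl's code; only the two mask expressions are taken from the lines
quoted above. I use an unsigned constant for the shift so that bit 31 is
handled safely as well.

```c
#include <stdint.h>

/* Hypothetical stand-in for libnl's used_ports_map; the size is assumed. */
static uint32_t ports_map[32];

/* Mark port number nr as used by setting its bit. */
static void port_set(unsigned nr)
{
    ports_map[nr / 32] |= (UINT32_C(1) << (nr % 32));
}

/* Release using the expression quoted from socket.c line 156:
 * this masks with the value (nr % 32) + 1, not with bit nr % 32. */
static void port_clear_buggy(unsigned nr)
{
    ports_map[nr / 32] &= ~((nr % 32) + 1);
}

/* Release using the proposed expression: clears exactly one bit. */
static void port_clear_fixed(unsigned nr)
{
    ports_map[nr / 32] &= ~(UINT32_C(1) << (nr % 32));
}

/* Return 1 if port number nr is marked as used, 0 otherwise. */
static int port_is_set(unsigned nr)
{
    return (ports_map[nr / 32] >> (nr % 32)) & 1u;
}
```

For example, with ports 2 and 3 marked, releasing 3 through the quoted
expression masks with ~4, which clears bit 2 and leaves bit 3 set. Port
3 stays marked as used while port 2 is wrongly marked as free.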
Stefan