On 4/20/20 3:09 PM, Mark Asselstine wrote:
On Monday, April 20, 2020 6:11:25 A.M. EDT Michal Privoznik wrote:
On 4/16/20 5:57 PM, Mark Asselstine wrote:
The udev monitor thread "udevEventHandleThread()" will lag the
actual/real view of devices in sysfs as it serially processes udev
monitor events. So for instance if you were to run the following cmd
to create a new veth pair and rename one of the veth endpoints
you might see the following monitor events and real world that looks like
time
|
| create v0 sysfs entry
wake udevEventHandleThread | create v1 sysfs entry
udev_monitor_receive_device(v1-add) | move v0 sysfs to v2
udevHandleOneDevice(v1) |
udev_monitor_receive_device(v0-add) |
udevHandleOneDevice(v0) | <--- error msgs in
virNetDevGetLinkInfo() udev_monitor_receive_device(v2-move) | as v0
no longer exists udevHandleOneDevice(v2) |
\/
As you can see the changes in sysfs can take place well before we get
to act on the events in the udevEventHandleThread(), so by the time we
get around to processing the v0 add event, the sysfs entry has been
moved to v2.
To work around this we check if the sysfs entry is valid before
attempting to read it and don't bother trying to read link info if
not. This is safe since we will never read sysfs entries earlier than
it existing, ie. if the entry is not there it has either been removed
in the time since we enumerated the device or something bigger is
busted, in either case, no sysfs entry, no link info. In the case
described above we will eventually get the link info as we work
through the queue of monitor events and get to the 'move' event.
https://bugzilla.redhat.com/show_bug.cgi?id=1557902
Signed-off-by: Mark Asselstine <mark.asselstine@xxxxxxxxxxxxx>
---
src/util/virnetdev.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
index b465bdac2e..bf256cbe35 100644
--- a/src/util/virnetdev.c
+++ b/src/util/virnetdev.c
@@ -2438,6 +2438,17 @@ virNetDevGetLinkInfo(const char *ifname,
if (virNetDevSysfsFile(&path, ifname, "operstate") < 0)
goto cleanup;
+ /* The device may have been removed or moved by the time we got here.
+ * Obviously attempting to get LinkInfo on a no longer existing
device
+ * is useless, so stop processing. If we got here via the udev
monitor
+ * a remove or move event will follow and we will be able to get
valid
+ * LinkInfo at that time */
+ if (!virFileExists(path)) {
+ VIR_WARN("The interface '%s' was removed before we could query
it.",
+ ifname);
This is not aligned.
Should I send a V2 or can this be correct when merged?
No need, I've fixed it (also changed goto cleanup to return, since I've
merged a patch that made 'cleanup' label go away this morning) and pushed.
+ goto cleanup;
+ }
But I have another idea to consider. We could return a different error
value here (say -2) and then have the caller decide what to do with it
(e.g. print the warning).
I had thought about propagating the error so the caller could possibly wrap
the warning message in more context but in the end I decided to call it out at
the issue site avoiding having each caller essentially reproducing essentially
the same code.
Yeah, what convinced me was that non-Linux version of this function
simply prints a debug message and returns 0. I haven't realized that
earlier, sorry.
Michal