On 7/22/20 12:46 PM, Daniel P. Berrangé wrote:
On Wed, Jul 22, 2020 at 11:39:55AM +0200, Michal Privoznik wrote:
When building domain's private /dev in a namespace, libdevmapper
is consulted for getting full dependency tree of domain's disks.
The reason is that for a multipath devices all dependent devices
must be created in the namespace and allowed in CGroups.
However, this approach is very fragile as building of namespace
happens in the forked off child process, after mass close of FDs
and just before dropping privileges and execing QEMU. And it so
happens that when calling libdevmapper APIs, one of them opens
/dev/mapper/control and saves the FD into a global variable. The
FD is kept open until the lib is unlinked or dm_lib_release() is
called explicitly. We are doing neither.
This is not a problem when calling the function from libvirtd
(when setting up CGroups), but it is a problem when called from
the pre-exec hook because we leak the FD into QEMU.
Fixes: a30078cb832646177defd256e77c632905f1e6d0
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1858260
Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx>
---
src/util/virdevmapper.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c
index 40a82285f9..1c216fb6c1 100644
--- a/src/util/virdevmapper.c
+++ b/src/util/virdevmapper.c
@@ -156,6 +156,7 @@ virDevMapperGetTargetsImpl(const char *path,
virStringListFree(recursiveDevPaths);
virStringListFree(devPaths);
dm_task_destroy(dmt);
+ dm_lib_release();
return ret;
}
Hmm, this function isn't threadsafe though, so I'm kind of worried about
us breaking parallel callers.
For that matter dm_task_run isn't thread safe when it opens the
FD in the first place either AFAICT.
This libdevice-mapper.so looks like a bit of a disaster in general, as
I can't see mutexes acquired anywhere in the public APIs and it has a
load of static global variables including this FD :-(
We could put mutex locking around our own virDevMapperGetTargetsImpl
if we blindly assume nothing else we call may secretly be using
libdevice-mapper too.
Makes me wonder though if there's any way we can obtain the info we need
without touching libdevice-mapper.so at all.
This is sort of resolved in the rest of the series. If we drop this
patch then after patch 17 which moves populating /dev with domain disks
into libvirtd rather than forked child, we don't need to call
dm_lib_release(). But opening will still be unsafe, yes.
Well, looks like devmapper is doing a single ioctl (if I don't count
VERSION):
ioctl(3</dev/mapper/control>, DM_VERSION, {version=4.0.0,
data_size=16384, flags=DM_EXISTS_FLAG} => {version=4.42.0,
data_size=16384, flags=DM_EXISTS_FLAG}) = 0
ioctl(3</dev/mapper/control>, DM_TABLE_DEPS, {version=4.0.0,
data_size=16384, data_start=312, name="myusb", flags=DM_EXISTS_FLAG} =>
{version=4.42.0, data_size=328, data_start=312, dev=makedev(0xfd, 0x1),
name="myusb", uuid="CRYPT-LUKS1-12fa3b2b646d43eb82ffd8f671515f93-myusb",
target_count=1, open_count=0, event_nr=0,
flags=DM_EXISTS_FLAG|DM_ACTIVE_PRESENT_FLAG, ...}) = 0
so maybe we can use ioctl() ourselves and drop libdevmapper completely?
Michal