On 02/25/2016 10:25 AM, Cole Robinson wrote: > On 02/24/2016 09:31 AM, Pavel Hrdina wrote: >> Commit 35847860 introduced virFileUnlink() to fix an issue with deleting >> volumes on NFS root-squashed environment. This patch replace the uid >> and gid magic by virFileIsSharedFSType() to correctly detect that the >> volume is on NFS storage. >> >> This fixes the referenced bug. To reproduce this bug follow those >> steps on a domain with local storage: >> >> virsh start $domain >> virsh pool-refresh $pool >> virsh destroy $domain >> virsh vol-delete $volume $pool >> >> The thing is, that the pool-refresh will store qemu:qemu as uid:gid for >> that volume and after destroy the volume is relabeled back to root:root. >> Then you run vol-delete and the virFileRemove() function will try to >> unlink the file as a qemu:qemu process based on the uid and gid magic. >> >> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1260356 >> > > Good catch! Sounds like this fedora bug: > > https://bugzilla.redhat.com/show_bug.cgi?id=1289327 > > And this RHEL bug that jferlan has been looking at: > > https://bugzilla.redhat.com/show_bug.cgi?id=1293804 > >> Signed-off-by: Pavel Hrdina <phrdina@xxxxxxxxxx> >> --- >> src/util/virfile.c | 9 ++++++--- >> 1 file changed, 6 insertions(+), 3 deletions(-) >> >> diff --git a/src/util/virfile.c b/src/util/virfile.c >> index f45e18f..f9c5bb1 100644 >> --- a/src/util/virfile.c >> +++ b/src/util/virfile.c >> @@ -2334,13 +2334,16 @@ virFileRemove(const char *path, >> int status = 0, ret = 0; >> gid_t *groups; >> int ngroups; >> + int rc; >> >> /* If not running as root or if a non explicit uid/gid was being used for >> * the file/volume or the explicit uid/gid matches, then use unlink directly >> */ The comment would need an update... But understanding the history may help you make the right choice for a solution. I wouldn't doubt things are related somehow, but I wonder if that pool-refresh does something "unexpected" - that may be the key. >> - if ((geteuid() != 0) || >> - ((uid == (uid_t) -1) && (gid == (gid_t) -1)) || >> - (uid == geteuid() && gid == getegid())) { >> + rc = virFileIsSharedFSType(path, VIR_FILE_SHFS_NFS); >> + if (rc < 0) >> + return -EINVAL; >> + >> + if (rc == 0) { >> if (virFileIsDir(path)) >> return rmdir(path); >> else >> > > The original code is trying to check for two cases: > > - The daemon is unprivileged. In that case we don't want to try any of the > setuid stuff because we know it's not going to work. > - The file is owned by someone other than the daemon's uid/gid. If so, we want > to do the setuid stuff to try and make NFS root squash work. > > This change discards the first check, which we want to preserve. > > Regarding the second check, is NFS the only reason it exists? Maybe John or > Laine know more. > It's a sordid history... Last one to make a change will own it, tag you're it ;-) Start at commit id 'c6b32d68' and follow into the original commit '35847860'. There were 2 other related commits '691dd388' and 'db9277a39'... All found when testing using the NFS root squash environment. As I recall part of the issue points back to commit id '7c2d65dde' which altered things to handle "non default" protection and mode bits. So why the uid/gid checking - well it's a monkey see, monkey do reaction (see virDirCreate and virFileOpenAs). When it was forefront in my mind - I thought I had worked out all the details. I wouldn't be surprised if there's a nuance I missed. It's all the little details that will keep causing issues... FWIW: The deletion code led me into a different maze of twisty little passages dealing with error paths and trying to properly delete the volume if some "failure" occurs after successfully creating the volume. There's a 7 patch series starting at commit id '9cb36def8' and a 6 patch series start at commit id '77346f27'. There's also commits 'ce6506b0' (a forceful way to set the mode bits) and '5e3ad0b7' (use virFileOpenAs for virStorageBackendVolOpen opens). So given all that I wonder if the pool-refresh might be the "culprit". In the text of the case I have, it's mentioned that vol-dumpxml seems clear things such that a subsequent delete works. Both would have the "need" to "open" the volume and in the refresh case, the code adjusts permissions and mode! Note it's not virFileOpenAs... I think what I would find interesting is knowing whether the sequence of commands above was done as root or as a user. Or trying them as both, just to be sure. Then between each sequence, check the permissions and mode of the files directly (it'd be nice to know the before create and after create as well). It may also be worthwhile to know the pool permissions and mode values as well as if the volume creation provided XML that had specific permission/mode values set. Finally, what I started wondering with the bz I have assigned to me is that at least one person reports running under non-root, creating a volume in what ends up being the default pool directory (/var/lib/libvirt/images) and having a problem deleting the volume. In order to do this they have set up permission in the libvirt group ("which is some magic provided by the Debian package"). That got me to thinking - what "right" does user X have to create a file in the default pool? Well the magic of being in the libvirt group seems to allow it (if you've used Polkit to do that)... Then which uid:gid and mode are used in order to create that file. It wasn't clear to me whether any of the tools were providing the uid, gid, & mode in the XML. I've just been deep into other code and figured the bz I have would pop to the top of the work stack eventually. John -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list