On Nov 3, 2014, at 9:44 AM, Benjamin Coddington <bcodding@xxxxxxxxxx> wrote: > > > On Thu, 30 Oct 2014, Chuck Lever wrote: > >> >> On Oct 30, 2014, at 12:52 PM, Benjamin Coddington <bcodding@xxxxxxxxxx> wrote: >> >>> >>> >>> On Thu, 30 Oct 2014, Chuck Lever wrote: >>> >>>> >>>> On Oct 30, 2014, at 12:08 PM, Benjamin Coddington <bcodding@xxxxxxxxxx> wrote: >>>> >>>>> >>>>> >>>>> On Thu, 30 Oct 2014, Chuck Lever wrote: >>>>> >>>>>> >>>>>> On Oct 30, 2014, at 10:53 AM, Benjamin Coddington <bcodding@xxxxxxxxxx> wrote: >>>>>> >>>>>>> >>>>>>> On Wed, 29 Oct 2014, Chuck Lever wrote: >>>>>>> >>>>>>>> Hi Ben- >>>>>>>> >>>>>>>> On Oct 29, 2014, at 7:27 PM, Benjamin Coddington <bcodding@xxxxxxxxxx> wrote: >>>>>>>> >>>>>>>>> Hi Chuck, I'll jump in here if you don't mind. >>>>>>>>> >>>>>>>>> How's this work for missing keyctl_invalidate: >>>>>>>>> >>>>>>>>> diff --git a/configure.ac b/configure.ac >>>>>>>>> index 59fd14d..8295bed 100644 >>>>>>>>> --- a/configure.ac >>>>>>>>> +++ b/configure.ac >>>>>>>>> @@ -270,6 +270,9 @@ AC_CHECK_LIB([crypt], [crypt], [LIBCRYPT="-lcrypt"]) >>>>>>>>> >>>>>>>>> AC_CHECK_LIB([dl], [dlclose], [LIBDL="-ldl"]) >>>>>>>>> >>>>>>>>> +AC_CHECK_LIB([keyutils], [keyctl_invalidate], ,[ >>>>>>>>> + AC_DEFINE([MISSING_KEYCTL_INVALIDATE], [1], [Define to use >>>>>>>>> keyctl_revoke instead])]) >>>>>>>> >>>>>>>> Nit: I would just add >>>>>>>> >>>>>>>> AC_CHECK_FUNCS([keyctl_invalidate]) >>>>>>>> >>>>>>>> in aclocal/keyutils.m4 to define HAVE_KEYCTL_INVALIDATE . >>>>>>> >>>>>>> Yes, that is better. >>>>>>> >>>>>>>>> + >>>>>>>>> if test "$enable_nfsv4" = yes; then >>>>>>>>> dnl check for libevent libraries and headers >>>>>>>>> AC_LIBEVENT >>>>>>>>> diff --git a/utils/nfsidmap/nfsidmap.c b/utils/nfsidmap/nfsidmap.c >>>>>>>>> index e0d31e7..ab4b10c 100644 >>>>>>>>> --- a/utils/nfsidmap/nfsidmap.c >>>>>>>>> +++ b/utils/nfsidmap/nfsidmap.c >>>>>>>>> @@ -14,6 +14,7 @@ >>>>>>>>> #include <unistd.h> >>>>>>>>> #include "xlog.h" >>>>>>>>> #include "conffile.h" >>>>>>>>> +#include “config.h" >>>>>>>>> >>>>>>>>> int verbose = 0; >>>>>>>>> char *usage="Usage: %s [-v] [-c || [-u|-g|-r key] || [-t timeout] key >>>>>>>>> desc]"; >>>>>>>>> @@ -23,6 +24,10 @@ char *usage="Usage: %s [-v] [-c || [-u|-g|-r key] || >>>>>>>>> [-t timeout] key desc]"; >>>>>>>>> #define USER 1 >>>>>>>>> #define GROUP 0 >>>>>>>>> >>>>>>>>> +#ifdef MISSING_KEYCTL_INVALIDATE >>>>>>>>> +#define keyctl_invalidate(key) keyctl_revoke(key) >>>>>>>>> +#endif >>>>>>>>> + >>>>>>>>> #define PROCKEYS "/proc/keys" >>>>>>>>> #ifndef DEFAULT_KEYRING >>>>>>>>> #define DEFAULT_KEYRING "id_resolver" >>>>>>>>> >>>>>>>>> ^^^ that's a little ugly -- it doesn't try to figure out what should be >>>>>>>>> done in the kernel to clean up keys. It assumes that if your >>>>>>>>> libkeyutils has keyctl_invalidate then that's what you should use. >>>>>>>> >>>>>>>> This looks like it fixes the build issue. I think we do >>>>>>>> want late-model nfs-utils to build correctly on older >>>>>>>> distributions. >>>>>>>> >>>>>>>> I’m not sure keyctl_revoke and keyctl_invalidate do >>>>>>>> precisely the same thing, though? On older systems can >>>>>>>> we expect a change from one to the other to have no >>>>>>>> impact? (Just beginning to explore this issue). >>>>>>> >>>>>>> For EL6 kernels, you should be good with keyctl_revoke. That's the only >>>>>>> thing you can do - there's no key_invalidate. >>>>>>> >>>>>>> But on later kernels, you'd want to use key_invalidate. >>>>>> >>>>>> I realize that EL6 user space is not designed to support >>>>>> newer kernels, but some distributions allow continuous >>>>>> upgrades of kernels. If the kernel API changes over time, >>>>>> then IMO user space tools need to be sensitive to what >>>>>> kernel is running. >>>>> >>>>> It would be a lot of work to continually backport adjustments to >>>>> utilities across the supported/released platforms to allow >>>>> compatilibilty with upstream kernels; it also reduces the stability >>>>> of those releases. >>>>> >>>>> It would be nice if it always just worked, but /most/ RHEL customers >>>>> don't try to run upstream kernels in older releases. >>>> >>>> Just an example: >>>> >>>> Oracle Linux provides updated kernels via the Unbreakable >>>> Enterprise Kernel releases. The latest release is UEK3, which >>>> is 3.8-based. It installs on EL6. >>>> >>>> My point of posting here, just to be clear, is that upstream >>>> nfs-utils no longer builds on systems that have an older >>>> keyutils. The details particular to EL6 can be resolved, as >>>> Steve suggested, in an RH bz. >>>> >>>> In the nfsidmap case, I think the extra logic in nfsidmap to >>>> do the right keyctl call is simple to add and test. That would >>>> make nfsidmap “just work”. >>>> >>>>>>> The details of the kernel changes are here: >>>>>>> >>>>>>> 0c7774abb41bd00d KEYS: Allow special keys (eg. DNS results) to be >>>>>>> invalidated by CAP_SYS_ADMIN >>>>>> >>>>>> I think this means the EL6 nfsidmap no longer works quite >>>>>> right when running 3.17. I’m still studying the problem. >>>>>> See below. >>>>>> >>>>>>> The summary is that permission changes in later kernels cause >>>>>>> keyctl_revoke to be unable to clean up keys that are not in possession. >>>>>>> This specific commit allows that once more for CAP_SYS_ADMIN, so >>>>>>> really, it should work fine if you have this. However: >>>>>>> >>>>>>> keyctl_revoke waits key_gc_timeout to clean up the key, and access >>>>>>> attempts return -EKEYREVOKED. >>>>>>> >>>>>>> keyctl_invalidate immediately removes all references to the key. >>>>>> >>>>>> This change means keyctl_set_timeout fails, since >>>>>> lookup_user_key returns -EKEYREVOKED, for example, when a >>>>>> key is revoked instead of invalidated. The key timeouts >>>>>> are then set to 0 (the default). >>>>>> >>>>>> There is at least one other bug which breaks nfsidmap in >>>>>> 3.13 and newer kernels. I will post a proposed fix later >>>>>> today. >>>>>> >>>>>>> The latter is the preferred operation for nfsidmap, since this code path >>>>>>> exists to allow the admin to flush out a specific key from the idmapper >>>>>>> cache. >>>>>> >>>>>> EL6 libkeyutils doesn’t have keyctl_invalidate. That >>>>>> seems to be the crux of the problem (for EL6). >>>>>> >>>>>>> It might be a good idea to just update your libkeyutils along with the kernel >>>>>>> and nfs-utils. Maybe we should make a version dependency for >>>>>>> libkeyutils in nfs-utils. Steve, what do you think? >>>>>> >>>>>> I don’t know the history of the kernel API, but one >>>>>> assumes that 2.6.32-vintage kernels don’t have >>>>>> keyctl_invalidate, since it is missing from older >>>>>> libkeyutils as well. >>>>>> >>>>>> I think nfs-utils needs both to build with >>>>>> keyctl_invalidate support if that exists on the build >>>>>> system, and it needs to pick which of keyctl_revoke >>>>>> or keyctl_invalidate it will invoke based on the kernel >>>>>> version where it’s running. That’s pretty easy to do >>>>>> in nfs-utils. >>>>>> >>>>>> Is keyctl_revoke expected to go away at some point? >>>>> >>>>> I think that it serves an important role in marking keys as existing, >>>>> but revoked - this can provide a useful type of negative cache to >>>>> communicate the state of an object. I haven't expected it to go away. >>>>> >>>>>>>>> EL6 systems should be able to do both the request-key (nfsidmap) >>>>>>>>> and the rpc.idmapd upcall. I believe that EL6 kernels try both - if the >>>>>>>>> nfsidmap request-key doesn't work they fall back to the upcall, however >>>>>>>>> the nfsidmap request-key interface really is the one that should be >>>>>>>>> used. >>>>>>>> >>>>>>>> I have several EL6 systems here, and at least one of them >>>>>>>> had rpc.idmapd configured off. I couldn’t remember if I had >>>>>>>> done that, or it came that way off the installation media. >>>>>>> >>>>>>> I think rpc.idmapd being on/off changed a couple of times in EL6.. I >>>>>>> don't recall the specifics. >>>>>> >>>>>> Makes sense. My EL6 installs are of various vintages. >>>>>> >>>>>> But that could be a problem when installing a kernel that >>>>>> causes nfsidmap to fail because the kernel API has changed. >>>>>> Without the fallback in place, ID mapping will not work. >>>>> >>>>> Ah, but those later kernels will not try the fallback. :/ Or, maybe >>>>> there is a set of kernels that are broken that will try the fallback, >>>>> but later ones won't. >>>>> >>>>> I used to do this when using later kernels with EL6: if it didn't >>>>> work with EL6 userspace then use upstream nfs-utils, keylibs... etc. As >>>>> long as you didn't get into dep-hell, it seemed the simplest path to >>>>> getting a working system. >>>> >>>> Except that EL6 libkeyutil doesn’t have keyctl_invalidate. So >>>> there’s no way to build a working nfsidmap without installing >>>> a newer keyutils. That seems like a step along the path to >>>> dep-hell that could be prevented with a few careful lines of >>>> code in nfs-utils. >>>> >>>> I’d like to be able to pull an upstream nfs-utils and build it >>>> on EL6, at the very least. >>> >>> Yes, I agree. It occurs to me that you can also call these through the >>> syscall keyctl(), and pass the function number - so we can bypass a >>> non-compatible libkeyutils with something like (the untested): >>> >>> diff --git a/utils/nfsidmap/nfsidmap.c b/utils/nfsidmap/nfsidmap.c >>> index e0d31e7..99ae07e 100644 >>> --- a/utils/nfsidmap/nfsidmap.c >>> +++ b/utils/nfsidmap/nfsidmap.c >>> @@ -209,10 +209,17 @@ static int key_invalidate(char *keystr, int keymask) >>> *(strchr(buf, ' ')) = '\0'; >>> sscanf(buf, "%x", &key); >>> >>> - if (keyctl_invalidate(key) < 0) { >>> - xlog_err("keyctl_invalidate(0x%x) failed: %m", key); >>> - fclose(fp); >>> - return 1; >>> +/* older libkeyutils compatibility */ >>> +#ifndef KEYCTL_INVALIDATE >>> +#define KEYCTL_INVALIDATE 21 /* invalidate a key */ >>> +#endif >>> + if (keyctl(KEYCTL_INVALIDATE, key) < 0 && errno == EOPNOTSUPP) { >>> + /* older kernel compatibility attempt: */ >>> + if (keyctl_revoke(key) < 0) { >>> + xlog_err("keyctl_invalidate(0x%x) failed: %m", key); >>> + fclose(fp); >>> + return 1; >>> + } >>> } >>> >>> keymask &= ~mask; >>> >>> This should try to do the keyctl_invalidate if the kernel has it instead >>> of relying on the stub in libkeyutils. >> >> I tested this with upstream 3.17, 2.6.39-400.209.1.el6uek.x86_64 (UEK2), >> and 2.6.32-504.el6.x86_64. I think this approach can work. >> >> Upstream 3.17 worked as expected. >> >> UEK2 seems to use only the rpc.idmapd interface, no keys were created, >> and the test workload ran normally. >> >> 2.6.32-504.el6.x86_64 almost worked. >> >> Oct 30 13:01:58 dali nfsidmap_new[2321]: key: 0x249ea9d9 type: uid value: cel@xxxxxxxxxx timeout 600 >> Oct 30 13:01:58 dali nfsidmap_new[2321]: nfs4_name_to_uid: calling nsswitch->name_to_uid >> Oct 30 13:01:58 dali nfsidmap_new[2321]: nss_getpwnam: name 'cel@xxxxxxxxxx' domain 'oracle.com': resulting localname 'cel' >> Oct 30 13:01:58 dali nfsidmap_new[2321]: nfs4_name_to_uid: nsswitch->name_to_uid returned 0 >> Oct 30 13:01:58 dali nfsidmap_new[2321]: nfs4_name_to_uid: final return value is 0 >> Oct 30 13:01:58 dali nfsidmap_new[2323]: key: 0x2944b451 type: gid value: users@xxxxxxxxxx timeout 600 >> Oct 30 13:01:58 dali nfsidmap_new[2323]: nfs4_name_to_gid: calling nsswitch->name_to_gid >> Oct 30 13:01:58 dali nfsidmap_new[2323]: nfs4_name_to_gid: nsswitch->name_to_gid returned 0 >> Oct 30 13:01:58 dali nfsidmap_new[2323]: nfs4_name_to_gid: final return value is 0 >> >> Golden. But nfsidmap_new was not able to set the key timeouts: >> >> [root@dali ~]# cat /proc/keys >> 020d3315 I--Q-- 3 perm 1f3f0000 0 -1 keyring _uid.0: empty >> 0bf90e2d I--Q-- 5 perm 1f3f0000 0 0 keyring _ses: 1/4 >> 1a94e9ce I--Q-- 1 perm 1f3f0000 0 -1 keyring _uid_ses.0: 1/4 >> 1f77c0ad I--Q-- 1 perm 3f050000 0 0 id_resolv gid:root@xxxxxxxxxx: 2 >> 249ea9d9 I--Q-- 1 perm 3f050000 0 0 id_resolv uid:cel@xxxxxxxxxx: 5 >> 2944b451 I--Q-- 1 perm 3f050000 0 0 id_resolv gid:users@xxxxxxxxxx: 4 >> 3641d485 I----- 1 perm 1f030000 0 0 keyring .id_resolver: 4/4 >> 3b10283e I--Q-- 1 perm 3f050000 0 0 id_resolv uid:root@xxxxxxxxxx: 2 >> >> I’m not sure if that’s normal for EL6 kernels, since I haven’t >> used one of the stock EL6 kernels in a while. > > It looks like this problem is unrelated to the above patch and exists > upstream as well. Indeed it does. AFAICT keyctl_set_timeout invokes lookup_user_key(), which now returns -EKEYREVOKED if keyctl_revoke was used to delete the key. > Probably the default key permissions do not allow > nfsidmap to set the timeout during instantiation. I think that the > reason it works with EL6 nfsidmap is because EL6 links keys to child > keyrings to work around the keyring limit, and KEY_POS_SETATTR then > allows the timeout to be set. I'll look into it. > > Ben -- Chuck Lever chuck[dot]lever[at]oracle[dot]com -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html