On Fri, Jun 26, 2020 at 01:23:54PM -0400, Doug Nazar wrote:
> Ok, I think I see what's going on. The struct clnt_info is getting
> freed out from under the upcall thread. In this case it immediately
> got reused for another client, which zeroed the struct and was in the
> process of looking up the info for its client, hence the protocol &
> server fields were null in the upcall thread.
>
> Explains why I haven't been able to recreate it. Thanks for the
> stack trace, Sebastian.
>
> Bruce, I can't see any locking/reference counting around struct
> clnt_info. It just gets destroyed when receiving the inotify. Or
> should it be deep copied when starting an upcall? Am I missing
> something?

Thanks for finding that! Staring at that code in an attempt to catch up
here....

Looks like there's one main thread that watches for upcalls and other
events, then creates a new short-lived thread for each upcall. The main
thread is the only one that really manipulates the data structure with
all the clients, so that data structure shouldn't need any locking,
except, as you point out, to keep the clnt_info from disappearing out
from under the upcall threads.

So, yeah, either a reference count or a deep copy is probably all
that's needed, in alloc_upcall_info() and at the end of
handle_krb5_upcall(). (A rough sketch of the reference-count approach
is appended after the quoted trace below.)

--b.

> Doug
>
> Jun 25 11:46:08 server rpc.gssd[6356]: inotify event for topdir (nfsd4_cb) - ev->wd (5) ev->name (clnt50e) ev->mask (0x40000100)
> Jun 25 11:46:08 server rpc.gssd[6356]: handle_gssd_upcall: 'mech=krb5 uid=0 target=host@xxxxxxxxxxxxxxxxxxxxxxxxxx service=nfs enctypes=18,17,16,23,3,1,2 ' (nfsd4_cb/clnt50e)
> Jun 25 11:46:08 server rpc.gssd[6356]: krb5_use_machine_creds: uid 0 tgtname host@xxxxxxxxxxxxxxxxxxxxxxxxxx
> Jun 25 11:46:08 server rpc.gssd[6356]: inotify event for clntdir (nfsd4_cb/clnt50e) - ev->wd (75) ev->name (krb5) ev->mask (0x00000200)
> Jun 25 11:46:08 server rpc.gssd[6356]: inotify event for clntdir (nfsd4_cb/clnt50e) - ev->wd (75) ev->name (gssd) ev->mask (0x00000200)
> Jun 25 11:46:08 server rpc.gssd[6356]: inotify event for clntdir (nfsd4_cb/clnt50e) - ev->wd (75) ev->name (info) ev->mask (0x00000200)
> Jun 25 11:46:08 server rpc.gssd[6356]: inotify event for clntdir (nfsd4_cb/clnt50e) - ev->wd (75) ev->name (<?>) ev->mask (0x00008000)
> Jun 25 11:46:08 server rpc.gssd[6356]: inotify event for topdir (nfsd4_cb) - ev->wd (5) ev->name (clnt50f) ev->mask (0x40000100)
> Jun 25 11:46:08 server rpc.gssd[6356]: Full hostname for '' is 'client.domain.tu-berlin.de'
> Jun 25 11:46:08 server rpc.gssd[6356]: Full hostname for 'server.domain.tu-berlin.de' is 'server.domain.tu-berlin.de'
> Jun 25 11:46:08 server rpc.gssd[6356]: Success getting keytab entry for 'nfs/server.domain.tu-berlin.de@xxxxxxxxxxxx'
> Jun 25 11:46:08 server rpc.gssd[6356]: INFO: Credentials in CC 'FILE:/tmp/krb5ccmachine_TU-BERLIN.DE' are good until 1593101766
> Jun 25 11:46:08 server rpc.gssd[6356]: INFO: Credentials in CC 'FILE:/tmp/krb5ccmachine_TU-BERLIN.DE' are good until 1593101766
> Jun 25 11:46:08 server rpc.gssd[6356]: creating (null) client for server (null)
> Jun 25 11:46:08 all kernel: rpc.gssd[14174]: segfault at 0 ip 000056233fff038e sp 00007fb2eaeb9880 error 4 in rpc.gssd[56233ffed000+9000]
>
>
> Thread 1 (Thread 0x7fb2eaeba700 (LWP 14174)):
> #0  0x000056233fff038e in create_auth_rpc_client (clp=clp@entry=0x562341008fa0, tgtname=tgtname@entry=0x562341011c8f "host@xxxxxxxxxxxxxxxxxxxxxxxxxx", clnt_return=clnt_return@entry=0x7fb2eaeb9de8, auth_return=auth_return@entry=0x7fb2eaeb9d50, uid=uid@entry=0, cred=cred@entry=0x0, authtype=0) at gssd_proc.c:352
>
> Thread 2 (Thread 0x7fb2eb6d9740 (LWP 6356)):
> #12 0x000056233ffef82c in gssd_read_service_info (clp=0x562341008fa0, dirfd=11) at gssd.c:326
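
For concreteness, here is a rough sketch of what the reference-count
variant could look like. The field and helper names (refcount,
refcount_lock, clnt_info_get()/clnt_info_put()) are made up for
illustration and the struct is abbreviated; the real clnt_info and its
teardown in gssd.c have more to them.

/* Illustrative only: hypothetical refcount for a clnt_info-like
 * struct; the names below are not gssd's actual API. */
#include <pthread.h>
#include <stdlib.h>

struct clnt_info {
	/* ... existing fields (servername, protocol, ...) ... */
	pthread_mutex_t refcount_lock;  /* pthread_mutex_init() at alloc time */
	int refcount;                   /* starts at 1, owned by the main thread */
};

/* Take an extra reference before handing clp to an upcall thread. */
static void clnt_info_get(struct clnt_info *clp)
{
	pthread_mutex_lock(&clp->refcount_lock);
	clp->refcount++;
	pthread_mutex_unlock(&clp->refcount_lock);
}

/* Drop a reference; free only once both the main thread (on the
 * inotify delete) and every in-flight upcall thread are done. */
static void clnt_info_put(struct clnt_info *clp)
{
	int last;

	pthread_mutex_lock(&clp->refcount_lock);
	last = (--clp->refcount == 0);
	pthread_mutex_unlock(&clp->refcount_lock);
	if (last) {
		/* free servername/protocol/etc., remove inotify watches */
		free(clp);
	}
}

The idea being: alloc_upcall_info() takes a reference before handing
clp to the upcall thread, the end of handle_krb5_upcall() (and the
other upcall handlers) drops it, and the inotify delete path drops the
main thread's reference instead of freeing directly, so the struct
can't be recycled while an upcall is still using it.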