Re: Strange segmentation violations of rpc.gssd in Debian Buster

Dear Bruce,
I obtained the following stack trace and backtrace:

root@all:~# coredumpctl debug
           PID: 6356 (rpc.gssd)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Thu 2020-06-25 11:46:08 CEST (3h 4min ago)
  Command Line: /usr/sbin/rpc.gssd -vvvvvvv -rrrrrrr -t 3600 -T 10
    Executable: /usr/sbin/rpc.gssd
 Control Group: /system.slice/rpc-gssd.service
          Unit: rpc-gssd.service
         Slice: system.slice
       Boot ID: XXXXXXXXXXXXXXXXXXXXXXXXXXX
    Machine ID: YYYYYYYYYYYYYYYYYYYYYYYYYYYY
      Hostname: XYZ
       Storage: /var/lib/systemd/coredump/core.rpc\x2egssd.0.7f31136228274af0a1a855b91ad1e75c.6356.1593078368000000.lz4
       Message: Process 6356 (rpc.gssd) of user 0 dumped core.
                
                Stack trace of thread 14174:
                #0  0x000056233fff038e n/a (rpc.gssd)
                #1  0x000056233fff09f8 n/a (rpc.gssd)
                #2  0x000056233fff0b92 n/a (rpc.gssd)
                #3  0x000056233fff13b3 n/a (rpc.gssd)
                #4  0x00007fb2eb8dbfa3 start_thread (libpthread.so.0)
                #5  0x00007fb2eb80c4cf __clone (libc.so.6)
                
                Stack trace of thread 6356:
                #0  0x00007fb2eb801819 __GI___poll (libc.so.6)
                #1  0x00007fb2eb6e7207 send_dg (libresolv.so.2)
                #2  0x00007fb2eb6e4c43 __GI___res_context_query (libresolv.so.2)
                #3  0x00007fb2eb6bf536 __GI__nss_dns_gethostbyaddr2_r (libnss_dns.so.2)
                #4  0x00007fb2eb6bf823 _nss_dns_gethostbyaddr_r (libnss_dns.so.2)
                #5  0x00007fb2eb81dee2 __gethostbyaddr_r (libc.so.6)
                #6  0x00007fb2eb8267d5 gni_host_inet_name (libc.so.6)
                #7  0x000056233ffef455 n/a (rpc.gssd)
                #8  0x000056233ffef82c n/a (rpc.gssd)
                #9  0x000056233fff01d0 n/a (rpc.gssd)
                #10 0x00007fb2ebab49ba n/a (libevent-2.1.so.6)
                #11 0x00007fb2ebab5537 event_base_loop (libevent-2.1.so.6)
                #12 0x000056233ffedeaa n/a (rpc.gssd)
                #13 0x00007fb2eb73709b __libc_start_main (libc.so.6)
                #14 0x000056233ffee03a n/a (rpc.gssd)

GNU gdb (Debian 8.2.1-2+b3) 8.2.1
[...]
Reading symbols from /usr/sbin/rpc.gssd...(no debugging symbols found)...done.
[New LWP 14174]
[New LWP 6356]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/rpc.gssd -vvvvvvv -rrrrrrr -t 3600 -T 10'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000056233fff038e in ?? ()
[Current thread is 1 (Thread 0x7fb2eaeba700 (LWP 14174))]
(gdb) bt
#0  0x000056233fff038e in ?? ()
#1  0x000056233fff09f8 in ?? ()
#2  0x000056233fff0b92 in ?? ()
#3  0x000056233fff13b3 in ?? ()
#4  0x00007fb2eb8dbfa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#5  0x00007fb2eb80c4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) quit


I am not an expert at analyzing stack traces and backtraces. Is there anything meaningful you are able to extract from the trace?
As far as I can see, thread 14174 caused the segmentation violation shortly after being spawned via clone().
Please correct me if I am wrong.
It seems Debian Buster does not ship a dedicated package with debug symbols for the rpc.gssd executable; so far, I have not been able to find such a package in the regular archive.
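
If I understand the Debian packaging correctly, the automatically built dbgsym packages live in the separate debian-debug archive. Assuming rpc.gssd is shipped by nfs-common and the debug package follows the usual -dbgsym naming convention (both guesses on my part), something along these lines should make the symbols visible to gdb:

root@all:~# echo 'deb http://deb.debian.org/debian-debug buster-debug main' > /etc/apt/sources.list.d/buster-debug.list
root@all:~# apt update
root@all:~# apt install nfs-common-dbgsym
root@all:~# coredumpctl debug
(gdb) thread apply all bt full

gdb matches the debug files to the stripped binary via its Build ID, so the frames currently shown as "n/a (rpc.gssd)" should then resolve to function names and line numbers.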
What is your opinion of the trace?
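
Regarding the inotify mask from the thread below: grepping the glibc header should turn up the definition, e.g.:

root@all:~# grep 0x00008000 /usr/include/sys/inotify.h
#define IN_IGNORED	 0x00008000	/* File was ignored.  */

According to inotify(7), IN_IGNORED means the watch was removed, either explicitly or automatically because the watched file or directory was deleted or unmounted. For these events the name field is empty, which would also explain the "<?>" placeholder in the log line.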


Best and Thanks
Sebastian

_____________________________
Sebastian Kraus
Team IT am Institut für Chemie
Gebäude C, Straße des 17. Juni 115, Raum C7

Technische Universität Berlin
Fakultät II
Institut für Chemie
Sekretariat C3
Straße des 17. Juni 135
10623 Berlin

Email: sebastian.kraus@xxxxxxxxxxxx

________________________________________
From: linux-nfs-owner@xxxxxxxxxxxxxxx <linux-nfs-owner@xxxxxxxxxxxxxxx> on behalf of J. Bruce Fields <bfields@xxxxxxxxxxxx>
Sent: Tuesday, June 23, 2020 00:36
To: Kraus, Sebastian
Cc: linux-nfs@xxxxxxxxxxxxxxx
Subject: Re: RPC Pipefs: Frequent parsing errors in client database

On Sat, Jun 20, 2020 at 09:08:55PM +0000, Kraus, Sebastian wrote:
> Hi Bruce,
>
> >> But I think it'd be more useful to stay focused on the segfaults.
>
> is it a clever idea to analyze core dumps? Or are there other much better debugging techniques w.r.t. RPC daemons?

If we could at least get a backtrace out of the core dump that could be
useful.

> I now do more tests while fiddling around with the time-out parameters "-T" and "-t" on the command line of rpc.gssd.
>
> There are several things I do not really understand about the trace shown below:
>
> 1) How can it be useful that the rpc.gssd daemon tries to parse the info file although it knows about its absence beforehand?

It doesn't know beforehand, in the scenarios I described.

> 2) Why are there two identifiers clnt36e and clnt36f being used for the same client?

This is actually happening on an NFS server, the rpc client in question
is the callback client used to do things like send delegation recalls
back to the NFS client.

I'm not sure why two different callback clients are being created here,
but there's nothing inherently weird about that.

> 3) What does the <?> in "inotify event for clntdir (nfsd4_cb/clnt36e) - ev->wd (600) ev->name (<?>) ev->mask (0x00008000)" mean?

Off the top of my head, I don't know, we'd probably need to look through
header files or inotify man pages for the definitions of those masks.

--b.



