On Sun, Oct 13, 2019 at 4:19 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Sat, 2019-10-12 at 11:20 -0700, Robert LeBlanc wrote:
> > $ uname -a
> > Linux sun-gpu225 4.4.0-142-generic #168~14.04.1-Ubuntu SMP Sat Jan 19
> > 11:26:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> >
>
> That's pretty old. I'm not sure how aggressively Canonical backports
> ceph patches.

Just trying to understand whether this may be fixed in a newer version,
but we also have to balance the NVidia drivers as well.

> > This was the best stack trace we could get. /proc was not helpful:
> > root@sun-gpu225:/proc/77292# cat stack
> > [<ffffffffffffffff>] 0xffffffffffffffff
> >
>
> A stack trace like the above generally means that the task is running in
> userland. The earlier stack trace you sent might just indicate that it
> was in the process of spinning on a lock when you grabbed the trace, but
> isn't actually stuck in the kernel.

I tried catting it multiple times, but it was always the same.

> > We did not get messages of hung tasks from the kernel. This container
> > was running for 9 days when the jobs should have completed in a matter
> > of hours. They were not able to stop the container, but it still was
> > using CPU. So it smells like uninterruptible sleep, but still using
> > CPU, which based on the trace looks like it's stuck in a spinlock.
> >
>
> That could be anything then, including userland bugs. What state was the
> process in (maybe grab /proc/<pid>/status if this happens again)?

We still have this box up.
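For anyone following along, a quick way to sample the stack repeatedly looks roughly like this (illustrative sketch only; `PID=$$` below is just a stand-in for the stuck task's pid, 77292 in this case, and reading another task's stack needs root):

```shell
# Sample /proc/<pid>/stack a few times. A task spinning in userland shows
# only 0xffffffffffffffff; one actually stuck in the kernel shows real frames.
PID=$$                      # substitute the stuck task's pid here
for i in 1 2 3; do
    cat "/proc/$PID/stack" 2>/dev/null || echo "(stack unreadable; needs root)"
    sleep 1
done
```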
Here is the output of status:

root@sun-gpu225:/proc/77292# cat status
Name:   offline_percept
State:  R (running)
Tgid:   77292
Ngid:   77986
Pid:    77292
PPid:   168913
TracerPid:      20719
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
FDSize: 256
Groups: 27 999
NStgid: 77292   2830
NSpid:  77292   2830
NSpgid: 169001  8
NSsid:  168913  1
VmPeak: 1094897144 kB
VmSize: 1094639324 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:   3512696 kB
VmRSS:   3121848 kB
VmData: 19331276 kB
VmStk:       144 kB
VmExe:       184 kB
VmLib:   1060628 kB
VmPTE:      8992 kB
VmPMD:        88 kB
VmSwap:        0 kB
HugetlbPages:          0 kB
Threads:        1
SigQ:   3/3090620
SigPnd: 0000000000040100
ShdPnd: 0000000000000001
SigBlk: 0000000000001000
SigIgn: 0000000001001000
SigCgt: 00000001800044e8
CapInh: 00000000a80425fb
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 00000000a80425fb
CapAmb: 0000000000000000
Seccomp:        0
Speculation_Store_Bypass:       thread vulnerable
Cpus_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,ffffffff
Cpus_allowed_list:      0-31
Mems_allowed:   00000000,00000003
Mems_allowed_list:      0-1
voluntary_ctxt_switches:        6499
nonvoluntary_ctxt_switches:     28044102

> > Do you want me to get something more specific? Just tell me how.
>
> If you really think tasks are getting hung in the kernel, then you can
> crash the box and get a vmcore if you have kdump set up. With that we
> can analyze it and determine what it's doing.
>
> If you suspect ceph is involved then you might want to turn up dynamic
> debugging in the kernel and see what it's doing.

I looked in /sys/kernel/debug/ceph/, but wasn't sure how to turn up the
debugging in a way that would be beneficial. We don't have a crash
kernel loaded, so that won't be an option in this case.

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
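[Editor's note, a sketch of the dynamic-debug step Jeff suggests: on kernels built with CONFIG_DYNAMIC_DEBUG, the verbosity of the kernel ceph client is toggled through debugfs rather than through the per-client directories under /sys/kernel/debug/ceph/. Run as root; the extra messages go to dmesg/syslog. Whether the 4.4-era Ubuntu kernel here has dynamic debug enabled is an assumption.]

```shell
# Enable pr_debug output for the ceph and libceph modules (needs root,
# debugfs mounted, and CONFIG_DYNAMIC_DEBUG).
echo 'module ceph +p'    > /sys/kernel/debug/dynamic_debug/control
echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control

# ...reproduce the hang and capture dmesg, then turn it back off:
echo 'module ceph -p'    > /sys/kernel/debug/dynamic_debug/control
echo 'module libceph -p' > /sys/kernel/debug/dynamic_debug/control
```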