Re: 5.4.20 - high load - lots of incoming data - small data read.

And it looks like I can reproduce this pattern. When the client has been
stuck for a "sufficient" amount of time, it gets blacklisted by the
MDS/OSDs, and a related issue is that I cannot get the mountpoint back
without a reboot on the kernel-client side.
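
For reference, this is roughly the recovery path I would expect to work
(monitor name, cephx user and the client address below are placeholders) -
but in my case only a reboot of the client brings the mountpoint back:

  # on a cluster admin node: find and drop the client's blacklist entry
  ceph osd blacklist ls
  ceph osd blacklist rm <client-addr>:0/<nonce>

  # on the client: force (or lazy) unmount the dead mountpoint, then remount
  umount -f /cephfs || umount -l /cephfs
  mount -t ceph <mon-host>:/ /cephfs -o name=<cephx-user>,secretfile=/etc/ceph/secret

  # 5.4 kernels should also have the recover_session=clean mount option,
  # if I read the mount.ceph docs correctly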

On Mon, Apr 6, 2020 at 11:09 AM Jesper Krogh <jesper.krogh@xxxxxxxxx> wrote:
>
> home-directory style - median is about 3KB - but varies greatly.
>
> I also get this:
> [41204.865818] INFO: task kworker/u16:102:21903 blocked for more than
> 120 seconds.
> [41204.865955]       Not tainted 5.4.30 #5
> [41204.866006] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [41204.866056] kworker/u16:102 D    0 21903      2 0x80004000
> [41204.866119] Workqueue: ceph-inode ceph_inode_work [ceph]
> [41204.866120] Call Trace:
> [41204.866156]  __schedule+0x45f/0x710
> [41204.866162]  ? xas_store+0x391/0x5f0
> [41204.866164]  schedule+0x3e/0xa0
> [41204.866166]  io_schedule+0x16/0x40
> [41204.866180]  __lock_page+0x12a/0x1d0
> [41204.866182]  ? file_fdatawait_range+0x30/0x30
> [41204.866187]  truncate_inode_pages_range+0x52c/0x980
> [41204.866191]  ? syscall_return_via_sysret+0x12/0x7f
> [41204.866197]  ? drop_inode_snap_realm+0x98/0xa0 [ceph]
> [41204.866207]  ? fsnotify_grab_connector+0x4d/0x90
> [41204.866209]  truncate_inode_pages_final+0x4c/0x60
> [41204.866214]  ceph_evict_inode+0x2d/0x210 [ceph]
> [41204.866219]  evict+0xca/0x1a0
> [41204.866221]  iput+0x1ba/0x210
> [41204.866225]  ceph_inode_work+0x40/0x270 [ceph]
> [41204.866232]  process_one_work+0x167/0x400
> [41204.866233]  worker_thread+0x4d/0x460
> [41204.866236]  kthread+0x105/0x140
> [41204.866237]  ? rescuer_thread+0x370/0x370
> [41204.866239]  ? kthread_destroy_worker+0x50/0x50
> [41204.866240]  ret_from_fork+0x35/0x40
>
> On Mon, Apr 6, 2020 at 10:53 AM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >
> > On Mon, Apr 6, 2020 at 4:06 PM Jesper Krogh <jesper.krogh@xxxxxxxxx> wrote:
> > >
> > > This is a CephFS client - its only purpose is to run the "filedaemon"
> > > of Bacula and transport data from CephFS to the tape library - the 2
> > > threads below are essentially doing something equivalent to
> > >
> > > find /cephfs/ -type f | xargs cat | nc server
> > >
> > > Only 2 threads, yet the load is exploding, and "net read vs net
> > > write" differs by more than 100x.
> > >
> > > Can anyone explain this as "normal" behaviour?
> > > The server is a VM with 16 vCPUs and 16GB memory running under libvirt/qemu.
> > >
> > > jk@wombat:~$ w
> > >  07:50:33 up 11:25,  1 user,  load average: 206.43, 76.23, 50.58
> > > USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
> > > jk       pts/0    10.194.133.42    06:54    0.00s  0.05s  0.00s w
> > > jk@wombat:~$ dstat -ar
> > > --total-cpu-usage-- -dsk/total- -net/total- ---paging-- ---system-- --io/total-
> > > usr sys idl wai stl| read  writ| recv  send|  in   out | int   csw | read  writ
> > >   0   0  98   1   0|  14k   34k|   0     0 |   3B   27B| 481   294 |0.55  0.73
> > >   1   1   0  98   0|   0     0 |  60M  220k|   0     0 |6402  6182 |   0     0
> > >   0   1   0  98   0|   0     0 |  69M  255k|   0     0 |7305  4339 |   0     0
> > >   1   2   0  98   0|   0     0 |  76M  282k|   0     0 |7914  4886 |   0     0
> > >   1   1   0  99   0|   0     0 |  70M  260k|   0     0 |7293  4444 |   0     0
> > >   1   1   0  98   0|   0     0 |  80M  278k|   0     0 |8018  4931 |   0     0
> > >   0   1   0  98   0|   0     0 |  60M  221k|   0     0 |6435  5951 |   0     0
> > >   0   1   0  99   0|   0     0 |  59M  211k|   0     0 |6163  3584 |   0     0
> > >   0   1   0  98   0|   0     0 |  64M  323k|   0     0 |6653  3881 |   0     0
> > >   1   0   0  99   0|   0     0 |  61M  243k|   0     0 |6822  4401 |   0     0
> > >   0   1   0  99   0|   0     0 |  55M  205k|   0     0 |5975  3518 |   0     0
> > >   1   1   0  98   0|   0     0 |  68M  242k|   0     0 |7094  6544 |   0     0
> > >   0   1   0  99   0|   0     0 |  58M  230k|   0     0 |6639  4178 |   0     0
> > >   1   2   0  98   0|   0     0 |  61M  243k|   0     0 |7117  4477 |   0     0
> > >   0   1   0  99   0|   0     0 |  61M  228k|   0     0 |6500  4078 |   0     0
> > >   0   1   0  99   0|   0     0 |  65M  234k|   0     0 |6595  3914 |   0     0
> > >   0   1   0  98   0|   0     0 |  64M  219k|   0     0 |6507  5755 |   0     0
> > >   1   1   0  99   0|   0     0 |  64M  233k|   0     0 |6869  4153 |   0     0
> > >   1   2   0  98   0|   0     0 |  63M  232k|   0     0 |6632  3907 |   0     0 ^C
> > > jk@wombat:~$ w
> > >  07:50:56 up 11:25,  1 user,  load average: 221.35, 88.07, 55.02
> > > USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
> > > jk       pts/0    10.194.133.42    06:54    0.00s  0.05s  0.00s w
> > > jk@wombat:~$
> > >
> > > Thanks.
> >
> > How small are these files?


