On 13-09-2023 14:58, Ilya Dryomov wrote:
On Wed, Sep 13, 2023 at 9:20 AM Stefan Kooman <stefan@xxxxxx> wrote:
Hi,
Since the 6.5 kernel addressed the issue with regards to regression in
the readahead handling code... we went ahead and installed this kernel
for a couple of mail / web clusters (Ubuntu 6.5.1-060501-generic
#202309020842 SMP PREEMPT_DYNAMIC Sat Sep 2 08:48:34 UTC 2023 x86_64
x86_64 x86_64 GNU/Linux). Since then we occasionally see the following
being logged by the kernel:
[Sun Sep 10 07:19:00 2023] workqueue: delayed_work [ceph] hogged CPU for
>10000us 4 times, consider switching to WQ_UNBOUND
[Sun Sep 10 08:41:24 2023] workqueue: ceph_con_workfn [libceph] hogged
CPU for >10000us 4 times, consider switching to WQ_UNBOUND
[Sun Sep 10 11:05:55 2023] workqueue: delayed_work [ceph] hogged CPU for
>10000us 8 times, consider switching to WQ_UNBOUND
[Sun Sep 10 12:54:38 2023] workqueue: ceph_con_workfn [libceph] hogged
CPU for >10000us 8 times, consider switching to WQ_UNBOUND
[Sun Sep 10 19:06:37 2023] workqueue: ceph_con_workfn [libceph] hogged
CPU for >10000us 16 times, consider switching to WQ_UNBOUND
[Mon Sep 11 10:53:33 2023] workqueue: ceph_con_workfn [libceph] hogged
CPU for >10000us 32 times, consider switching to WQ_UNBOUND
[Tue Sep 12 10:14:03 2023] workqueue: ceph_con_workfn [libceph] hogged
CPU for >10000us 64 times, consider switching to WQ_UNBOUND
[Tue Sep 12 11:14:33 2023] workqueue: ceph_cap_reclaim_work [ceph]
hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
We wonder if this is a new phenomenon, or that it's rather logged in the
new kernel and it was not before.
Hi Stefan,
This is something that wasn't logged in older kernels. The kernel
workqueue infrastructure is considering Ceph work items CPU intensive
and reports that in dmesg. This is new in 6.5 kernel, the threshold
can be tweaked with workqueue.cpu_intensive_thresh_us parameter.
Thanks. I was just looking into it (WQ_UNBOUND), alloc_workqueue(), etc.
The patch by Tejun Heo on workqueue also mentions this:
* Concurrency-managed per-cpu work items that hog CPUs and delay the
execution of other work items are now automatically detected and
excluded from concurrency management. Reporting on such work items can
also be enabled through a config option.
This does imply that the Ceph work items are "excluded from concurrency
management", is that correct? And if so, what does that mean in
practice? Might this make the process of returning / claiming caps to
the MDS slower?
In 6.6-rc1 more workqueue work is done and more fine tuning seems
possible. If there are any recommendations from a cephfs kernel client
perspective on what a good policy would be, we would love to hear about
that.
For now we will just disable the detection (cpu_intensive_thresh_us=0)
and see how it goes.
Thanks,
Gr. Stefan
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx