Hello Mike,

I don't usually work on the kernel, so please excuse any inaccuracies. I'm contacting you off-list because, if what I'm facing is confirmed, it might be considered a security issue (DoS). I'll leave that to your judgement.

I'm seeing an issue related to hugetlb_cgroup.

I'm running:
- kubernetes 1.19 + containerd/docker
- kernel 5.9.0-36.fc34.x86_64
- kernel params: systemd.unified_cgroup_hierarchy=0 default_hugepagesz=1G hugepagesz=1G hugepages=10

I'm still trying to isolate aspects of my setup; currently my reproducer is:

1 - Start a simple pod that uses the recently added HugePages medium feature [1] (pod yaml attached).
2 - Start a DPDK app. It doesn't need to run successfully (as in transfer packets) nor interact with real hardware. It seems that just initializing the EAL layer (which handles hugepage reservation and locking) is enough to trigger the issue (a rough userspace sketch of what I mean is appended at the end of this mail).
3 - Delete the pod (or let it "Complete").

This results in what seems to be a kworker thread endlessly looping on a spin_lock.

top shows:

 1425 root      20   0       0      0      0 R  99.7   0.0   5:22.45 kworker/28:7+cgroup_destroy

'perf top -g' reports:

-   63.28%     0.01%  [kernel]  [k] worker_thread
   - 49.97% worker_thread
      - 52.64% process_one_work
         - 62.08% css_killed_work_fn
            - hugetlb_cgroup_css_offline
                 41.52% _raw_spin_lock
               - 2.82% _cond_resched
                    rcu_all_qs
                 2.66% PageHuge
               - 0.57% schedule
                  - 0.57% __schedule

Under certain circumstances (which I'm still trying to understand) this makes the kernel quite unresponsive, requiring a hard reboot.

I've isolated the issue in a VM and was about to start bisecting it (the issue does not happen on kernel-5.6.6-300.fc32).

Do you have any clue or pointer as to how to further troubleshoot this issue?

Thanks,
--
Adrián Moreno

[1] https://kubernetes.io/docs/tasks/manage-hugepages/scheduling-hugepages/
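
P.S. In case it's useful, below is a rough, minimal userspace approximation of what I believe the EAL init does with hugepages: map a 1G hugetlb page, fault it in, and mlock it. This is not DPDK code (EAL, as far as I understand, normally goes through files on a hugetlbfs mount rather than an anonymous mapping); the 1G page size and the flags are assumptions based on my kernel params, and it is only meant to illustrate step 2 of the reproducer.

/* hugepage_probe.c - rough approximation of the hugepage reservation +
 * locking done during DPDK EAL init. Not DPDK code; the 1G page size is
 * an assumption from my setup (default_hugepagesz=1G).
 * Build: gcc -o hugepage_probe hugepage_probe.c
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)	/* log2(1G) encoded in the mmap flags */
#endif

#define SZ_1G (1UL << 30)

int main(void)
{
	/* Anonymous hugetlb mapping; MAP_HUGE_1GB selects the 1G hstate. */
	void *addr = mmap(NULL, SZ_1G, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
			  -1, 0);
	if (addr == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Touch the page so it is actually faulted in (and charged to the
	 * process' hugetlb cgroup), then lock it as EAL would. */
	memset(addr, 0, SZ_1G);
	if (mlock(addr, SZ_1G))
		perror("mlock");

	/* Keep the mapping around so the pod can be deleted meanwhile. */
	getchar();

	munmap(addr, SZ_1G);
	return 0;
}

My (possibly wrong) reading of the perf output is that the charge ends up against the pod's hugetlb cgroup, and hugetlb_cgroup_css_offline is then stuck trying to move those charges away when the cgroup is destroyed.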
Attachment: test.yaml (application/yaml)