Re: name=systemd cgroup mounts/hierarchy

Thank you for checking!

Yes, it clearly seems that systemd and kubelet share cgroups in such a setup, which is not supposed to happen.
We will prioritize moving our cluster to the systemd cgroup driver to avoid this conflict.
Also, I think it would be good to have an extra check on the kubelet side to avoid running the cgroupfs driver on systemd systems. But that is a question for the k8s folks, and it has already been raised in Slack.
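
To illustrate the kind of check meant here, a minimal Python sketch follows. It is not kubelet code: the function names are hypothetical, and it only assumes the test documented for sd_booted(3), namely that the directory /run/systemd/system exists on a host booted with systemd.

```python
import os


def booted_with_systemd(marker="/run/systemd/system"):
    """Return True if the marker directory documented for sd_booted(3) exists."""
    return os.path.isdir(marker)


def cgroup_driver_conflicts(driver, systemd_booted):
    """Hypothetical helper: True when 'cgroupfs' is requested on a systemd host."""
    return systemd_booted and driver == "cgroupfs"
```

A caller could refuse to start, or at least log a warning, when `cgroup_driver_conflicts(driver, booted_with_systemd())` is true.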
 
-----
Just out of curiosity, how exactly may systemd be disrupted during service (de)activation by an entry such as /kubepods/bla/bla in the root of its cgroup hierarchy?
Or how may it disrupt the kubelet or the workload run by it?

Will it delete such entries because of some logic? Or will there be a name conflict during cgroup creation?
I would be happy to know more details about the cgroup interference.

I've read a few articles:
https://systemd.io/CGROUP_DELEGATION/
http://0pointer.de/blog/projects/cgroups-vs-cgroups.html
https://www.freedesktop.org/wiki/Software/systemd/ControlGroupInterface/
https://www.freedesktop.org/wiki/Software/systemd/writing-vm-managers/

and even an outdated one:
https://www.freedesktop.org/wiki/Software/systemd/PaxControlGroups/

It seems I missed some technical details of how exactly it will interfere.
-----

> It may be a residual inside kubelet context when environment was prepared for a container spawned from within this context

Just one last finding about this weird cgroup mount:
# find / -name '*8842def24*'
/sys/fs/cgroup/systemd/kubepods/burstable/pod7ffde41a-fa85-4b01-8023-69a4e4b50c55/8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15
/sys/fs/cgroup/systemd/machine.slice/systemd-nspawn@centos75.service/payload/system.slice/host\x2drootfs-sys-fs-cgroup-systemd-kubepods-burstable-pod7ffde41a\x2dfa85\x2d4b01\x2d8023\x2d69a4e4b50c55-8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15.mount

and

# machinectl list
MACHINE  CLASS     SERVICE        OS     VERSION ADDRESSES
centos75 container systemd-nspawn centos 7       -        
frr      container systemd-nspawn ubuntu 18.04   -        

2 machines listed.

Since the container with id 8842def241 is not running, it's hard to understand what exactly happened, who made this mount, and how to reproduce the conflict.

May I ask how systemd-nspawn may be involved in it? Or do you have any ideas about what happened, so that the systemd named hierarchy is still mounted twice?
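
For spotting the duplicated mount without reading `cat /proc/self/mountinfo` by eye, a small Python sketch could look like this. It assumes the mountinfo field layout described in proc(5) (root and mount point as the 4th and 5th fields, then a " - " separator followed by filesystem type, source and super options); the function name is hypothetical.

```python
def systemd_hierarchy_mounts(mountinfo_text):
    """Return (root, mount_point) pairs for cgroup mounts carrying name=systemd."""
    mounts = []
    for line in mountinfo_text.splitlines():
        # Everything after " - " is: fstype, mount source, super options.
        left, sep, right = line.partition(" - ")
        if not sep:
            continue
        pre, post = left.split(), right.split()
        if len(pre) < 6 or len(post) < 3:
            continue  # malformed or truncated line, skip it
        fstype, super_opts = post[0], post[2].split(",")
        if fstype == "cgroup" and "name=systemd" in super_opts:
            mounts.append((pre[3], pre[4]))
    return mounts
```

Feeding it the contents of /proc/self/mountinfo and getting more than one pair back would indicate that the named systemd hierarchy is mounted more than once in that mount namespace; a root other than "/" shows which subtree was bind-mounted.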
 
Thursday, November 19, 2020 3:25 AM +09:00 from Michal Koutný <mkoutny@xxxxxxxx>:
 
Thanks for the details.

On Mon, Nov 16, 2020 at 09:30:20PM +0300, Andrei Enshin <b1os@xxxxx> wrote:
> I see the kubelet crash with error: «Failed to start ContainerManager failed to initialize top level QOS containers: root container [kubepods] doesn't exist»
> details: https://github.com/kubernetes/kubernetes/issues/95488
I skimmed the issue and noticed that your setup uses 'cgroupfs' cgroup
driver. As explained in the other messages in this thread, it conflicts
with systemd operation over the root cgroup tree.

> I can see same two mounts of named systemd hierarchy from shell on the same node, simply by `$ cat /proc/self/mountinfo`
> I think kubelet is running in the «main» mount namespace which has weird named systemd mount.
I assume so as well. It may be a residual inside kubelet context when
environment was prepared for a container spawned from within this
context.

> I would like to reproduce such weird mount to understand the full
> situation and make sure I can avoid it in future.
I'm afraid you may be seeing results of various races between systemd
service (de)activation and container spawnings under the "shared" root
(both of which comprise cgroup creation/removal and migrations).
There's a reason behind the cgroup subtree delegation.

So I'd say there's not much to do from systemd side now.


Michal
 
 
 

---

Best Regards,
Andrei Enshin

 
_______________________________________________
systemd-devel mailing list
systemd-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
