Re: [RFC] cgroup gets release after long time

On Thu, May 16, 2019 at 12:39:15PM +0200, Jiri Olsa wrote:
> hi,
> Pavel reported an issue with bpf programs (attached to a cgroup)
> not being released when the cgroup is removed; they are still
> visible in the 'bpftool prog' list afterwards.

Hi Jiri!

Can you please try this patch:
https://github.com/rgushchin/linux/commit/f77afa1952d81a1afa6c4872d342bf6721e148e2 ?

It should solve the problem, and I'm about to post it upstream.

Thanks!

> 
> It seems like this is not bpf specific, because I was able
> to cut the bpf code from his example and still see delayed
> release of cgroup.
> 
> It happens only on cgroup2 fs (booted with systemd.unified_cgroup_hierarchy=1
> kernel command line option), please check the attached program
> below and following scenario:
> 
> TERM 1
> # gcc -o test test.c
> 
> 			TERM 2
> 			# cd /sys/kernel/debug/tracing
> 			# echo 1 > events/cgroup/cgroup_release/enable
> 
> TERM 1 -> create and remove group1
> # ./test group1
> qemu-system-x86_64: terminating on signal 15 from pid 1775 (./test)
> 
> 			TERM 2
> 			# cat trace_pipe
> 			<nothing>
> 
> TERM 1 -> create and remove group2
> # ./test group2
> qemu-system-x86_64: terminating on signal 15 from pid 1783 (./test)
> 
> 			TERM 2  - group1 being released
> 			# cat trace_pipe
> 			kworker/22:2-1135  [022] ....  2947.375526: cgroup_release: root=0 id=78 level=1 path=/group1
> 
> TERM 1 -> create and remove group3
> # ./test group3
> qemu-system-x86_64: terminating on signal 15 from pid 1798 (./test)
> 
> 			TERM 2 - group2 being released
> 			# cat trace_pipe
> 			kworker/22:2-1135  [022] ....  2947.375526: cgroup_release: root=0 id=78 level=1 path=/group1
> 			kworker/22:0-1787  [022] ....  2961.501261: cgroup_release: root=0 id=78 level=1 path=/group2
> 
> 
> It looks like the release of the previous cgroup is triggered by
> creating another cgroup.  If I don't do anything, the cgroup is
> released (the tracepoint fires) after about 90 seconds.
> 
> The cgroup_release tracepoint is triggered in css_release_work_fn,
> the same function where the cgroup_bpf_put is called, hence the
> delay in releasing of the bpf programs.
> 
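
To put numbers on the delay, the cgroup_release lines from trace_pipe
can be parsed for their timestamp and path. The following is only a
minimal sketch, assuming the line layout shown in the trace output
quoted above; parse_cgroup_release() is a hypothetical helper name,
not something from the kernel tree or from the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one trace_pipe line of the form quoted above, e.g.
 *   kworker/22:2-1135  [022] ....  2947.375526: cgroup_release: root=0 id=78 level=1 path=/group1
 * Returns 0 and fills *ts and path on success, -1 if the line is not
 * a cgroup_release event. Hypothetical helper, for illustration only.
 */
static int parse_cgroup_release(const char *line, double *ts,
				char *path, size_t path_len)
{
	const char *ev = strstr(line, ": cgroup_release:");
	const char *start, *p;
	size_t n;

	if (!ev || path_len == 0)
		return -1;

	/* The timestamp is the whitespace-delimited token that ends at ev. */
	start = ev;
	while (start > line && start[-1] != ' ' && start[-1] != '\t')
		start--;
	if (sscanf(start, "%lf", ts) != 1)
		return -1;

	p = strstr(ev, "path=");
	if (!p)
		return -1;
	p += strlen("path=");

	/* Copy the path up to the next whitespace, truncating if needed. */
	n = strcspn(p, " \t\r\n");
	if (n >= path_len)
		n = path_len - 1;
	memcpy(path, p, n);
	path[n] = '\0';
	return 0;
}
```

Feeding it the two lines from the trace above gives timestamps of
2947.375526 and 2961.501261, i.e. roughly 14 seconds between the
release of group1 and the release of group2.
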
> Is this expected, or is it somehow configurable? It's confusing to
> see all the bpf programs from removed cgroups still around. In Pavel's
> setup there are about 100 of them.
> 
> Note that I could reproduce this only with qemu-kvm running as the
> child process, as in the example below.
> 
> thoughts? thanks,
> jirka
> 
> 
> ---
> #include <fcntl.h>
> #include <signal.h>
> #include <stdio.h>
> #include <string.h>
> #include <sys/stat.h>
> #include <sys/types.h>
> #include <unistd.h>
> 
> #define CGROUP_PATH "/sys/fs/cgroup"
> 
> int
> main(int argc, char **argv)
> {
> 	pid_t pid = -1;
> 	char path[1024];
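> 	int rc;
> 
> 	/* The cgroup name is taken from argv[1], so it must be present. */
> 	if (argc < 2) {
> 		fprintf(stderr, "usage: %s <cgroup-name>\n", argv[0]);
> 		return 1;
> 	}
> 
> 	pid = fork();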
> 
> 	if (pid == 0) {
> 		execl("/usr/bin/qemu-kvm",
> 		      "/usr/bin/qemu-kvm",
> 		      "-display", "none",
> 		      NULL);
> 		fprintf(stderr, "failed to start qemu process\n");
> 		_exit(-1);
> 	} else {
> 		int filefd = -1;
> 		char proc[1024];
> 
> 		snprintf(path, 1024, "%s/%s", CGROUP_PATH, argv[1]);
> 
> 		sleep(1);
> 
> 		if (mkdir(path, 0755) < 0) {
> 			fprintf(stderr, "failed to create cgroup '%s'\n", path);
> 			return -1;
> 		}
> 
> 		snprintf(proc, 1024, "%s/cgroup.procs", path);
> 
> 		/* Move the child into the new cgroup. */
> 		filefd = open(proc, O_WRONLY|O_TRUNC);
> 		if (filefd >= 0) {
> 			dprintf(filefd, "%d", pid);
> 			close(filefd);
> 		}
> 
> 		sleep(1);
> 	}
> 
> 	if (pid > 0)
> 		kill(pid, SIGTERM);
> 	do {
> 		rc = rmdir(path);
> 	} while (rc != 0);
> 
> 	return 0;
> }
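
The rmdir() loop in the reproducer spins until the child has left the
cgroup. As a possible alternative, cgroup2 exposes a "populated" key
in the cgroup.events file of each non-root cgroup, so the parent could
check that instead. This is only a sketch of the idea, not part of the
original test; cgroup_populated() is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if the cgroup.events file at events_path reports
 * "populated 1", 0 if it reports "populated 0", and -1 if the file
 * cannot be read or has no "populated" key.
 * Hypothetical helper, for illustration only.
 */
static int cgroup_populated(const char *events_path)
{
	char key[64];
	int val, ret = -1;
	FILE *f = fopen(events_path, "r");

	if (!f)
		return -1;

	/* cgroup.events is a flat-keyed file: one "key value" pair per line. */
	while (fscanf(f, "%63s %d", key, &val) == 2) {
		if (strcmp(key, "populated") == 0) {
			ret = val ? 1 : 0;
			break;
		}
	}
	fclose(f);
	return ret;
}
```

In the reproducer, the parent could sleep briefly while this returns 1
for "<path>/cgroup.events" and then call rmdir() once. Whether that
changes the timing of the delayed release has not been checked here.
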



