Re: Zombie processes being created when console buffer is full

On 03/10/2016 12:34 AM, Martin Kletzander wrote:
> I'm not able to reproduce your issue, but that might be because I'm not
> running systemd neither in the container nor in the host.  If we can
> reproduce it without systemd, though, that would be very helpful for
> finding out the cause of all this.

We're seeing this on CentOS 7.1, which is systemd based. We were able to determine that the problem is caused by a container's console buffer filling up. In a container (or VM) the console is of course not a real physical device; it's a pseudo tty. With a physical console, when some process writes something to /dev/console, it appears on the physical console, and if no one is there to see the text it eventually scrolls off the screen and is lost. There is no limit to how much text can be sent to the console.

In the case of a container and its pseudo console, there is a buffer associated with the console device, and this buffer has a size limit. If there is an active console session open for a container, any text sent to the container's console (e.g. by systemd) is consumed by that session. However, if there is no active console session, the buffer associated with this pseudo console fills up as processes continue to write to the container's console device. Once it is full, any process that attempts to write to the container's console blocks and stays blocked until a console session is started. These hung processes were the source of our zombie processes.
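The buffer limit can be seen outside of libvirt with a plain pseudo tty pair. The following is only a minimal sketch, assuming a Linux host with pty support; the function name fill_pty_buffer and its parameters are made up for illustration. It writes to the slave side in non-blocking mode while nothing reads the master side, and stops at the point where a normal blocking write would hang:

```python
import os


def fill_pty_buffer(chunk=b"x" * 1024, max_bytes=16 * 1024 * 1024):
    """Write to a pty slave that nobody reads until the write would block.

    Returns the number of bytes the pty accepted before it was full.
    (Hypothetical helper, for illustration only.)
    """
    master_fd, slave_fd = os.openpty()
    try:
        # Non-blocking mode turns "would hang forever" into BlockingIOError.
        os.set_blocking(slave_fd, False)
        written = 0
        while written < max_bytes:
            try:
                written += os.write(slave_fd, chunk)
            except BlockingIOError:
                # Buffer is full: a blocking writer would be stuck here.
                break
        return written
    finally:
        os.close(slave_fd)
        os.close(master_fd)


if __name__ == "__main__":
    print("pty accepted", fill_pty_buffer(), "bytes before filling up")
```

The exact number of bytes accepted depends on the kernel, so the sketch reports it rather than assuming a value.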

We solved the problem by writing a console monitor service that runs on the hypervisor hosting the containers. It continually monitors the console devices of all containers. If there is an open console session for a given container, it does nothing. If, however, there is no active console session, it opens the console device for the container and drains it using the following Python code:

            import os
            import termios
            # Discard data queued on the console device but not yet read.
            fd = os.open(console, os.O_RDWR | os.O_NOCTTY)
            termios.tcflush(fd, termios.TCIFLUSH)
            os.close(fd)

For expediency, we do not save the text that's discarded. This is ultimately similar to text scrolling off the top of a physical console.
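If the text were worth keeping, the flush could be swapped for a non-blocking read loop. This is not what our monitor does; it is a rough sketch of that alternative, and the names drain_fd and drain_console are hypothetical. It also assumes the console device is in raw mode, since in canonical mode a read only returns complete lines:

```python
import os


def drain_fd(fd, chunk_size=4096):
    """Read and return everything currently queued on fd without blocking."""
    os.set_blocking(fd, False)
    chunks = []
    while True:
        try:
            data = os.read(fd, chunk_size)
        except BlockingIOError:
            break  # nothing more queued right now
        if not data:
            break  # end of file: the other side was closed
        chunks.append(data)
    return b"".join(chunks)


def drain_console(console):
    """Open a console device path (e.g. /dev/pts/3) and return its queued text."""
    fd = os.open(console, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
    try:
        return drain_fd(fd)
    finally:
        os.close(fd)
```

The returned bytes could then be appended to a per-container log file instead of being thrown away.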

So, although this monitor service has solved our issue with zombie processes, I'm not convinced this is really the right solution. I'd like to think that if a container is set up correctly, its console device should not fill up and block processes that attempt to write to it. I would think this would be a big problem for anyone running containers under libvirt_lxc. The problem is easy to reproduce in our environment: Open a console session to a container and run "cat" with no arguments. Leave it running and disconnect the console session (control-]). Determine the container's console device from its xml definition, e.g. /dev/pts/3, and then copy some large file to it, e.g.

         # cp /var/log/messages /dev/pts/3

Assuming the file is larger than the console's backing buffer, this cp command should hang. If you then open a console session to this container from another window, you'll see the contents of /var/log/messages appear on the screen and the cp command in the other window will exit.
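The same hang-then-resume behaviour can be approximated without a container. The sketch below is only an analogy under stated assumptions: it uses a local pty pair on a Linux host rather than a libvirt console, a writer thread stands in for the cp command, and reading the master side stands in for opening a console session. The function name writer_unblocks_when_drained is made up for illustration.

```python
import os
import threading


def writer_unblocks_when_drained(payload_size=1024 * 1024):
    """Return (blocked, finished, received) for a writer on an unread pty."""
    master_fd, slave_fd = os.openpty()
    payload = b"x" * payload_size
    done = threading.Event()

    def write_all():
        # Blocking writes, like cp writing to the console device.
        view = memoryview(payload)
        while view:
            n = os.write(slave_fd, view)
            view = view[n:]
        done.set()

    writer = threading.Thread(target=write_all, daemon=True)
    writer.start()

    # With no reader attached, the writer should still be stuck after 1s.
    blocked = not done.wait(timeout=1.0)

    # "Open a console session": read the master side until all data arrived.
    received = 0
    while received < payload_size:
        received += len(os.read(master_fd, 65536))

    finished = done.wait(timeout=5.0)
    os.close(slave_fd)
    os.close(master_fd)
    return blocked, finished, received


if __name__ == "__main__":
    print(writer_unblocks_when_drained())
```

If the first value is False on some system, the payload fit in the pty buffer and a larger payload_size would be needed to see the hang.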

If you are unable to reproduce it in your setup following this procedure, then either something is wrong with my container configuration or there is something more insidious going on. I'd appreciate it if you could run a test with this procedure and let me know the results.

Thanks.

Peter

_______________________________________________
libvirt-users mailing list
libvirt-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvirt-users


