My CUPS daemon is still locking up; I keep looking at this and keep getting nowhere :-(
I don't know what causes the cups/backend/serial process to start, as it's not always there. Turning the modem/printer on & off doesn't seem to do it, and neither does checking the queue (I can get a lock-up with the printer off and no attempt at printing having been made).
If ever the cupsd needs restarting (rpm update or logrotate), I usually find it hung on this cups/backend/serial process.
This process has one file open - a pipe (don't know what to do with the [nnn] number associated with it). Attaching strace to it immediately makes it carry on and disappear, as it probably should have in the first place.
Similarly, jumping in with gdb tells me that I'm in "??" inside "read()" inside "main()". Continuing reports a SIGPIPE (I think I get that with the strace too) and an immediate exit (again).
I'm at a complete loss. Why is this process just sitting there, and how come strace/gdb makes it work again? I've even tried forcing innocuous signals at it to make it wake up (SIGPIPE included, IIRC) but to no avail.
A long listing ("ls -l") of the files in "/proc/PID/fd/" may produce entries of the form "pipe:[XXXXX]" or "socket:[XXXXX]". "pipe" refers to the pipe filesystem ("pipefs"); "socket" refers to the socket filesystem ("sockfs"). The filesystems known to the kernel can be viewed in "/proc/filesystems". "XXXXX" is the inode number in the corresponding filesystem. Remember that each filesystem has its own inodes. Neither pipefs nor sockfs is associated with a block device (hence the "nodev" entries in "/proc/filesystems"); they are internal to the kernel.
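As a rough illustration of reading those entries programmatically, here is a minimal Python sketch. It assumes a Linux "/proc" and enough privilege to read the target process's "fd" directory; the function names are made up for this example.

```python
import os
import re

# Matches link targets such as "pipe:[12345]" or "socket:[67890]".
_ANON = re.compile(r"^(pipe|socket):\[(\d+)\]$")

def parse_fd_target(target):
    """Return (kind, inode) for a pipefs/sockfs link target, else None."""
    m = _ANON.match(target)
    return (m.group(1), int(m.group(2))) if m else None

def list_anon_fds(pid):
    """Map fd number -> (kind, inode) for one process (Linux /proc only)."""
    result = {}
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the descriptor was closed while we were looking
        parsed = parse_fd_target(target)
        if parsed:
            result[int(name)] = parsed
    return result
```

Calling list_anon_fds() with the PID of the stuck backend should give the same inode number that "ls -l" shows in brackets.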
The list-open-files command ("lsof") can be used to view information concerning the pipes that processes may be using (and a wealth of other information). The manpage for lsof is extensive. Use lsof to cross-reference the inodes in use.
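If lsof is not to hand, the same cross-reference can be approximated by walking "/proc" directly. This is only a sketch under the same Linux assumptions; find_pipe_users() is a hypothetical helper, and it needs to run as root to see other users' processes.

```python
import os

def find_pipe_users(inode, proc="/proc"):
    """Return sorted (pid, fd) pairs whose descriptor is pipe:[inode]."""
    wanted = f"pipe:[{inode}]"
    users = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        fd_dir = os.path.join(proc, entry, "fd")
        try:
            names = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for name in names:
            try:
                if os.readlink(os.path.join(fd_dir, name)) == wanted:
                    users.append((int(entry), int(name)))
            except OSError:
                continue  # descriptor vanished between listdir and readlink
    return sorted(users)
```

If the only PID returned is the backend itself, nothing else holds the other end of that pipe.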
A pipe should have at least one process reading and one process writing. It may be that a reader is blocked by an absent writer. If the same process is both reader and writer, then that process may be deadlocked. The reader and writer may be miscommunicating. Since attaching a tracer or a debugger to the "backend" process causes a continuation, one might suspect some sort of nondeterministic condition such as a race or a deadlock.
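The basic reader/writer rules can be seen in a few lines of Python. This is a generic demonstration of pipe behaviour, not a reproduction of what the CUPS backend does; the function names are invented for the example.

```python
import os

def eof_after_writer_closes():
    """Show that a reader sees end-of-file once every write end is closed."""
    r, w = os.pipe()
    os.write(w, b"job data")
    first = os.read(r, 1024)   # returns the buffered bytes
    os.close(w)                # the last write end is now closed
    # With the write end still open and the pipe empty, this read would
    # block forever; with it closed, read returns b"" (end of file).
    second = os.read(r, 1024)
    os.close(r)
    return first, second

def write_without_reader():
    """Show that writing to a pipe with no reader fails with EPIPE."""
    r, w = os.pipe()
    os.close(r)
    try:
        os.write(w, b"x")
    except BrokenPipeError:
        # Python ignores SIGPIPE by default, so the error surfaces as an
        # exception; a C program with default handling is killed by SIGPIPE.
        return True
    finally:
        os.close(w)
    return False
```

A reader that sits in read() while some process, possibly itself, still holds the write end open will wait indefinitely, which may or may not be what is happening here.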
If either the reader or the writer process cannot be found, try altering the CUPS restart script "/etc/rc.d/init.d/cups" to save the output of lsof before and after both the "stop" and the "start", and then compare the results. Remember that this type of time-dependent alteration can affect a nondeterministic process in deceptive ways.
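The comparison itself could be done with something like the following sketch. It assumes lsof is on the PATH (the "-n" and "-P" options just suppress host and port name lookups); lsof_snapshot() and diff_snapshots() are names invented here, and the same diff works on output saved to files by the init script.

```python
import difflib
import subprocess

def lsof_snapshot(cmd=("lsof", "-n", "-P")):
    """Run lsof and return its output as a list of lines."""
    done = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return done.stdout.splitlines()

def diff_snapshots(before, after):
    """Return only the added ("+") and removed ("-") lines."""
    diff = difflib.unified_diff(before, after, "before", "after",
                                n=0, lineterm="")
    return [
        line for line in diff
        if line.startswith(("+", "-"))
        and not line.startswith(("+++", "---"))  # drop the file headers
    ]
```

Expect some noise, since columns such as file offsets change between runs even for descriptors that stayed open.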
The manpage for the CUPS backend transmission interfaces (those found in "/usr/lib/cups/backend/") is called "backend". It describes the commandline interface options. You might try temporarily renaming "/usr/lib/cups/backend/serial", replacing it with "cat" or "tee", and see what kind of data is used. Process substitution (see the "bash" manpage) might be usable in some experimental way. You might also read the CUPS source code.
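A stand-in that records a little more than "cat" or "tee" might look like the sketch below. It assumes the calling convention described in the "backend" manpage (job-id, user, title, copies, options, and an optional filename as arguments, with the device URI in the DEVICE_URI environment variable); check that against your CUPS version. The log path and the capture() function are hypothetical.

```python
import os

LOG_PATH = "/tmp/backend-capture.log"  # hypothetical location

def capture(argv, stdin, log_path=LOG_PATH, environ=os.environ):
    """Append the arguments, DEVICE_URI, and job data to log_path."""
    with open(log_path, "ab") as log:
        log.write(("argv: %r\n" % (argv,)).encode())
        log.write(("DEVICE_URI: %r\n" % environ.get("DEVICE_URI")).encode())
        if len(argv) >= 7:
            # Assumption: a sixth argument names a file holding the job.
            with open(argv[6], "rb") as job:
                data = job.read()
        elif len(argv) == 6:
            data = stdin.read()  # job data arrives on standard input
        else:
            data = b""  # no job arguments, e.g. a device-discovery call
        log.write(b"data (%d bytes):\n" % len(data))
        log.write(data + b"\n")
    return len(data)
```

To use it as the replacement script, call capture(sys.argv, sys.stdin.buffer) from the entry point. Note that it never talks to the printer, so it is only suitable for a short experiment.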
----------------- major@xxxxxxxxxxx
-- Shrike-list mailing list Shrike-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/shrike-list