On Thu Apr 30 19:01:51 UTC 2020, Kumar Kartikeya Dwivedi <memxor at gmail.com> wrote:

> My educated guess is that the session slice is taking time to clean up
> because of the default stop timeout due to processes in the session
> not responding to SIGTERM (it is waiting for dependent units to stop
> before stopping itself by virtue of being ordered after them), and in
> the meantime a new login is initiated. Now, the stop job on the user
> slice is still waiting, and looking at the code in logind,
> manager_start_scope uses the job mode fail, which means the start job
> for the user slice won't be able to cancel the sitting stop job, hence
> the transaction cannot be applied.

When you say the session slice is taking time to clean up, please excuse my ignorance as I'm not really up to speed on how session slices are being managed by systemd, but why would it matter if a slice takes time to clean up? Is there a limit on how many SSH session slices a user can have or something? I don't see how that would cause this particular error.

This is happening a few times a day. Looking at the logs, we have SSH sessions opening and closing successfully all day long for various different accounts. Occasionally we see the problematic error message.

I've correlated with other messages and discovered that at the same time that we get this in /var/log/messages (the original reported error):

2020-05-03T16:09:38.735265-04:00 jupiter systemd[1]: Requested transaction contradicts existing jobs: Transaction for session-240481.scope/start is destructive (user-17132.slice has 'stop' job queued, but 'start' is included in transaction).
2020-05-03T16:09:38.735430-04:00 jupiter systemd-logind[1588]: Failed to start session scope session-240481.scope: Transaction for session-240481.scope/start is destructive (user-17132.slice has 'stop' job queued, but 'start' is included in transaction).
2020-05-03T16:09:38.735642-04:00 jupiter systemd[1]: Removed slice User Slice of jupiter-user.
...

we also see this message in /var/log/secure:

2020-05-03T16:09:38.737122-04:00 jupiter sshd[11031]: pam_systemd(sshd:session): Failed to create session: Resource deadlock avoided
2020-05-03T16:09:38.737380-04:00 jupiter sshd[11031]: pam_unix(sshd:session): session opened for user jupiter-user by (uid=0)

Does this 'Resource deadlock avoided' message from pam_systemd help identify the root cause, or is it just a side effect?

Thanks,
Mark
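
For reference, the timestamp correlation described above could be sketched roughly as follows. This is only an illustration, not the procedure actually used: the function names (`parse`, `correlate`), the 10 ms window, and the idea of loading both files into lists are assumptions. The log lines are copied from the excerpts in this message, with the leading "2" of the pam_systemd timestamp assumed to have been cut off in the paste.

```python
from datetime import datetime, timedelta

# Lines copied from the /var/log/messages excerpt in this message.
MESSAGES_LOG = [
    "2020-05-03T16:09:38.735265-04:00 jupiter systemd[1]: Requested transaction contradicts existing jobs: Transaction for session-240481.scope/start is destructive (user-17132.slice has 'stop' job queued, but 'start' is included in transaction).",
    "2020-05-03T16:09:38.735430-04:00 jupiter systemd-logind[1588]: Failed to start session scope session-240481.scope: Transaction for session-240481.scope/start is destructive (user-17132.slice has 'stop' job queued, but 'start' is included in transaction).",
    "2020-05-03T16:09:38.735642-04:00 jupiter systemd[1]: Removed slice User Slice of jupiter-user.",
]

# Lines copied from the /var/log/secure excerpt in this message.
# Assumption: the first timestamp is "2020-...", with the leading "2" lost in the paste.
SECURE_LOG = [
    "2020-05-03T16:09:38.737122-04:00 jupiter sshd[11031]: pam_systemd(sshd:session): Failed to create session: Resource deadlock avoided",
    "2020-05-03T16:09:38.737380-04:00 jupiter sshd[11031]: pam_unix(sshd:session): session opened for user jupiter-user by (uid=0)",
]


def parse(line):
    """Split a line with an ISO 8601 timestamp prefix into (datetime, remaining text)."""
    stamp, _, text = line.partition(" ")
    return datetime.fromisoformat(stamp), text


def correlate(secure_lines, messages_lines, needle, window):
    """For every secure line containing `needle`, collect the messages lines
    logged at most `window` before it, together with the time difference."""
    parsed_messages = [parse(line) for line in messages_lines]
    hits = []
    for line in secure_lines:
        when, text = parse(line)
        if needle not in text:
            continue
        nearby = [
            (when - logged_at, message)
            for logged_at, message in parsed_messages
            if timedelta(0) <= when - logged_at <= window
        ]
        hits.append((text, nearby))
    return hits


if __name__ == "__main__":
    window = timedelta(milliseconds=10)  # hypothetical window, tune as needed
    for text, nearby in correlate(SECURE_LOG, MESSAGES_LOG, "Failed to create session", window):
        print(text)
        for delta, message in nearby:
            print(f"  {delta.total_seconds() * 1000:.3f} ms earlier: {message}")
```

On real systems the two lists would be read from /var/log/secure and /var/log/messages; the window only needs to be wide enough to cover the gap between systemd logging the rejected transaction and pam_systemd reporting the failure.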