It happened again. But now I have more traces and hints.
This morning (9:33 UTC) I got a Nagios alert:
WARN: datanommer has not seen a copr message in 6 hours, 10 minutes, 39 seconds
which means that something happened sometime between 3:30 UTC and 4:30 UTC.
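As a rough cross-check of that window, the time of the last copr message can be worked out from the alert text. This is only a minimal sketch in Python, assuming the alert was generated at the moment it arrived (9:33 UTC); `parse_age` is an illustrative helper, not part of Nagios or datanommer.

```python
import re
from datetime import datetime, timedelta

# Alert text exactly as received.
ALERT = "WARN: datanommer has not seen a copr message in 6 hours, 10 minutes, 39 seconds"


def parse_age(message):
    """Extract the 'N hours, N minutes, N seconds' age from the alert text."""
    match = re.search(r"(\d+) hours?, (\d+) minutes?, (\d+) seconds?", message)
    if match is None:
        raise ValueError("no age found in alert text")
    hours, minutes, seconds = (int(group) for group in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# Only the time of day matters here; strptime fills in a placeholder date.
# Assumption: the alert was generated at 09:33 UTC, when it was received.
alert_time = datetime.strptime("09:33", "%H:%M")
last_seen = alert_time - parse_age(ALERT)

print(last_seen.strftime("%H:%M:%S"))  # 03:22:21
```

So the last message was seen at about 03:22 UTC, and whatever broke the builds happened some time after that.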
I logged in to copr-be and, to my surprise, the playbook failed:
ansible-playbook -vvvv -c ssh /home/copr/provision/builderpb.yml
ERROR: debug is not a legal parameter in an Ansible task or handler
even though nothing had been changed overnight.
Then I found that:
rpm -V ansible
...
missing /usr/share/ansible/utilities
missing /usr/share/ansible/utilities/accelerate
missing /usr/share/ansible/utilities/debug
missing /usr/share/ansible/utilities/fail
missing /usr/share/ansible/utilities/include_vars
missing /usr/share/ansible/utilities/pause
missing /usr/share/ansible/utilities/set_fact
missing /usr/share/ansible/utilities/wait_for
I.e. the whole content of /usr/share/ansible/utilities is missing.
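To make the same check repeatable (for example from cron, before tonight), the `rpm -V` output can be reduced to just the missing paths, grouped by directory. A minimal sketch in Python, assuming the output is passed in as text; `missing_paths` and `group_by_parent` are hypothetical helper names, and the sample contains only the lines quoted above (the elided "..." part is left out).

```python
import os
from collections import defaultdict

# Output of `rpm -V ansible` as quoted above, without the elided part.
RPM_VERIFY_OUTPUT = """\
missing /usr/share/ansible/utilities
missing /usr/share/ansible/utilities/accelerate
missing /usr/share/ansible/utilities/debug
missing /usr/share/ansible/utilities/fail
missing /usr/share/ansible/utilities/include_vars
missing /usr/share/ansible/utilities/pause
missing /usr/share/ansible/utilities/set_fact
missing /usr/share/ansible/utilities/wait_for
"""


def missing_paths(verify_output):
    """Return the paths that the verify output reports as missing."""
    paths = []
    for line in verify_output.splitlines():
        fields = line.split()
        # Such lines start with the word "missing"; the path is the last field.
        # Assumption: paths contain no whitespace.
        if fields and fields[0] == "missing":
            paths.append(fields[-1])
    return paths


def group_by_parent(paths):
    """Group missing paths by their parent directory."""
    grouped = defaultdict(list)
    for path in paths:
        grouped[os.path.dirname(path)].append(os.path.basename(path))
    return dict(grouped)


for parent, names in group_by_parent(missing_paths(RPM_VERIFY_OUTPUT)).items():
    print(parent, "->", ", ".join(names))
```

On a live system the text could come from something like `subprocess.run(["rpm", "-V", "ansible"], capture_output=True, text=True).stdout`; note that `rpm -V` exits non-zero when verification fails, so the exit code should not be treated as an error there.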
I quickly reinstalled the ansible package and everything started working again.
Now I have to find the cause, otherwise I expect it will happen again tonight.
I checked syslog and the only relevant entries are:
1)
Feb 28 03:46:22 dhcp-client03 systemd[1]: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 24347 (find)
Feb 28 03:46:22 dhcp-client03 systemd[1]: Mounting Arbitrary Executable File Formats File System...
Feb 28 03:46:22 dhcp-client03 systemd[1]: Mounted Arbitrary Executable File Formats File System.
2)
Feb 28 04:04:05 dhcp-client03 systemd-logind[291]: New session 24 of user root.
Feb 28 04:04:05 dhcp-client03 ansible-yum: Invoked with CHECKMODE=True name=cloud-utils list=None
disable_gpg_check=False conf_file=None state=present disablerepo=None enablerepo=None
Feb 28 04:04:05 dhcp-client03 systemd-logind[291]: Removed session 24.
Feb 28 04:04:05 dhcp-client03 systemd-logind[291]: New session 25 of user root.
Feb 28 04:04:05 dhcp-client03 ansible-command: Invoked with executable=None shell=False args=growpart /dev/vda 2
removes=None creates=None chdir=None
Feb 28 04:04:06 dhcp-client03 systemd-logind[291]: Removed session 25.
Feb 28 04:04:06 dhcp-client03 systemd-logind[291]: New session 26 of user root.
Feb 28 04:04:06 dhcp-client03 ansible-setup: Invoked with CHECKMODE=True filter=* fact_path=/etc/ansible/facts.d
Feb 28 04:04:06 dhcp-client03 systemd-logind[291]: Removed session 26.
Feb 28 04:04:07 dhcp-client03 systemd-logind[291]: New session 27 of user root.
Feb 28 04:04:07 dhcp-client03 ansible-yum: Invoked with CHECKMODE=True name=fedmsg,libsemanage-python,python-psutil
list=None disable_gpg_check=False conf_file=None state=installed disablerepo=None
pkg=fedmsg,libsemanage-python,python-psutil enablerepo=None
Feb 28 04:04:42 dhcp-client03 systemd-logind[291]: Removed session 27.
I am not sure about the first one.
The second one is some Ansible playbook run (could it be nirik's check for differences?)
But I really have no clue how it could remove /usr/share/ansible/utilities/*
Does anybody have an idea?
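To go through a longer stretch of syslog than the excerpt above, the Ansible module invocations can be pulled out together with their timestamps and whether they carried CHECKMODE=True. A minimal sketch in Python; `ansible_invocations` is a hypothetical helper, the regular expression only assumes the line layout seen in the excerpt, and the sample uses the first physical line of each wrapped entry.

```python
import re

# First physical line of each relevant syslog entry quoted above.
SYSLOG = """\
Feb 28 03:46:22 dhcp-client03 systemd[1]: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 24347 (find)
Feb 28 04:04:05 dhcp-client03 ansible-yum: Invoked with CHECKMODE=True name=cloud-utils list=None
Feb 28 04:04:05 dhcp-client03 ansible-command: Invoked with executable=None shell=False args=growpart /dev/vda 2
Feb 28 04:04:06 dhcp-client03 ansible-setup: Invoked with CHECKMODE=True filter=* fact_path=/etc/ansible/facts.d
Feb 28 04:04:07 dhcp-client03 ansible-yum: Invoked with CHECKMODE=True name=fedmsg,libsemanage-python,python-psutil
"""

# Matches lines such as "Feb 28 04:04:05 host ansible-yum: Invoked with ...".
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} +\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"ansible-(?P<module>[\w-]+): Invoked with (?P<args>.*)$"
)


def ansible_invocations(log_text):
    """Return (timestamp, module, check_mode) for each Ansible module call."""
    result = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            check_mode = "CHECKMODE=True" in match["args"]
            result.append((match["stamp"], match["module"], check_mode))
    return result


for stamp, module, check_mode in ansible_invocations(SYSLOG):
    print(stamp, module, "check mode" if check_mode else "no CHECKMODE in args")
```

On the excerpt this lists four invocations (yum, command, setup, yum); the `command` one (growpart) is the only one whose logged arguments do not contain CHECKMODE=True.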
--
Miroslav Suchy, RHCE, RHCDS
Red Hat, Senior Software Engineer, #brno, #devexp, #fedora-buildsys