koji db issues (2016-02-23 and 24)

I thought I would send out a note on what we know so far and the
current status of the koji db.

Last night the vm that runs koji's database went unresponsive.
Rebooting it brought it up in a very degraded state where it couldn't
find some of its disks and never fully started. An additional reboot
brought it back up, but as soon as it started serving database requests
it went into a state where it was waiting on i/o and not processing
anything. We rebooted the virthost that runs that vm a number of times,
and engaged networking and storage folks to look at their sides of
things.

In working on this issue we: 

* ran a database vacuum on the koji db. 
* fixed a misconfiguration in our multipathd config. 
* fixed a configuration issue on a related virthost that made its vms
  unable to connect to the storage network.

Finally things seemed to settle down early this morning and we were
able to bring the database back online. Later in the morning there was
another short period of heavy i/o wait, but it recovered without
intervention on our part. 

The root cause seems to be that the iscsi netapp volume the instance
was defined on had some connectivity or load issues and wasn't able to
keep up with the i/o from the vm. We have storage folks looking for
issues on the netapp side of things, and we are closely watching the
server end.
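
For anyone who wants to keep an eye on i/o wait on a Linux host, here
is a minimal sketch of one way to measure it. It is not the tooling we
used during the outage; the function names are made up for
illustration. It assumes the standard /proc/stat layout, where the
aggregate "cpu" line lists user, nice, system, idle, iowait, irq,
softirq and steal counters in that order.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percent of CPU time spent in iowait between two /proc/stat samples."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    # Only the first eight counters are summed: guest time (fields 9 and 10)
    # is already included in user and nice, so adding it would double count.
    deltas = [b - a for a, b in zip(first[:8], second[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, 'interval' seconds apart (Linux only)."""
    with open(path) as handle:
        before = handle.read()
    time.sleep(interval)
    with open(path) as handle:
        after = handle.read()
    return iowait_percent(before, after)
```

Calling sample_iowait() in a loop and logging the result gives a rough
trend. What counts as "too high" depends on the workload, so any alert
threshold would have to be picked by hand.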

Hopefully we are all back on track now. 

kevin


_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
http://lists.fedoraproject.org/admin/lists/infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
