On 05/15/2018 08:12 PM, Zach Villers wrote:
>
> Not what folks are asking, but you can pick up a lot just by hanging
> out and watching the on-call person mutter to themselves in IRC (not
> sure that is the right phrase, umm...). I don't know if this will
> translate for everyone, but there is a concept of "rubber duck"
> problem solving, where if you have a particularly difficult issue and
> you explain it to someone, it helps you solve the problem more easily.
> I don't know if this is how everyone works, and it really doesn't help
> if someone is jumping up and down and quacking while you are trying to
> think. I guess my point is, just hanging out unobtrusively when you
> can is fairly helpful all around.

Yeah! https://en.wikipedia.org/wiki/Rubber_duck_debugging

But yes, we already do talk in IRC about what we are doing, what's going
wrong, and what the fix might be. Everyone is welcome to watch and ask
questions.

> If I recall, alerts are pretty easily accessible. You can poke around
> on Nagios if there are issues. Obviously if everything is down/red,
> it's not a good time to ask for help with your ssh access.

Indeed. Yep. Nagios is pretty available to all.

I can try to make sure I note exactly what I am doing to clear an
alert... sometimes I am bad about saying "fixing that" or "poking that"
without saying what exactly is going on.

>
> A couple of ideas:
>
> - stream your terminal session when working an outage (could be hard
>   to find a 100% FOSS version that is secure)

Yeah. ;( There are things like tmux that might make this possible.
However, we do want to make sure someone else cannot take control of our
sessions. :)

I did look for some screencast-type software for the command line a
while back and was disappointed that all of them needed something
non-free or a website to view/decode. ;(

I guess there is always 'typescript'.

> - plan some outages in stage for apprentices to work at some time when
>   tickets are low and nothing urgent is planned (I don't know that
>   I've ever heard of such a time, but in theory it could exist)

Ha. Yeah, that's a nice idea. One thing I would very much like to do is
move all the *stg* services to a noc01.stg instance that doesn't page
and only sends IRC notices and non-urgent email. Once that's separate we
could indeed try to run some alerts. :)

>
> Of course, it's late here, so this may all turn out to be nonsense,
> but good discussion anyway.

No no, it was great. I think discussing this is good... and hopefully we
can get folks more involved (however that happens).

kevin