Re: Adding Dusty Access/Membership to Releng/Infra

On 29 November 2017 at 09:49, Dusty Mabe <dusty@xxxxxxxxxxxxx> wrote:
> Thanks, Patrick, for the response. I'm going to cut/snip individual pieces to respond
> to some more specific questions/comments and then try to summarize at the end.
>
>> I already asked three months ago for SOPs on how the ostree stuff works, and learning materials for releng, but have so far not seen those (https://pagure.io/releng/issue/6984#comment-460533 and some meetings).
>> I might have missed them, but if you have those resources, other releng folks can help with more things, which would massively speed things up.
>
> I wholeheartedly agree and I plan on getting those SOPs written. I'd actually like to do
> a workshop (maybe at the next hackfest) for everyone.
>
>
>> The permissions you would need to do the things you request *do* allow you to do "willy nilly" what you want.
>
> What I was trying to say is that I'm not planning to abuse the power. If I abuse power then
> revoke access. I'm really not the type of person that makes changes without asking people. I
> would probably make a lot of noise before doing something without waiting for input from others
> and I would only do it if I thought it was critical.
>
>> E.g. access to modify ostree repos manually requires full write access to /mnt/koji and /pub.
>> I have tried to point out the specific pieces of access I would be fine with, but personally I am really not happy with the large set of permissions requested, and a set of them I would really be uncomfortable with.
>> One of the reasons for that feeling from me is the number of things we have needed to do ad-hoc to ostree repositories, like resetting refs etc, because the version numbers got out of step due to ostree bugs.
>
> I agree that there have been some issues with ostree, but I think there have been a fair number of
> issues from the releng/infra side that have caused ostrees to get out of sync as well. Fortunately
> when we switch to smart versioning this will matter less and we can deal with ostree bugs or releng/
> infra transient issues with more grace.
>
>
> So in summary from your response I gather I am requesting:
>
> - Releng group on Pagure
> - Running specific playbooks
> - admin in koji
> - root on compose boxes
> - root on bodhi-backend01
> - sysadmin-fedimg
> - sysadmin-releng
> - Access to all signing keys
>
> I'll actually remove two items from that list:
>
> - Access to all signing keys
> - admin in koji
>
> I really don't want access to signing keys; I understand why that needs to
> be locked down as much as possible. Also, I don't really think I need admin in
> koji assuming root on compose box will allow me to kick off things I need to.
>
> For the rest I really don't see why it's not OK. Basically it comes down to
> two issues I can see (let me know if there are others):
>
> - trust
> - competence
>
> If I can't be trusted by now then that is unsettling and discouraging to me.
> As for competence, I guess that depends on who you ask, but I'd like to think
> I would pass that bar as well.

In security, no one is trusted. I am not trusted, Kevin is not
trusted, Patrick is not trusted. Dennis and Mohan are not trusted.
This is because we are foolable and fallible. Even worse, our accounts
are not ourselves: someone else could use them to look like us.
This gives us each a potential reliability of 0.8 and a combined
reliability of 0.1 (i.e. 90% of the time our root powers combined will
cause things to be worse).
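
As a rough sketch of that arithmetic: if each root holder acts
independently (an assumption, not something measured), the combined
reliability is the product of the individual ones. The function name
and the head counts below are hypothetical, chosen only to illustrate.

```python
from math import prod

def combined_reliability(reliabilities):
    """Chance that every independent actor gets it right (assumed model)."""
    return prod(reliabilities)

# Hypothetical head counts: the five people named above, and ten.
five_admins = combined_reliability([0.8] * 5)
ten_admins = combined_reliability([0.8] * 10)

print(round(five_admins, 3))  # 0.328
print(round(ten_admins, 3))   # 0.107
```

Under this model five people at 0.8 land near 0.33, and it takes
roughly ten to reach the 0.1 figure, so the numbers are best read as
order-of-magnitude estimates.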

Second, our infrastructure is very much an organically grown ball of
kerosene-soaked yarn. You get root on box A and magically you can
affect something on box C because someone decided that << fill in
funny-named service of the week >> needed to be used for our brand new
<< fill in feature that we try in Fedora N and shove aside in N+2 >>
and both systems use it. And we can't remove that feature or tool
because someone built it into koji or fas or some other critical tool
to cut down development time because it needed to be 'working' now.
[Most of our services are 0.95 reliable, but combined in their myriad
ways the total is probably 0.4 -> 0.6. In other words, 40-60% of the
time something is not working right for someone.]
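
To put the service figures in perspective, here is a small sketch,
again assuming independent failures (an assumption for illustration
only), of how many 0.95-reliable services must be chained before the
combined figure drops to a given target. The function name is
hypothetical.

```python
import math

def services_for_target(per_service, target):
    """Smallest n with per_service ** n <= target (independence assumed)."""
    return math.ceil(math.log(target) / math.log(per_service))

print(services_for_target(0.95, 0.6))  # 10
print(services_for_target(0.95, 0.4))  # 18
```

So under this simple model, somewhere between ten and eighteen
interdependent services is enough to land in the 0.4 -> 0.6 range.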

The two systems you want root access to are some of the most
interconnected at the kerosene level. Everyone who does have root on
them has accidentally mangled all of Fedora in some way for several
hours or days because we were trying to fix something and broke
something else. [I think Dennis made archives completely unreadable
twice and I moved all of them to ... using a script which had been
looked at by two other sysadmins before I ran it. Those were the easy
ones to undo; there have been others which took going through backups
to fix.]

It would be nice if those boxes weren't the worst, but every time we
redesign things to be saner, we end up with a << must have this new
service installed or Fedora will FAIL >> that can only meet the
release deadline by being put on them. We try to clean up before the
next release, but oh look, another << if Fedora doesn't have this
feature it will FAIL >> comes up.

And when things fail, those of us with root are the ones who
collectively get blamed for it. Why did you let smooge have rights to
the system? Why did you not check his actions before he did them? How
did you not catch a ... in that line of code? These may all sound like
reasonable questions, but to the people who are dealing with the
problem they get picked up as "How did you let this moron ever do
this?" And because things fail so much, it eventually comes across as
"You people are completely inept".

All of this:
* High chance that someone as root is going to make a mistake, and the
more people who have it, the more likely that is.
* High chance that a failure on some systems will spread like fire
through all the rest of infrastructure
* Perceived blame

makes adding anyone else a very difficult process. We can be
prickly because of the third and should mitigate it, but the real
problems are the first two in the list.


> Dusty
> _______________________________________________
> infrastructure mailing list -- infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to infrastructure-leave@xxxxxxxxxxxxxxxxxxxxxxx



-- 
Stephen J Smoogen.



