On 5 January 2018 at 05:36, Pierre-Yves Chibon <pingou@xxxxxxxxxxxx> wrote:
> Good Morning Everyone,
>
> There has been work on fedora-hubs for a while now and there is an
> objective to make it live in staging early this year.
> However, there is a question about how we want to deploy such an
> application.
>
> So far we have asked that all our applications be packaged in RPMs. The
> main application may not be in the official Fedora repositories, but (for
> most) we have asked that all of its dependencies are.
> For example, pkgdb and pagure aren't in Fedora's repositories themselves,
> but we still build them in koji, pulling the dependencies from the
> official repos.
>
> Hubs is the second of our apps where this model is almost not workable,
> because it is written in nodejs, where every file is/can be a separate
> package and semantic versioning is sometimes not very well respected.

My short reading on semantic versioning is that it is some sort of Romantic
ideal of how the universe should work if we knew everything. Of course
nothing actually works that way, because of course my breaking change is
your minor upgrade. So I can understand it not being respected.

> The other application we have that is in nodejs is the flock registration
> application which, iirc, we run in our cloud.
>
> However, hubs is not meant to be run in our cloud.
>
> So how do we want to deploy hubs?
> Do we allow npm install? Do we want to use containers? Should it target
> openshift?
> How do we want to handle updates? (especially considering the semantic
> versioning aspect mentioned above)

I am going to back this up a bit and cover things we all understand, but
may want to revisit step by step to see if we have different
understandings. [AKA get rid of assumptions, as Dennis Gilmore said on IRC
recently.] Surprise is the opposite of engagement.

A. The software on each node must be the same at the time of running.
   a. This is to make sure that a user does not get one version when DNS
      says proxy13 for half a visit and a different version from proxy14
      for the rest.
B. The software on each node must be able to be replicated through simple
   steps.
   a. This makes sure that someone else outside of Fedora can duplicate
      what we have.
   b. And that we can rebuild a box at 2 am on no sleep and not end up with
      a node which violates A.
C. The software needs to be upgradeable with known steps. The reasons are
   similar to B.
D. The software needs to be buildable with known steps. The reasons are
   similar to B.
E. The software needs to be 'open' in a way that does not require special
   logins or 'secret' repositories.
   a. This is mainly to ensure that if we have a meltdown, the software can
      be set up without needing that special login or repository.
F. The software should not be built on the box it is being run on. This is
   for several reasons:
   a. 'Compiled' software has a tendency to diverge at build time. Anything
      from clock times to 'junk' left over from a different build can cause
      the version on system A to not act like the one on system B. This
      breaks A.
   b. The old security reasoning was that every system should have only
      enough 'tools' to run what is installed, and not to build new stuff.
      This stopped attackers from being able to 'compile' rootkits locally,
      which they would need to do due to architecture differences. Current
      technologies have either gotten too uniform to need local
      compilation, or they are built around the idea that everything is
      available via scripting.
   c. However, there is still a need to keep a system simple and auditable.
      Trying to figure out whether a 'build tool' (I include cpan/npm/pypi
      in that) got the same version and didn't leave around junk which
      makes another tool not work as expected is hard. It is also hard to
      know whether some pickle, nugget, or gem is a leftover build turd or
      a needed unit, and why it differs on each system but works the same.
      (A rough sketch of auditing this across hosts follows below.)
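To make that auditability point concrete, here is a rough sketch of the
kind of cross-host check I mean: hash everything under the deployed tree on
each node, then diff the manifests. The path and manifest format are made
up for illustration; this is not something we run today.

#!/usr/bin/env python3
"""Build a sha256 manifest of a deployed tree so two nodes can be diffed.

Hypothetical usage (the path is made up for illustration):
    python3 manifest.py /srv/hubs > proxy13.manifest
    # run the same on the other node, then:
    diff proxy13.manifest proxy14.manifest
"""
import hashlib
import os
import sys

def sha256_of(path):
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(65536), b''):
            h.update(chunk)
    return h.hexdigest()

def main(root):
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic walk order so manifests diff cleanly
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # record where a symlink points instead of hashing through it
                print('link   %s -> %s' % (path, os.readlink(path)))
            else:
                print('sha256 %s  %s' % (sha256_of(path), path))

if __name__ == '__main__':
    main(sys.argv[1])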
[There may be other items we are trying to meet.. but these are the ones I
can think of with a headache.]

In the past we used rpms as the method to make sure that we achieve as much
of this as possible with one tool. Via rpm we have archived an immutable
version (which meets A, B, and F) that was built from source with known
steps (which meets C, D, and E). In looking at other tools we should look
at how we can best make them fit this mold.

So, options:

1. Use rpm as the container. Just bundle it all together into one RPM and
   plop that onto the servers. This is basically what I used to have to do
   with commercial Java software long ago.. it's ugly on the build side and
   ugly on the running side, but it makes auditing easy.
2. Use dockah as the container. The plus side is that we can just deploy it
   like we do the mirrorlist everywhere.. The downside is that we are
   running the F25 mirrorlist 1 month after F25's EOL, with only 1-2 people
   knowing how to build the next version, and they have everything else on
   their 80 hour week.
3. Build it, tar it up, plop the tarball on all the boxes that need it.
   [AKA it was good enough for Enterprise software from 1965->1995, it's
   good enough for us now.] Auditability is lower, but you can still make
   checksums of every file to see if someone messed with a server.
4. Set up our own npm repositories that we use for getting the software we
   want installed. This means the software gets built using the tools it
   wants, and we control the versions of the software it can get.
5. Screw auditability.. every box makes its own node when it is built, via
   npm and other tools... if box A doesn't match box B.. just keep
   rebuilding them all until they get close enough. Most of the time this
   will work without a problem, because we have designed most software to
   be mostly reproducible, and as quickly as toolkits update, they rarely
   do so in the middle of a build.
6. Don't deploy software like this. I am putting this in for completeness.
   This was the default answer for many years, but it has kept a lot of
   software from being usable by us. There are times where this is still
   the right answer, because we already have a lot of software which we are
   barely maintaining.
7. Some combination of parts of the above.

I think we are going to need to look at 7. That said, there are a lot of
things that need to be detailed. What resources does hubs tie into? What
servers does it need to be near? What is its data backing store? Who is
writing and fixing bugs in this? That will help figure out the numbers I
forgot and the options.

I don't think we want to have any box in production doing npm installs any
more than we want them doing pypi installs. How possible is it that we can
set up our own node repositories, build a container using them, and then
deploy that via docker on some systems? (A rough sketch of checking that a
build would pull only from such a repository is below.)
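To put some flesh on option 4: a sketch of the kind of check I would want,
assuming the app ships a standard npm package-lock.json, and assuming a
hypothetical internal registry name (npm.example.fedoraproject.org is made
up, not a real service). Walk the lockfile and flag anything that would be
fetched from outside our own repository.

#!/usr/bin/env python3
"""Flag lockfile entries that resolve outside our own npm registry.

Assumes npm's package-lock.json format; the registry URL below is a
placeholder, not a real Fedora Infrastructure service.
"""
import json
import sys

OUR_REGISTRY = 'https://npm.example.fedoraproject.org/'  # hypothetical

def resolved_urls(lock):
    # v1 lockfiles nest everything under "dependencies";
    # v2/v3 lockfiles flatten entries under "packages".
    def walk_deps(deps):
        for name, info in deps.items():
            if 'resolved' in info:
                yield name, info['resolved']
            yield from walk_deps(info.get('dependencies', {}))
    yield from walk_deps(lock.get('dependencies', {}))
    for name, info in lock.get('packages', {}).items():
        if 'resolved' in info:
            yield name or '(root)', info['resolved']

def main(path):
    with open(path) as f:
        lock = json.load(f)
    bad = [(n, u) for n, u in resolved_urls(lock)
           if isinstance(u, str) and not u.startswith(OUR_REGISTRY)]
    for name, url in sorted(set(bad)):
        print('outside our registry: %s <- %s' % (name, url))
    return 1 if bad else 0

if __name__ == '__main__':
    sys.exit(main(sys.argv[1]))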
> What do people think?
>
> Thanks,
> Pierre

--
Stephen J Smoogen.