On Mon, 2016-02-22 at 08:09 -0500, Kamil Paral wrote:
> > or it's relatively easy to do a numeric sort instead (just filter all the non-digit characters and do a numeric sort on the rest).

> I don't follow you here. If we have
> 24-20160510.2.c
> 24-20160510.10.c
> Then after filtering out non-digit characters we have:
> 24201605102
> 242016051010
> which still sorts incorrectly:
> 242016051010
> 24201605102
>
> Unless you meant to parse the compose index out and then do a numeric sort just on that. That works.

Sorry, yes, that's what I meant. Didn't explain it very well.

> > Note that we don't really *need* to indicate the compose 'type' in the ID. We could instead just have it in the compose metadata. I don't care strongly either way, though I think it's maybe slightly more convenient to have it in the ID.

> Originally I wanted to say I like it, because it's then easy to go through the list of all the composes and still see where Alpha candidates started (if you roughly know the date) and which was the final Alpha compose (the last candidate). But then I realized that doesn't have to be true, the last candidate doesn't have to be the one accepted for release. So I don't know if we're not making it easy with the TYPE tag to do the same mistake by other people. (OTOH, the same problem is with our usual Alpha RCx names, the last one doesn't have to be the release one).
>
> Still, it seems to help humans, and we should document how to do it properly with "automata".

That's a good point, yeah. Another reason we have to explicitly track the candidate selected as the release somewhere.

> > Also not visible in the mockup: "compose override" packages are *always included in all types of compose*. This is the concept Dennis and I came up with for handling blocker / freeze exception fixes; it's just a more formal version of the current process, really, whereby we mark packages that should be pulled into composes.
> > At present these are only pulled into TCs and RCs, they never appear in the old-style "nightly composes". I believe we should *always* pull them in; it makes the system a good deal simpler.

> I'm a bit confused here. The override packages were used for TCs and RCs, but not for Live nightlies done in Koji. Pungi4 will now do all of that as part of a single compose, daily, right? So are you just saying those override packages will end up used in all compose artifacts produced, and we no longer need to care about the TC+RC vs Live nightly difference? In that case that's great.

Yes, that is what I'm suggesting. If we still have the difference, it makes things quite messy primarily for constructing an "order" of composes, because say we do a snapshot the morning after an RC is blessed; if we haven't done the stable push by then, the snapshot will inarguably have been built *later* than the RC, but will contain older packages than it. So what's the order? This is how things work ATM, and I kinda hate it.

And it just makes sense to me that "compose override" packages should wind up in all the composes, anyhow. There's no reason to leave them out of snapshots.

> > There's also another issue we could use 'nominated' to answer. That is: when exactly do we build 'CANDIDATES'? Do we follow the current process and build them only on manual request, meaning that effectively every 'CANDIDATE' is equivalent to a current RC? Or do we build a 'CANDIDATE', say, *every time the "compose overrides" set changes*, and then 'nominate' RCs from the larger set of CANDIDATEs? If we want to do that, then the 'nominated' attribute for CANDIDATE composes would indicate which were selected as RCs.

> I like the latter. More automation, less manual work for releng.

I tend to agree.
I guess I should make it explicit here that I'm kind of expecting releng, QA and any other relevant teams to be building tooling around this which causes the relevant actions to happen either automatically based on fedmsgs, or manually triggered. And I was mostly considering the "do it automatically" case. I.e., my driving consideration here was "how can we make it so every time a compose appears, the whole chain of appropriate actions kicks off and runs entirely automatically" - releng tools decide whether to stage the compose somewhere else, openQA etc. fire up and start testing, relval (or in future something less dumb) decides whether to create a manual validation "event", etc etc.

> > That's pretty much the entire system. I had thought about things like storing compose "identifiers" like RC2, RC3 etc. in the compose metadata or in PDC directly, and stuff like requiring PDC to construct and store "sequences" of releases. But with this design, I don't *think* any of that is necessary. I believe the constraints specified in the proposal and the information in the compose IDs and the extra PDC fields is actually sufficient for all the tooling purposes I can think of. The idea is that tools can simply query PDC for groups of composes and apply logic to construct certain ideas.
> >
> > For instance: say we decided we're going to build CANDIDATEs for every change to "compose overrides", and we now want to "nominate" an RC. We can just ask PDC for all CANDIDATEs for the current release which have been "nominated" so far, and it's trivial to produce the sequence of RC names from that and determine what ours should be. (To spot the milestone changes you just look for the composes which also have the 'release' attribute). So the releng tool to stage the CANDIDATE as an RC and the QA tool to create wiki pages can easily produce a nice "RC name" for humans, if we want to do that.
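
To make the "trivial to produce the sequence of RC names" idea concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: the input is a plain list of dicts standing in for whatever a PDC query would return (no real PDC client call is shown), the 'nominated' and 'release' attributes are modelled as booleans, the ID layout is the RELEASE-YYYYMMDD.INDEX.TYPE form used in the examples in this thread, and the Alpha/Beta/Final milestone order is assumed.

```python
import re

# Assumed milestone order; only Alpha and Final are named in this thread.
MILESTONES = ["Alpha", "Beta", "Final"]

# Assumed ID layout, e.g. "24-20160510.2.c" (release-date.index.type).
ID_RE = re.compile(r"^(\d+)-(\d{8})\.(\d+)\.([a-z])$")


def sort_key(compose_id):
    """Order composes by date, then by numeric index within that date."""
    m = ID_RE.match(compose_id)
    if m is None:
        raise ValueError("unrecognised compose ID: %r" % compose_id)
    return (m.group(2), int(m.group(3)))


def rc_names(composes):
    """Return ({compose id: RC name}, name the next nominated RC would get).

    composes: hypothetical records with 'id', 'nominated' and 'release' keys.
    """
    nominated = sorted(
        (c for c in composes if c.get("nominated")),
        key=lambda c: sort_key(c["id"]),
    )
    names = {}
    milestone, rc = 0, 1
    for c in nominated:
        if milestone >= len(MILESTONES):
            raise ValueError("nominated candidate after the last milestone")
        names[c["id"]] = "%s RC%d" % (MILESTONES[milestone], rc)
        if c.get("release"):
            # This candidate was released: the next one starts a new milestone.
            milestone, rc = milestone + 1, 1
        else:
            rc += 1
    if milestone < len(MILESTONES):
        next_name = "%s RC%d" % (MILESTONES[milestone], rc)
    else:
        next_name = None
    return names, next_name
```

Nothing here needs PDC to store the RC numbers themselves; they are recomputed from the 'nominated' and 'release' attributes each time, which is the point of the quoted paragraph.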
> For 24-20160510.0.c, it seems quite natural to me to talk about it as "the candidate from May 10". So even if we scratch "RC2", it doesn't seem as a big deal to me. The only issue I see is with 0-based index numbering, because for 24-20160510.2.c many people will say "the second candidate from May 10", which it is not, and confusion will arise. You did a similar mistake above when counting the number of possible single-digit-index composes (and I realized it only when writing this paragraph). Human mind can be deceiving. Should we start indexing from 1, if we were to abandon "RCx" names? (Despite all that I have to say I really like the zero there, it makes the date more readable, zeroes are somehow clutterless :)).

Well, that's a classic problem: 1s work better for humans, 0s work better for computers (starting at 1 makes humans less likely to make mistakes when reading, but more likely to make mistakes in code, because computers tend to start counting from 0, so everywhere we do indexing we have to remember to add 1, more or less).

For me it probably depends how much importance we as humans place on the compose IDs. That's the big problem I've been struggling with for the entire weekend, in fact! Seriously: naming is the hardest thing. I was out snowboarding on Saturday and kept nearly running into trees because I was too busy thinking about what a compose ID should be, what we ought to do with it, and suchlike concerns.

The more I think about it the more I tend to like my general proposal from the follow-up to this email, which is that we should build the tooling to think about the *essential properties* of composes. This is in specific contrast to building the tooling to work on *constructed* properties like the 'SNAPSHOT', 'CANDIDATE' and 'POSTRELEASE' concept I came up with above.

See, look at it like this:

What do 'SNAPSHOT', 'CANDIDATE' and 'POSTRELEASE' really mean?
They're what I'm currently thinking of as "synthetic properties", which really stand for a bunch of essential properties reduced in a particular conceptual way. The problem with this is that the way we bundle the properties up is *innately tied to our current release process*. The more I think about it, the more I'm concerned that by baking those into the compose IDs we're essentially baking certain properties of the current release process in at too low a level.

So say we look at SNAPSHOT, CANDIDATE and POSTRELEASE composes. The essential property that differs between them is really only *when* in the (current) release cycle they're built: SNAPSHOTs before the relevant freeze point is reached or all blockers for it are addressed, CANDIDATEs after the freeze point for a milestone is reached and all blockers addressed, POSTRELEASE after the "final" release is done. But they all kind of *assume* a few essential properties that aren't stated:

i) built from the repositories and compose metadata appropriate to the current 'main sequence' release process
ii) targeting all the images in current 'main sequence' release composes
iii) for the current primary arches

and the whole scheme for naming them is quite innately tied to *the current monolithic release process*. So by adopting this naming scheme, we're kind of inadvertently encoding an awful lot of things about the current release process right into the compose IDs, which I suspect may be a mistake.

Even the *release number* is really a synthetic property: "this is a Fedora 24 compose" is only a valid concept based on the system where we make regular numbered releases of Fedora (and a single sequence of increasingly-numbered releases, at that). If we bake that into the compose ID, then what if we switched to a rolling release model? What 'release number' do we use for composes from that point onwards? Just hardcode one and have it sitting there uselessly in the compose ID forever more? Change the scheme?
The compose ID's job, ultimately, is to be just that: an *identifier*. To let us uniquely identify a particular compose, mainly in order to ask other tools for information about it or to do something with it. The more I think about it the less I like the idea of encoding properties of the release in it, especially these *synthetic* properties that may become incorrect or irrelevant. So I start wanting to simplify more and more.

The current scheme for Rawhide composes is actually not bad, because it only encodes three properties, and two of those are 'essential':

i) Release number
ii) Compose date
iii) Index relative to other composes on the same date

My suggestion to just give each compose a numerical compose ID which increments by 1 every time we do a compose - *any* compose - is not very different from just doing "YYYYMMDD.(index)" or something similar. It's just that the latter kinda arbitrarily selects "date of compose" as an 'essential property' that's indicated by the compose ID instead of / as well as metadata (or PDC).

So yeah: on second thoughts, I don't think it's a good idea to construct and denote 'synthetic properties' in the compose ID. It is significantly *less* of a problem to do it in the compose metadata / PDC, because - bluntly - you can always ignore them.

It just becomes a question of what's the best design: do we aim only to store 'essential' properties in compose metadata / PDC, and construct 'synthetic' properties like "this could be an RC" above that level, or do we build the logic for constructing such properties into the compose process or PDC and store them there?

I don't know for sure, but I think it's essential that we understand that when we talk about "candidates" and "types" and even "release numbers" *that's what we're doing*, that we always keep it in mind when we build stuff, and that we build anything which constructs or uses such "synthetic properties" with the possibility in mind that those things may change.
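
For what it's worth, the three properties in the Rawhide-style ID are enough to do the "parse the index out and sort numerically" trick from the top of this mail. A minimal sketch follows, assuming the RELEASE-YYYYMMDD.INDEX layout with an optional ".TYPE" suffix as in the examples in this thread; the names `ComposeId`, `parse_compose_id`, `sort_composes` and `digits_only_key` are made up for illustration. It also shows why simply stripping the non-digits does not work in general: the digit strings end up with different lengths, so a later date can sort before an earlier one.

```python
import re
from typing import NamedTuple


class ComposeId(NamedTuple):
    release: int
    date: str   # YYYYMMDD; fixed width, so string comparison orders it correctly
    index: int  # compared as a number, so 10 sorts after 2


# Assumed layout: "24-20160510.2" with an optional type suffix such as ".c".
_ID_RE = re.compile(r"^(\d+)-(\d{8})\.(\d+)(?:\.[a-z]+)?$")


def parse_compose_id(compose_id: str) -> ComposeId:
    """Split an ID into (release, date, index); the type suffix is ignored."""
    m = _ID_RE.match(compose_id)
    if m is None:
        raise ValueError(f"unrecognised compose ID: {compose_id!r}")
    release, date, index = m.groups()
    return ComposeId(int(release), date, int(index))


def sort_composes(ids):
    """Sort by release, then date, then numeric index."""
    return sorted(ids, key=parse_compose_id)


def digits_only_key(compose_id: str) -> int:
    """The naive approach: drop every non-digit and compare as one number."""
    return int(re.sub(r"\D", "", compose_id))


if __name__ == "__main__":
    ids = ["24-20160511.2.c", "24-20160510.10.c", "24-20160510.2.c"]
    print(sort_composes(ids))
    print(sorted(ids, key=digits_only_key))
```

If the ID ever became a fully opaque identifier, the same ordering would have to come from the compose metadata or PDC instead, and only `parse_compose_id` would need replacing.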
To take a concrete example: where I started with all this was updating wikitcms to work with Pungi 4 composes. I cheerfully started in on a plan to just change the validation page names a bit to include the 'new' compose IDs I originally proposed, or perform a fairly simple translation from the compose ID to a Wikitcms "Fedora 24 Final RC3"-type name. Then I started thinking harder, and it all got a bit...squishier...

My current planned approach is to treat compose IDs as an arbitrary identifier, and conceive part of wikitcms's role as being to analyze the essential properties of a compose and decide:

i) Is there a manual validation test event for this compose? If not, should there be, and...
ii) What do we call it?

To circle back to where I came in, ii) is by far the hardest part of this. =)

When you look at it in these terms, python-wikitcms and fedfind have been doing a kind of interesting job: constructing a system for naming composes - a 'compose ID' by any other name - and maintaining a mapping between compose names and release validation event names.

A property of Wikitcms' design that I had not previously considered is this: it expects there to be a predictable relationship between the compose name and the release validation event name. To put it simply - we expect to be able to parse the name of a release validation testing page, and from that information alone, identify and locate the compose it is for.

This hasn't been a problem so far, because I actually built Wikitcms and fedfind kinda backwards - their conception of how composes should be identified is actually derived directly from how we name release validation events. :P In A World Where my tools don't control the compose naming concept, and for all the reasons stated above we might want to give composes very uninformative 'names', this doesn't work any more.
So now I'm thinking about how and *where* to store the information "manual validation test event X is for compose Y" - both in our *current* (stupid) manual validation test system Wikitcms, and how we ought to do it in the *sane* manual validation test system we've been thinking about building lately. And in the latter case, how it should be done in such a way that it is extensible to cases where we might want to produce very *different* "test events" for different types of compose.

Fun stuff!
--
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net