Hi Sage,

Thanks for taking the time to write this overview of the release cycle tools and their evolution: I did not realize so much work was going on :-)

Cheers

On 30/07/2014 20:22, Sage Weil wrote:
> On Wed, 30 Jul 2014, Loic Dachary wrote:
>> Hi Sage,
>>
>> From my (biased) point of view, the upside is that it will give me more
>> time to complete the locally repairable code for Giant ;-). The downside
>> is that it puts a little less pressure on improving the tools and methods
>> that make rapid release cycles possible (i.e. unit tests, bug
>> tracking, patch acceptance workflow, package building/gitbuilder,
>> teuthology, pulpito, upgrade testing, ...). In a perfect world Ceph
>> could sustain a three month release cycle without inconveniencing
>> anyone. A longer release cycle (five or six months) would encourage even
>> more complex / bigger changes within a release cycle. It would also
>> probably encourage Ceph developers to forget about the release process
>> tools for two or three months and not improve them as they should be.
>>
>> IMHO the test cycle is significantly slowing down the release process
>> and a faster, more comprehensive test cycle would help a lot.
>
> No argument here. :)
>
> I should clarify that this is the "stable release cycle" for the named
> releases. I still think we should maintain a ~2 week "development release
> cycle" where we are continuously integrating changes and regularly putting
> out a usable release. The 'next' or 'last' branches should be recent and
> stable starting points for doing any new work so that the integration
> tests, when run, will reflect bugs in your code and not stuff that was
> already there. We've slipped a bit here (0.82 to 0.83 was 5 weeks); this
> is partly because the release process itself is still pretty expensive in
> terms of effort and we don't want to eat up more of Alfredo's and Sandon's
> time than we need to, but it is getting better.
>
> In any case, the real point of a longer "stable release cycle" is just
> that there are fewer stable releases in flight that we are backporting
> fixes to. In practice, having all of dumpling, emperor, and firefly
> outstanding hasn't worked particularly well (IMO). We backport to
> dumpling and firefly and urge people away from emperor to avoid the
> cognitive overhead of keeping track of another release. Going from 3 to 4
> months means only 3 stable releases per year, which I think is enough...?
>
>> Each commit should be unit / functional tested within seconds, locally
>> (see
>> https://github.com/ceph/ceph/blob/master/src/test/osd/types.cc#L1295 for
>> instance). It is usually more difficult to diagnose / fix a border case
>> when it is discovered during integration tests (i.e. teuthology) rather
>> than with a unit / functional test designed for it. Creating unit tests
>> is often problematic because some of the code base cannot be easily
>> isolated. With a continuous effort to re-arrange parts of the code to be
>> more test friendly, this can eventually be resolved.
>>
>> Every commit proposed to master should be run against the relevant
>> teuthology suite to help the reviewer. The problem here is that it
>> requires more resources than what Ceph currently has. Harvesting more
>> machines, and making it possible for people and organizations friendly to
>> Ceph to easily donate virtual machines, could probably help.
>
> Zack is making good progress on rejiggering the way that teuthology
> separates the core task locking and task runners from the tasks themselves
> (which get versioned along with the test suite for firefly, dumpling,
> etc.). This is all groundwork to enable the important bits, like pulling
> machine locking into a single, easy to deploy process, and plugging in
> different providers (in addition to bare metal and downburst) like
> OpenStack. The end goal is to make teuthology much easier to deploy in
> other environments.
> I'm hoping we can get to a place similar to OpenStack
> where organizations can hang their CI deployment off the 'upstream'
> build/CI infrastructure and supplement by running the same suites on
> different hardware or by adding their own test suites...
>
>> This deserves a separate discussion but I wanted to expand on what I
>> meant by "test cycle" and its impact on the release cycle.
>
> We had a discussion during the G/H CDS about doing an ephemeral
> 'integration' branch to group things together for full testing by the
> teuthology test suites that you probably caught. There was a follow-on
> internal discussion while you were gone on how to get this rolling and Sam
> is currently working on a tool to easily build an integration branch
> merging pending work on a nightly basis so that it can go through the tests
> before getting merged into master. I think this will help.
>
> We also have our first batch of new hardware ordered inside Red Hat
> (another ~130 machines) that will expand our testing throughput, and
> Sandon is working on reclaiming a lot of existing machines that aren't
> getting put to good use (burnupi) so that we can expand the size of the
> existing test pool.
>
> Alfredo recently did some background research on what other projects are
> doing for CI and releases, and he and Sandon have some work in flight to
> move some of the bursty release builds into OpenStack VMs. Unfortunately
> nobody has their full bandwidth allocated to improving the state of
> things, but I think we're making some slow progress.
>
> sage
>
>
>>
>> Cheers
>>
>> On 30/07/2014 05:11, Sage Weil wrote:
>>> We've talked a bit about moving to a ~4 month (instead of 3 month)
>>> cadence. I'm still inclined in this direction because it means fewer
>>> stable releases that we will be maintaining and a longer and (hopefully)
>>> more productive interval to do real work in between.
>>>
>>> The other key point is that we don't want a repeat of the firefly delay.
>>> I think we should stay as close to a train model as we can. If something
>>> isn't ready by freeze, let it wait for the next cycle. We shouldn't be
>>> cramming things in at the end, especially big things. As a general rule,
>>> big things should be merged early in the cycle so that we have lots of
>>> time to shake out the issues that only come out of lots of testing and
>>> aren't obvious from code review.
>>>
>>> Anyway, how about:
>>>
>>>           Freeze       Approx Release
>>>  Giant    Mon Sep 1    Mon Sep 29
>>>  Hammer   Mon Jan 4    Mon Feb 2
>>>
>>> That gives us another month for Giant, then September to shake out
>>> any issues. And then three full months before the Hammer freeze.
>>>
>>> What say ye?
>>> sage
>>>
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>>

--
Loïc Dachary, Artisan Logiciel Libre