Re: getting ready for jewel 10.2.1

On Wed, Mar 30, 2016 at 11:30 AM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
> Hi,
>
> Now is a good time to get ready for jewel 10.2.1 and I created http://tracker.ceph.com/issues/15317 for that purpose. The goal is to be able to run as many suites as possible on OpenStack, so that we do not have to wait days (sometimes a week) for runs to complete on Sepia. In the best case scenario, all OpenStack-specific problems are fixed by the time 10.2.1 is being prepared. In the worst case scenario, there is no time to fix issues and we keep using the sepia lab. I guess we'll end up somewhere in the middle: some suites will run fine on OpenStack and we'll use sepia for others.
>
> In a previous mail I voiced my concerns about the lack of interest from developers in teuthology job failures that are caused by variations in the infrastructure. I still have no clue how to convey my belief that it is important for teuthology jobs to succeed despite infrastructure variations. But instead of just giving up and doing nothing, I will work on that for the rados suite and hope things will evolve in a good way. To be honest, figuring out http://tracker.ceph.com/issues/15236 and seeing a good run of the rados suite on jewel as a result renewed my motivation in that area :-)
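
For reference, scheduling the rados suite mentioned above against jewel on
OpenStack rather than sepia boils down to a single scheduling command.  The
sketch below is only illustrative: the teuthology-openstack / teuthology-suite
entry points and the --ceph / --suite options are assumptions based on the
workflow described in this thread, not something spelled out here.

#!/usr/bin/env python
# Hedged sketch: try to schedule the rados suite against jewel on OpenStack,
# and fall back to scheduling it in the sepia lab if that fails.
import subprocess
import sys

def schedule(cmd):
    # Return True when the scheduling command exits successfully.
    print("scheduling: " + " ".join(cmd))
    try:
        return subprocess.call(cmd) == 0
    except OSError:
        # Entry point not installed on this machine.
        return False

openstack = ["teuthology-openstack", "--ceph", "jewel", "--suite", "rados"]
sepia = ["teuthology-suite", "--ceph", "jewel", "--suite", "rados"]

if not schedule(openstack):
    # Worst case scenario from the mail above: keep using the sepia lab.
    sys.exit(0 if schedule(sepia) else 1)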

If I were dedicating time to working on lab infrastructure, I think I
would prioritise stabilising the existing sepia lab.  I still see
infrastructure issues (these days usually package install failures)
sprinkled all over the place, so I have to question the value of
spreading ourselves even more thinly by trying to handle multiple
environments with their different quirks.

I have nothing against the OpenStack work; it is a good tool.  But I
don't think it was wise to just deploy it and expect other developers
to handle the issues.  I would have liked to see at least one passing
filesystem run on OpenStack before regular nightlies were scheduled on
it.  Maybe now that we have fixes for #13980 and #13876, we will see a
passing run and can get more of a sense of how stable/unstable these
tests are in the OpenStack environment: I think it's likely that we
will continue to see timeouts/instability from the comparatively
underpowered nodes.

John

> Cheers
>
> --
> Loïc Dachary, Artisan Logiciel Libre


