Re: Update about Autocloud deployment

On 11/07/16, Adam Williamson wrote:
> On Mon, 2016-07-11 at 04:05 +0000, Kushal Das wrote:
> > 
> > ## fedmsg-hub on the backends double enqueue
> > 
> > We had to restart the fedmsg-hubs a few times on the backend servers. In
> > between, we also found that the fedmsg-hub service was happily enqueueing
> > the jobs twice (for each compose), and then it got fixed automagically;
> > nothing was changed in our configuration or in code.
> > 
> > We are still not sure why this happened, but we are digging into it
> > further.
> 
> I had some issues with the openQA and check-compose consumers too,
> after upgrading to F24; after poking it a bit with Ralph we concluded
> the python-twisted then in stable was causing issues with fedmsg:
> fedmsg seemed to be doing the right thing, but twisted was eating
> messages and stuff. The twisted then in updates-testing (16.2.0-2.fc24)
> seemed to make things better, and it's now gone stable. So this might
> have got fixed by that update, if you updated the boxes.
> 
> > ## fedmsg-hub broke due to a faulty dependency
> > 
> > Even though we kept our code up and running for weeks, after the
> > production deployment we found that one of the dependencies (fedfind,
> > adamw is the upstream author) was broken by the Fedora Atomic image
> > names, causing our fedmsg-hub instances to go crazy. We informed
> > upstream, and got a quick hotfix deployed within a few hours of finding
> > the issue.
> > 
> > For the next release we will make sure to keep it running for longer on
> > our internal hardware with messages from production fedmsg. This
> > dependency failure was something we should have caught, but did not.
> 
> So a bit of background here... Pungi/productmd compose IDs look like this:
> 
> (DISTRONAME)-(RELEASE)-(DATE).(TYPE).(RESPIN)
> 
> e.g.: Fedora-24-20160711.n.0, where Fedora is the 'distro name', 24 is
> the release, 20160711 is the date, 'n' is the type (indicates
> 'nightly'), and 0 is the respin.
> 
> fedfind needs to parse all the bits out of the compose ID for various
> purposes, so I had some code for parsing compose IDs which naturally
> enough used the '-' separators to split the distro name from the
> release and the date. This worked fine up till recently, when releng
> started doing the 'two-week Atomic' composes - composes of the Atomic
> (and some Cloud) images for the latest stable release (so 24 at
> present) done nightly - in Pungi 4.
> 
> Unfortunately, when they did that, they decided to use 'Fedora-Atomic'
> as the 'distro name' for these composes. Their compose IDs look like
> this: Fedora-Atomic-24-20160711.0 (they use the 'production' compose
> type, where the 'type' identifier is omitted).
> 
> You can probably guess what a 'distro name' with a - in it does to a
> parser which is trying to split fields on that character :/
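
As a rough illustration of that failure mode (a minimal sketch only, not fedfind's actual code; the names naive_parse, COMPOSE_RE and parse_compose_id are hypothetical), a parser that splits on '-' breaks as soon as the distro name contains one, while a pattern anchored on the right-hand fields (release, 8-digit date, optional type, respin) copes with both compose IDs quoted above:

```python
import re


def naive_parse(compose_id):
    """Hypothetical naive parser: assumes exactly two '-' separators."""
    distro, release, rest = compose_id.split("-")
    return distro, release, rest


# Hypothetical pattern: the distro name is whatever precedes the last
# '-(RELEASE)-(DATE)' pair, so it may itself contain '-'.
COMPOSE_RE = re.compile(
    r"^(?P<distro>.+)-(?P<release>[^-]+)-(?P<date>\d{8})"
    r"(?:\.(?P<type>[a-z]+))?\.(?P<respin>\d+)$"
)


def parse_compose_id(compose_id):
    """Return the compose ID fields as a dict, or raise ValueError."""
    match = COMPOSE_RE.match(compose_id)
    if match is None:
        raise ValueError("unrecognised compose ID: %r" % compose_id)
    fields = match.groupdict()
    # 'production' composes omit the type identifier entirely.
    fields["type"] = fields["type"] or "production"
    return fields


if __name__ == "__main__":
    for cid in ("Fedora-24-20160711.n.0", "Fedora-Atomic-24-20160711.0"):
        try:
            print("naive:   ", naive_parse(cid))
        except ValueError as err:
            print("naive:    failed on", cid, "-", err)
        print("anchored:", parse_compose_id(cid))
```

The sketch assumes the date is always eight digits and the release never contains a '-'; neither assumption is confirmed by the thread.
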
> 
> I actually knew about this weeks ago, but I thought I knew all the
> important users of fedfind and it wasn't really causing any fatal
> consequences for any of them (because none of the others actually need
> to do anything with those Fedora-Atomic composes, so the fact that they
> all got completely confused by such composes and refused to do anything
> with them was just fine), so it wasn't a big priority for me to fix it.
> I didn't realize you were using this codepath in fedfind for the
> autocloud test triggers, so sorry about that!
> 
You helped us get the hotfix ready, and also made a new release of
the upstream package. That was an amazing help, thank you once again for
that :)

Kushal
-- 
Fedora Cloud Engineer
CPython Core Developer
https://kushaldas.in
https://dgplug.org