Re: ESC meeting minutes: 2024-03-07

Thorsten Behrens <thb@xxxxxxxxxxxxxxx> · Mon, 11 Mar 2024 15:46:51 +0100

Hi y'all,

Miklos Vajna wrote:
> On Fri, Mar 08, 2024 at 12:46:47PM +0100, Stephan Bergmann <stephan.bergmann@xxxxxxxxxxxxx> wrote:
> > Is there any documentation, or anybody able to explain at an adequate level,
> > what is taken into consideration when making those decisions?
> > 
>
Please see https://baolef.github.io/libreoffice-ci/

> > Wondering when I see Gerrit changes like
> > <https://gerrit.libreoffice.org/c/core/+/164554> "Add Embind'ing of UNO Any
> > getter for interfaces", which didn't touch any file that would actually be
> > used by any of the <https://ci.libreoffice.org/job/gerrit_master_ml/>
> > builds, nevertheless getting channeled through the sequential build.
> 
> My understanding is that it simply looks at what files are touched by
> the gerrit change, has knowledge of what was the 'touched files -> build
> result' connection in the past and tries to guess based on that.
> 
Almost. There's code that runs over a Gerrit commit JSON export and
extracts 'features' from every commit
(https://github.com/baolef/libreoffice-ci/blob/data/dataset/mining.py). It
actually looks at much more than just the files touched (though we
did exclude committer/author names, for obvious reasons). Those
feature vectors, combined with the historical CI results of the same
commits, are then used to train a machine learning model.
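
To make that a bit more concrete, here is a rough sketch of the general
idea. It is not the actual mining.py code: the commit field names
('files', 'insertions', 'deletions'), the chosen features, and the toy
history are all made up for illustration, and a tiny logistic
regression stands in for the real model.

```python
import math
from collections import Counter


def extract_features(commit):
    """Turn one commit record into a numeric feature dict.

    The field names used here are hypothetical, not the schema of the
    real Gerrit export.
    """
    files = commit["files"]
    feats = Counter()
    feats["n_files"] = len(files)
    # Log-scale the change size so huge commits do not dominate.
    feats["log_lines"] = math.log1p(commit["insertions"] + commit["deletions"])
    for path in files:
        # Top-level directory and file extension as crude count features.
        feats["dir:" + path.split("/", 1)[0]] += 1
        ext = path.rsplit(".", 1)[-1] if "." in path else "none"
        feats["ext:" + ext] += 1
    return dict(feats)


def sigmoid(z):
    # Numerically stable form: only ever exponentiates a non-positive value.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(weights, bias, feats):
    return bias + sum(weights.get(k, 0.0) * v for k, v in feats.items())


def train(samples, epochs=200, lr=0.1):
    """Fit a logistic regression with plain stochastic gradient descent.

    samples is a list of (feature dict, label) pairs, label 1 meaning
    the CI build failed for that commit.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, failed in samples:
            err = sigmoid(score(weights, bias, feats)) - failed
            bias -= lr * err
            for k, v in feats.items():
                weights[k] = weights.get(k, 0.0) - lr * err * v
    return weights, bias


def predict_failure(model, feats):
    """Estimated likelihood (0..1) that CI fails for these features."""
    weights, bias = model
    return sigmoid(score(weights, bias, feats))


# Toy history: (commit record, did CI fail?). Entirely invented data.
history = [
    ({"files": ["sw/source/a.cxx"], "insertions": 10, "deletions": 2}, 1),
    ({"files": ["static/README.md"], "insertions": 3, "deletions": 0}, 0),
    ({"files": ["sw/qa/b.cxx", "sw/inc/c.hxx"], "insertions": 40, "deletions": 5}, 1),
    ({"files": ["static/emscripten/uno.js"], "insertions": 8, "deletions": 1}, 0),
]
model = train([(extract_features(c), failed) for c, failed in history])
```

The point is only that the model outputs a likelihood learned from past
'features -> CI result' pairs; nothing in it knows which files a given
build actually compiles.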

There's no clear causality here, as with any large neural network
training, just statistics and likelihoods. When we ran this project
last year, we anticipated that the training would need regular
re-runs, since code, tests, and CI behaviour all drift over time.

Cheers,

-- Thorsten