GSoC'23 -- Week 10

Baole Fang <baole.fang@xxxxxxxxx> · Thu, 10 Aug 2023 11:22:21 -0400

---
title: "Week 10"
date: 2023-08-03
---

### Smart inference

Previously, Jenkins used only the [testfailure](https://github.com/baolef/libreoffice-ci/blob/main/models/testfailure.py) results to decide whether a patch would pass or fail. Since testfailure alone is not very accurate, whereas [testselect](https://github.com/baolef/libreoffice-ci/blob/main/models/testselect.py) is accurate, a better algorithm that also uses testselect's predictions now decides whether to pass or fail a patch.

[testoverall](https://github.com/baolef/libreoffice-ci/blob/main/models/testoverall.py) is proposed to integrate [testselect's](https://github.com/baolef/libreoffice-ci/blob/main/models/testselect.py) predictions into [testfailure](https://github.com/baolef/libreoffice-ci/blob/main/models/testfailure.py). Compared to testfailure, its failure recall increases significantly, from 54% to 71%, while its pass recall drops slightly, from 70% to 65%. Since failure recall is much more important than pass recall, the new model is a clear improvement overall.
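For readers unfamiliar with the metric, per-class recall is the fraction of patches that truly belong to a class and that the model also assigns to that class. The sketch below is only an illustration of that definition: the function name `recall` and the toy labels are made up for this example and are not taken from the project's evaluation code.

```python
from typing import Sequence


def recall(actual: Sequence[str], predicted: Sequence[str], label: str) -> float:
    """Fraction of items whose true class is `label` that were predicted as `label`."""
    # Keep only the predictions for items that truly belong to `label`.
    relevant = [p for a, p in zip(actual, predicted) if a == label]
    if not relevant:
        raise ValueError(f"no items with true label {label!r}")
    return sum(1 for p in relevant if p == label) / len(relevant)


# Toy data (hypothetical, not project results): three failing and two passing patches.
actual = ["fail", "fail", "fail", "pass", "pass"]
predicted = ["fail", "fail", "pass", "pass", "fail"]

failure_recall = recall(actual, predicted, "fail")  # 2 of 3 failures caught
pass_recall = recall(actual, predicted, "pass")     # 1 of 2 passes recognised
```

A missed failure means a broken patch slips through the cheaper path, which is why failure recall is weighted more heavily than pass recall here.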

Because of [testoverall](https://github.com/baolef/libreoffice-ci/blob/main/models/testoverall.py)'s better performance, it replaces [testfailure](https://github.com/baolef/libreoffice-ci/blob/main/models/testfailure.py) in inference.

In addition, a new condition is added to the pass/fail decision. Originally, inference only checked whether the overall failure probability had reached a threshold (0.4). Now the number of unit tests predicted to fail is also counted: if it reaches a threshold (10), the patch is likewise considered failing. With the improved algorithm, inference recalls 91% of failures while reducing computation by 57%.
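The combined rule can be sketched as follows. This is a minimal illustration, not the project's actual inference code: the function name, its arguments, and the per-test cutoff of 0.5 used to count a unit test as "predicted to fail" are assumptions. Only the two thresholds (0.4 and 10) come from the description above.

```python
from typing import Sequence

FAILURE_PROB_THRESHOLD = 0.4      # overall failure probability threshold (from the post)
FAILED_TEST_COUNT_THRESHOLD = 10  # number of failing unit tests threshold (from the post)
PER_TEST_CUTOFF = 0.5             # assumption: a test counts as failing at or above this


def predict_patch_fails(
    overall_failure_prob: float,
    test_failure_probs: Sequence[float],
    per_test_cutoff: float = PER_TEST_CUTOFF,
) -> bool:
    """Return True if the patch is predicted to fail under either condition."""
    # Condition 1: the overall failure probability reaches its threshold.
    if overall_failure_prob >= FAILURE_PROB_THRESHOLD:
        return True
    # Condition 2: enough individual unit tests are predicted to fail.
    predicted_failed_tests = sum(1 for p in test_failure_probs if p >= per_test_cutoff)
    return predicted_failed_tests >= FAILED_TEST_COUNT_THRESHOLD
```

Combining the two conditions with a logical OR means a patch is flagged when either signal fires, which favours failure recall over pass recall.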

### Jenkins integration

Currently, the model is integrated into the Jenkins job [gerrit_master_ml](https://ci.libreoffice.org/job/gerrit_master_ml/). It first runs the machine learning model to predict whether the patch will pass or fail. If the patch is likely to fail, the [fast track](https://ci.libreoffice.org/job/gerrit_master_seq/) is run. If it is likely to pass, the [normal build](https://ci.libreoffice.org/job/gerrit_master/) is run.
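The routing step can be pictured with the small sketch below. It assumes that a predicted failure goes to the fast track and a predicted pass goes to the normal build. The job names are taken from the links above, while the function `choose_job` is hypothetical: how gerrit_master_ml actually triggers the downstream jobs is not shown in this post.

```python
FAST_TRACK_JOB = "gerrit_master_seq"  # fast track job name, from the link above
NORMAL_BUILD_JOB = "gerrit_master"    # normal build job name, from the link above


def choose_job(predicted_to_fail: bool) -> str:
    """Pick the downstream Jenkins job name from the model's prediction."""
    # Assumption: predicted failures take the fast track, everything else the normal build.
    return FAST_TRACK_JOB if predicted_to_fail else NORMAL_BUILD_JOB
```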