GSoC'23 - Week 3 Report

---
title: "Week 3"
date: 2023-06-15
---

### Model training

A basic [model training](https://github.com/baolef/libreoffice-ci/blob/main/train.py) pipeline is complete, using the [testselect](https://github.com/baolef/libreoffice-ci/blob/main/models/testselect.py) model. Further optimization is needed to reduce memory and time costs and to improve model performance.

Currently, because of memory cost, [testselect](https://github.com/baolef/libreoffice-ci/blob/main/models/testselect.py) is trained on a subset of size 16384 (covering both the training and testing sets) out of the full dataset of size 122019. It reaches a failure recall of 91.4% while saving 90% of the unit test computational cost. The detailed confusion matrix is shown below:

|               | Fail (Predicted) | Pass (Predicted) |
|---------------|------------------|------------------|
| Fail (Actual) | 480              | 45               |
| Pass (Actual) | 556910           | 5045893          |
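
As a sanity check, the two headline figures can be recomputed from the confusion matrix. The sketch below is not part of the training pipeline, and the function names are hypothetical. It assumes that the cost saving is the fraction of test runs predicted to pass (and therefore skipped), and that every test run costs the same.

```python
# Counts copied from the confusion matrix above.
TRUE_FAIL = 480         # actual fail, predicted fail
MISSED_FAIL = 45        # actual fail, predicted pass
FALSE_ALARM = 556910    # actual pass, predicted fail
TRUE_PASS = 5045893     # actual pass, predicted pass


def failure_recall(true_fail: int, missed_fail: int) -> float:
    """Share of actual failures that the model flags as failures."""
    return true_fail / (true_fail + missed_fail)


def cost_saving(true_fail: int, missed_fail: int,
                false_alarm: int, true_pass: int) -> float:
    """Share of test runs predicted to pass, i.e. runs that can be skipped.

    Assumption: every test run has the same cost.
    """
    skipped = missed_fail + true_pass
    total = true_fail + missed_fail + false_alarm + true_pass
    return skipped / total


recall = failure_recall(TRUE_FAIL, MISSED_FAIL)
saving = cost_saving(TRUE_FAIL, MISSED_FAIL, FALSE_ALARM, TRUE_PASS)
print(f"failure recall: {recall:.1%}")
print(f"cost saving:    {saving:.1%}")
```

Under these assumptions the script gives a recall of about 91.4% and a saving of about 90%, matching the numbers reported above.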
