Re: 18.2.4 regression: 'diskprediction_local' has failed: No module named 'sklearn'

Hello,

On 2024-07-25 16:39, Harry G Coin wrote:
> Upgraded to 18.2.4 yesterday.  Healthy cluster reported a few minutes after the upgrade completed.  Next morning, this:
>
> # ceph health detail
> HEALTH_ERR Module 'diskprediction_local' has failed: No module named 'sklearn'
> [ERR] MGR_MODULE_ERROR: Module 'diskprediction_local' has failed: No module named 'sklearn'
>     Module 'diskprediction_local' has failed: No module named 'sklearn'
>
> Searching found this was a problem several years ago, then resolved, now returned.

We encountered the same problem after an upgrade on our cluster, and I dug into it a bit. It appears that [0] was the fix for the missing sklearn package back in 2021. That fix seems to have been tied specifically to CentOS 8.

Now that the container images are built on CentOS 9, the relevant Dockerfile no longer applies the fix, because it only does so when the OS version is CentOS 8. I wonder why it was done this way.

The problem on CentOS 9 seems to be known to the ceph-container maintainers; see, for example, [1].
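
To confirm whether a given container image is affected, one can check whether the module named in the health error is importable by the Python interpreter inside that image. The following is only a minimal sketch: the helper name module_available is made up for illustration, and it assumes the mgr loads its modules with the same Python 3 interpreter that runs the script.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it.

    Hypothetical helper for illustration; not part of Ceph.
    """
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # 'sklearn' is the module named in the MGR_MODULE_ERROR message.
    name = "sklearn"
    status = "found" if module_available(name) else "MISSING"
    print(f"{name}: {status}")
```

If this reports the module as missing when run inside the mgr container, that would be consistent with the image having been built without the sklearn package. How to get a shell in that container depends on the deployment.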

[0] https://github.com/ceph/ceph-container/pull/1821/files
[1] https://github.com/ceph/ceph-container/blob/main/ceph-releases/ALL/centos/9/daemon-base/README.tmp

Best regards,
Rouven