Hi,
This morning we upgraded a Pacific Ceph cluster to the latest Quincy version.
The cluster was healthy before the upgrade, and everything was done
according to the (non-cephadm) upgrade procedure [1]. All services
restarted correctly, but the filesystem switched to read-only mode when
it became active.

    HEALTH_WARN 1 MDSs are read only
    [WRN] MDS_READ_ONLY: 1 MDSs are read only
        mds.cccephadm32(mds.0): MDS in read-only mode

This is the only warning we got on the cluster.
In the MDS log, the error "failed to commit dir 0x1 object, errno -22"
seems to be the root cause:

    2022-11-23T12:41:09.843+0100 7f930f56d700 -1 log_channel(cluster) log [ERR] : failed to commit dir 0x1 object, errno -22
    2022-11-23T12:41:09.843+0100 7f930f56d700 -1 mds.0.11963 unhandled write error (22) Invalid argument, force readonly...
    2022-11-23T12:41:09.843+0100 7f930f56d700  1 mds.0.cache force file system read-only
    2022-11-23T12:41:09.843+0100 7f930f56d700  0 log_channel(cluster) log [WRN] : force file system read-only
    2022-11-23T12:41:09.843+0100 7f930f56d700 10 mds.0.server force_clients_readonly

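For reference, errno -22 is the negated form of errno 22, which the second log line reports as "Invalid argument". Here is a minimal Python sketch to decode such a value. The helper name decode_errno is only illustrative and is not part of any Ceph tooling.

```python
import errno
import os


def decode_errno(code: int) -> tuple:
    """Return (symbolic name, message) for a possibly negated errno value."""
    # Return codes in logs like the one above are often negated errno values.
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


# Decode the value seen in the MDS log.
print(decode_errno(-22))
```

This only names the error code; it says nothing about why the write of the dir 0x1 object was rejected.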
Raising the MDS debug level with "ceph config set mds.x debug_mds 20"
did not give me any more information.

    ceph fs status
    cephfs - 17 clients
    ======
    RANK  STATE      MDS         ACTIVITY     DNS    INOS   DIRS   CAPS
     0    active  cccephadm32  Reqs:    0 /s  12.9k  12.8k   673   1538
          POOL         TYPE     USED  AVAIL
    cephfs_metadata  metadata   513G  48.6T
      cephfs_data      data    2558M  48.6T
      cephfs_data2     data     471G  48.6T
      cephfs_data3     data     433G  48.6T
    STANDBY MDS
    cccephadm30
    cccephadm31
    MDS version: ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)

Any idea what could have gone wrong and how to fix it before we start a
disaster recovery procedure?
Cheers,
Adrien
[1] https://ceph.com/en/news/blog/2022/v17-2-0-quincy-released/#upgrading-non-cephadm-clusters