Re: ceph filesystem stuck in read only

Hi Ramana and thank you, 

Yes, before the MDS's host reboot the filesystem was read+write and the
cluster was fine too. We haven't made any upgrades since the cluster
was installed.
Some time ago I had to rebuild 6 OSDs due to a start failure at boot
time; no more trouble since then.

_What are the outputs of `ceph fs status` and `ceph fs dump`?_
root@node3-4:~# ceph fs status
cephfs-ssdrep - 12 clients
=============
RANK  STATE      MDS        ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  node3-5  Reqs:    0 /s  26.7k  26.7k  6652     36
 1    active   node2-5  Reqs:    0 /s  21.3k  11.0k  1348      5
         POOL             TYPE     USED  AVAIL
cephfs_ssdrep_metadata  metadata   147G  8533G
  cephfs_ssdrep_data      data    1089G  8533G
cephfs-hdd - 14 clients
==========
RANK  STATE     MDS        ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  node2-3  Reqs:   15 /s  1375k  1375k  6867   1092k
        POOL           TYPE     USED  AVAIL
cephfs_hdd_metadata  metadata  21.7G  8533G
  cephfs_hdd_data      data    1484G  3231G
STANDBY MDS
 node3-3
  node2-4
 node3-4
MDS version: ceph version 16.2.9
(a569859f5e07da0c4c39da81d5fb5675cd95da49) pacific (stable) 

and `ceph fs dump`:
root@node3-4:~# ceph fs dump 
e21896
enable_multiple, ever_enabled_multiple: 1,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client
writeable ranges,3=default file layouts on dirs,4=dir inode in separate
object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no
anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: -1

Filesystem 'cephfs-ssdrep' (6)
fs_name    cephfs-ssdrep
epoch    21896
flags    12
created    2022-08-04T10:26:26.821650+0200
modified    2022-11-07T13:45:37.711273+0100
tableserver    0
root    0
session_timeout    60
session_autoclose    300
max_file_size    1099511627776
required_client_features    {}
last_failure    0
last_failure_osd_epoch    917368
compat    compat={},rocompat={},incompat={1=base v0.20,2=client
writeable ranges,3=default file layouts on dirs,4=dir inode in separate
object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds
uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds    2
in    0,1
up    {0=40733945,1=38718387}
failed    
damaged    
stopped    
data_pools    [28]
metadata_pool    29
inline_data    disabled
balancer    
standby_count_wanted    1
[mds.node3-5{0:40733945} state up:active seq 71 join_fscid=6 addr
[v2:192.168.33.13:6800/976767838,v1:192.168.33.13:6801/976767838] compat
{c=[1],r=[1],i=[7ff]}]
[mds.node2-5{1:38718387} state up:active seq 9 join_fscid=6 addr
[v2:192.168.32.13:6800/155458907,v1:192.168.32.13:6801/155458907] compat
{c=[1],r=[1],i=[7ff]}]

Filesystem 'cephfs-hdd' (8)
fs_name    cephfs-hdd
epoch    21767
flags    12
created    2022-10-25T14:05:21.065421+0200
modified    2022-11-07T08:38:14.283567+0100
tableserver    0
root    0
session_timeout    60
session_autoclose    300
max_file_size    1099511627776
required_client_features    {}
last_failure    0
last_failure_osd_epoch    0
compat    compat={},rocompat={},incompat={1=base v0.20,2=client
writeable ranges,3=default file layouts on dirs,4=dir inode in separate
object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds
uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds    1
in    0
up    {0=32528084}
failed    
damaged    
stopped    1
data_pools    [40]
metadata_pool    41
inline_data    disabled
balancer    
standby_count_wanted    1
[mds.node2-3{0:32528084} state up:active seq 276773 join_fscid=8 addr
[v2:192.168.32.10:6840/1960412605,v1:192.168.32.10:6841/1960412605]
compat {c=[1],r=[1],i=[7ff]}]

Standby daemons:

[mds.node3-3{-1:38291353} state up:standby seq 1 join_fscid=8 addr
[v2:192.168.33.10:6800/2462925236,v1:192.168.33.10:6801/2462925236]
compat {c=[1],r=[1],i=[7ff]}]
[mds.node2-4{-1:38315566} state up:standby seq 1 addr
[v2:192.168.32.12:6800/1553911071,v1:192.168.32.12:6801/1553911071]
compat {c=[1],r=[1],i=[7ff]}]
[mds.node3-4{-1:40440312} state up:standby seq 1 addr
[v2:192.168.33.12:6800/706792986,v1:192.168.33.12:6801/706792986] compat
{c=[1],r=[1],i=[7ff]}] 

I raised the log verbosity as you recommended and restarted the MDS. It
still falls into a "read only" state, and I'm now trying to sort out
what is relevant and what is not (or I could send everything through
the ceph-post-file tool).
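
In case it is useful, this is roughly how I raised the verbosity,
following your example (MDS ID taken from the `ceph fs status` output
above):

$ ceph config set mds.node3-5 debug_mds 20
$ ceph config set mds.node3-5 debug_objecter 20

Once the read-only event is captured I plan to reset the levels and, if
needed, upload the full log with ceph-post-file (the log path below is
the default for a package-based install and may differ on a
containerized deployment):

$ ceph config rm mds.node3-5 debug_mds
$ ceph config rm mds.node3-5 debug_objecter
$ ceph-post-file -d "cephfs-ssdrep MDS forced read-only" /var/log/ceph/ceph-mds.node3-5.log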

Thanks again for the help.

---

RÉMI GALZIN
Linux System Administrator

57 boulevard Malesherbes, 75008 Paris
Phone: 0173503380
Email: rgalzin@xxxxxxxxxx

On 2022-11-04 23:10, Ramana Krisna Venkatesh Raja wrote:

> On Fri, Nov 4, 2022 at 9:36 AM Galzin Rémi <rgalzin@xxxxxxxxxx> wrote: 
> 
>> Hi,
>> I'm looking for some help/ideas/advice to solve the problem that
>> occurs on my metadata server after the server reboot.
> 
> You rebooted an MDS's host and your file system became read-only? Was
> the Ceph cluster healthy before the reboot? Any issues with the MDSs
> or OSDs? Did this happen after an upgrade?
> 
>> "Ceph status" warns about my MDS being "read only", but the
>> filesystem and the data seem healthy.
>> It is still possible to access the content of my cephfs volumes since
>> it's read only, but I don't know how to make my filesystem writable
>> again.
>>
>> The logs keep showing the same error when I restart the MDS server:
>> 
>> 2022-11-04T11:50:14.506+0100 7fbbf83c2700  1 mds.0.6872 handle_mds_map
>> state change up:reconnect --> up:rejoin
>> 2022-11-04T11:50:14.510+0100 7fbbf83c2700  1 mds.0.6872 rejoin_start
>> 2022-11-04T11:50:14.510+0100 7fbbf83c2700  1 mds.0.6872
>> rejoin_joint_start
>> 2022-11-04T11:50:14.702+0100 7fbbf83c2700  1 mds.0.6872 rejoin_done
>> 2022-11-04T11:50:15.546+0100 7fbbf83c2700  1 mds.node3-5 Updating MDS
>> map to version 6881 from mon.3
>> 2022-11-04T11:50:15.546+0100 7fbbf83c2700  1 mds.0.6872 handle_mds_map i
>> am now mds.0.6872
>> 2022-11-04T11:50:15.546+0100 7fbbf83c2700  1 mds.0.6872 handle_mds_map
>> state change up:rejoin --> up:active
>> 2022-11-04T11:50:15.546+0100 7fbbf83c2700  1 mds.0.6872 recovery_done --
>> successful recovery!
>> 2022-11-04T11:50:15.550+0100 7fbbf83c2700  1 mds.0.6872 active_start
>> 2022-11-04T11:50:15.558+0100 7fbbf83c2700  1 mds.0.6872 cluster
>> recovered.
>> 2022-11-04T11:50:18.190+0100 7fbbf5bbd700 -1 mds.pinger is_rank_lagging:
>> rank=0 was never sent ping request.
>> 2022-11-04T11:50:18.190+0100 7fbbf5bbd700 -1 mds.pinger is_rank_lagging:
>> rank=1 was never sent ping request.
>> 2022-11-04T11:50:18.554+0100 7fbbf23b6700  1
>> mds.0.cache.dir(0x1000006cf14) commit error -22 v 1933183
>> 2022-11-04T11:50:18.554+0100 7fbbf23b6700 -1 log_channel(cluster) log
>> [ERR] : failed to commit dir 0x1000006cf14 object, errno -22
>> 2022-11-04T11:50:18.554+0100 7fbbf23b6700 -1 mds.0.6872 unhandled write
>> error (22) Invalid argument, force readonly...
>> 2022-11-04T11:50:18.554+0100 7fbbf23b6700  1 mds.0.cache force file
>> system read-only
> 
> The MDS is unable to write a metadata object to the OSD.  Set
> debug_mds=20 and debug_objecter=20 for the MDS, and capture the MDS
> logs when this happens for more details.
> e.g.,
> $ ceph config set mds.<your-MDS-ID> debug_mds 20
> 
> Also, check the OSD logs when you're hitting this issue.
> 
> You can then reset the MDS log level.  You can share the relevant MDS
> and OSD logs using,
> https://docs.ceph.com/en/pacific/man/8/ceph-post-file/
> 
>> 2022-11-04T11:50:18.554+0100 7fbbf23b6700  0 log_channel(cluster) log
>> [WRN] : force file system read-only
>> 
>> More info:
>> 
>> cluster:
>> id:     f36b996f-221d-4bcb-834b-19fc20bcad6b
>> health: HEALTH_WARN
>> 1 MDSs are read only
>> 1 MDSs behind on trimming
>> 
>> services:
>> mon: 5 daemons, quorum node2-4,node2-5,node3-4,node3-5,node1-1 (age
>> 22h)
>> mgr: node2-4(active, since 28h), standbys: node2-5, node3-4,
>> node3-5, node1-1
>> mds: 3/3 daemons up, 3 standby
>> osd: 112 osds: 112 up (since 22h), 112 in (since 2w)
>> 
>> data:
>> volumes: 2/2 healthy
>> pools:   12 pools, 529 pgs
>> objects: 8.54M objects, 1.9 TiB
>> usage:   7.8 TiB used, 38 TiB / 46 TiB avail
>> pgs:     491 active+clean
>> 29  active+clean+snaptrim
>> 9   active+clean+snaptrim_wait
>> 
>> All MDSs, MONs and OSDs are in version 16.2.9.
> 
> What are the outputs of `ceph fs status` and `ceph fs dump`?
> 
> -Ramana
> 
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> 
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
 

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



