MDS daemons crash

Hello all! Hope that somebody can help us.

 

The initial point: a Ceph cluster v15.2 (installed and managed by Proxmox) with 3 nodes based on physical servers rented from a cloud provider. Volumes are provided by Ceph via both CephFS and RBD. We run 2 MDS daemons with max_mds=1, so one daemon was active and the other was on standby.

On Thursday some of the applications stopped working. The investigation made it clear that we had a problem with Ceph, more precisely with CephFS – both MDS daemons had suddenly crashed. We tried to restart them and found that they crashed again immediately after starting. The crash information:

 

2024-04-17T17:47:42.841+0000 7f959ced9700  1 mds.0.29134 recovery_done -- successful recovery!

2024-04-17T17:47:42.853+0000 7f959ced9700  1 mds.0.29134 active_start

2024-04-17T17:47:42.881+0000 7f959ced9700  1 mds.0.29134 cluster recovered.

2024-04-17T17:47:43.825+0000 7f959aed5700 -1 ./src/mds/OpenFileTable.cc: In function 'void OpenFileTable::commit(MDSContext*, uint64_t, int)' thread 7f959aed5700 time 2024-04-17T17:47:43.831243+0000

./src/mds/OpenFileTable.cc: 549: FAILED ceph_assert(count > 0)

 

Over the next hours we read tons of articles, studied the documentation, and checked the cluster status in general with various diagnostic commands – but didn’t find anything wrong. In the evening we decided to upgrade the Ceph cluster; we upgraded it to v16 and finally to v17.2.7. Unfortunately, that didn’t solve the problem, and the MDS daemons continue to crash with the same error. The only difference we found is the “1 MDSs report damaged metadata” warning in the output of ceph -s – see it below.
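
For reference, the damage entries behind that warning can be listed via ceph tell; this is a sketch, assuming the MDS name asrv-dev-stor-1 from the fs dump below:

# List the metadata damage records held by the MDS rank
ceph tell mds.asrv-dev-stor-1 damage ls

# After a record has been repaired, it can be cleared by its id
ceph tell mds.asrv-dev-stor-1 damage rm <damage_id>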

 

I supposed that it might be a well-known bug, but couldn’t find the same one on https://tracker.ceph.com - there are several bugs associated with the file OpenFileTable.cc, but none related to ceph_assert(count > 0).
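
(In case it helps with the tracker search: the full backtraces of the recent crashes are kept by the crash module, so they can be pulled out with something like the following.)

# List the crash reports collected by the cluster
ceph crash ls

# Show the full metadata and backtrace of a single report
ceph crash info <crash_id>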

We also checked the source code of OpenFileTable.cc; here is a fragment of it, from the function OpenFileTable::_journal_finish:

      int omap_idx = anchor.omap_idx;

      unsigned& count = omap_num_items.at(omap_idx);

      ceph_assert(count > 0);

 

So we guess that the object map is empty for some object while the MDS expects it to contain items, which is unexpected behaviour. But again, we found nothing wrong in our cluster…
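
As far as we understand, the open file table is stored in objects named mds<rank>_openfiles.<n> in the metadata pool, so their omap contents can be inspected directly with rados; a sketch, assuming rank 0 and a metadata pool named cephfs_metadata (ours is pool id 6, the name may differ):

# Find the OpenFileTable objects for rank 0
rados -p cephfs_metadata ls | grep '^mds0_openfiles'

# Dump the omap header and count the omap keys of the first one
rados -p cephfs_metadata getomapheader mds0_openfiles.0
rados -p cephfs_metadata listomapkeys mds0_openfiles.0 | wc -l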

 

Next, we started with the https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/ article – we tried to reset the journal (despite the fact that it was OK all the time) and to wipe the sessions using the cephfs-table-tool all reset session command. No result…
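
For reference, the documented sequence for that step looks roughly like this (backup.bin is just a placeholder path):

# Back up the journal before touching it
cephfs-journal-tool --rank=cephfs:0 journal export backup.bin

# Flush what is recoverable into the backing store, then reset the journal
cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
cephfs-journal-tool --rank=cephfs:0 journal reset

# Wipe the session table
cephfs-table-tool all reset session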

Now I have decided to continue following this article and run the cephfs-data-scan scan_extents command. We started it on Friday, but it is still running (2 of 3 workers have finished, so I’m waiting for the last one; maybe I need more workers for the next command, cephfs-data-scan scan_inodes, which I plan to run). But I doubt that it will solve the issue because, again, we guess that the problem is not with our objects in Ceph but with the metadata only…
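
In case more workers do help, the parallel form of those scans from the same article looks like this (a sketch with 4 workers and a placeholder data pool name):

# scan_extents with 4 workers: run one command per worker, in parallel
cephfs-data-scan scan_extents --worker_n 0 --worker_m 4 <data pool>
cephfs-data-scan scan_extents --worker_n 1 --worker_m 4 <data pool>
cephfs-data-scan scan_extents --worker_n 2 --worker_m 4 <data pool>
cephfs-data-scan scan_extents --worker_n 3 --worker_m 4 <data pool>

# Then the same pattern for scan_inodes, followed by a single scan_links
cephfs-data-scan scan_inodes --worker_n 0 --worker_m 4 <data pool>
cephfs-data-scan scan_links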

 

Is this a new bug, or something else? What else should we try to get our MDS daemons running? Any idea is welcome!

 

The important outputs:

ceph -s

  cluster:

    id:     4cd1c477-c8d0-4855-a1f1-cb71d89427ed

    health: HEALTH_ERR

            1 MDSs report damaged metadata

            insufficient standby MDS daemons available

            83 daemons have recently crashed

            3 mgr modules have recently crashed

 

  services:

    mon: 3 daemons, quorum asrv-dev-stor-2,asrv-dev-stor-3,asrv-dev-stor-1 (age 22h)

    mgr: asrv-dev-stor-2(active, since 22h), standbys: asrv-dev-stor-1

    mds: 1/1 daemons up

    osd: 18 osds: 18 up (since 22h), 18 in (since 29h)

 

  data:

    volumes: 1/1 healthy

    pools:   5 pools, 289 pgs

    objects: 29.72M objects, 5.6 TiB

    usage:   21 TiB used, 47 TiB / 68 TiB avail

    pgs:     287 active+clean

             2   active+clean+scrubbing+deep

 

  io:

    client:   2.5 KiB/s rd, 172 KiB/s wr, 261 op/s rd, 195 op/s wr

 

ceph fs dump

e29480

enable_multiple, ever_enabled_multiple: 0,1

default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}

legacy client fscid: 1

 

Filesystem 'cephfs' (1)

fs_name cephfs

epoch   29480

flags   12 joinable allow_snaps allow_multimds_snaps

created 2022-11-25T15:56:08.507407+0000

modified        2024-04-18T16:52:29.970504+0000

tableserver     0

root    0

session_timeout 60

session_autoclose       300

max_file_size   1099511627776

required_client_features        {}

last_failure    0

last_failure_osd_epoch  14728

compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}

max_mds 1

in      0

up      {0=156636152}

failed

damaged

stopped

data_pools      [5]

metadata_pool   6

inline_data     disabled

balancer

standby_count_wanted    1

[mds.asrv-dev-stor-1{0:156636152} state up:active seq 6 laggy since 2024-04-18T16:52:29.970479+0000 addr [v2:172.22.2.91:6800/2487054023,v1:172.22.2.91:6801/2487054023] compat {c=[1],r=[1],i=[7ff]}]

 

cephfs-journal-tool --rank=cephfs:0 journal inspect

Overall journal integrity: OK

 

ceph pg dump summary

version 41137

stamp 2024-04-18T21:17:59.133536+0000

last_osdmap_epoch 0

last_pg_scan 0

PG_STAT  OBJECTS   MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES          OMAP_BYTES*  OMAP_KEYS*  LOG      DISK_LOG

sum      29717605                   0         0          0        0  6112544251872  13374192956    28493480  1806575   1806575

OSD_STAT  USED    AVAIL   USED_RAW  TOTAL

sum       21 TiB  47 TiB    21 TiB  68 TiB

 

ceph pg dump pools

POOLID  OBJECTS   MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES          OMAP_BYTES*  OMAP_KEYS*  LOG     DISK_LOG

8          31771                   0         0          0        0   131337887503         2482         140  401246    401246

7         839707                   0         0          0        0  3519034650971          736          61  399328    399328

6        1319576                   0         0          0        0      421044421  13374189738    28493279  206749    206749

5       27526539                   0         0          0        0  2461702171417            0           0  792165    792165

2             12                   0         0          0        0       48497560            0           0    6991      6991

 

 

---

Best regards,

 

Alexey Gerasimov

System Manager

 

www.opencascade.com

www.capgemini.com

 

 

