Hello,
I've encountered an issue where a corrupted inode in the metadata pool
causes the MDS rank to abort (FAILED ceph_assert(diri)) while in the
'rejoin' state. To address this, I'm following the "USING AN ALTERNATE
METADATA POOL FOR RECOVERY" section of the documentation [1].
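For context, the command sequence I'm following looks roughly like the
below (abridged from [1]; cephfs_recovery and cephfs_recovery_meta are
the placeholder names used in the docs, and <original fs name> /
<original data pool> stand in for my actual names):

# Create a fresh metadata pool and a recovery filesystem on top of the
# existing data pool.
ceph osd pool create cephfs_recovery_meta
ceph fs new cephfs_recovery cephfs_recovery_meta <original data pool> --recover --allow-dangerous-metadata-overlay

# Reset the recovery filesystem's session/snap/inode tables.
cephfs-table-tool cephfs_recovery:0 reset session
cephfs-table-tool cephfs_recovery:0 reset snap
cephfs-table-tool cephfs_recovery:0 reset inode

# Rebuild metadata into the alternate pool from the data pool, then
# resolve hard links and build dirfrags.
cephfs-data-scan init --force-init --filesystem cephfs_recovery --alternate-pool cephfs_recovery_meta
cephfs-data-scan scan_extents --alternate-pool cephfs_recovery_meta --filesystem <original fs name> <original data pool>
cephfs-data-scan scan_inodes --alternate-pool cephfs_recovery_meta --filesystem <original fs name> --force-corrupt <original data pool>
cephfs-data-scan scan_links --filesystem cephfs_recovery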
However, the cephfs-data-scan scan_links step has now been running for
over 24 hours. The filesystem holds roughly 35 TB of data, which with
3x replication amounts to more than 100 TB of raw data. Does anyone
have a rough estimate of how long this step should take on a dataset
of this size?
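In case it helps with comparing notes: I haven't found a progress
indicator for scan_links, so the only rough proxy I have (my own
improvisation, not something from [1]) is watching the object count of
the alternate metadata pool grow while it runs:

# Rough progress proxy only (assumes the docs' placeholder pool name
# cephfs_recovery_meta): the OBJECTS column for the alternate metadata
# pool should keep growing while scan_links is writing.
watch -n 60 'ceph df | grep cephfs_recovery_meta'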
Additional detail: the log from the crashing MDS:
-9> 2023-10-11T10:13:22.254-0300 7ff901f75700 10 monclient: get_auth_request con 0x559bf41e4400 auth_method 0
-8> 2023-10-11T10:13:22.254-0300 7ff8ff770700 5 mds.barril12 handle_mds_map old map epoch 472481 <= 472481, discarding
-7> 2023-10-11T10:13:22.254-0300 7ff8ff770700 0 mds.0.cache missing dir for * (which maps to *) on [inode 0x10021afaf90 [...392,head] /dbteamvenv/ auth v98534854 snaprealm=0x559bf427ce00 f(v60 m2023-10-06T15:35:03.278089-0300 9=0+9) n(v141971 rc2023-10-09T18:41:19.742089-0300 b1424948533453 139810=131460+8350) (iversion lock) 0x559bf4298580]
-6> 2023-10-11T10:13:22.254-0300 7ff8ff770700 0 mds.0.cache missing dir ino 0x20005dd786b
-5> 2023-10-11T10:13:22.254-0300 7ff902776700 10 monclient: get_auth_request con 0x559bf4142c00 auth_method 0
-4> 2023-10-11T10:13:22.258-0300 7ff902f77700 5 mds.beacon.barril12 received beacon reply up:rejoin seq 4 rtt 1.09601
-3> 2023-10-11T10:13:22.258-0300 7ff8ff770700 -1 ./src/mds/MDCache.cc: In function 'void MDCache::handle_cache_rejoin_weak(ceph::cref_t<MMDSCacheRejoin>&)' thread 7ff8ff770700 time 2023-10-11T10:13:22.259535-0300
./src/mds/MDCache.cc: 4462: FAILED ceph_assert(diri)
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x124) [0x7ff904a5b282]
2: /usr/lib/ceph/libceph-common.so.2(+0x25b420) [0x7ff904a5b420]
3: (MDCache::handle_cache_rejoin_weak(boost::intrusive_ptr<MMDSCacheRejoin const> const&)+0x20de) [0x559bf0a9da6e]
4: (MDCache::dispatch(boost::intrusive_ptr<Message const> const&)+0x424) [0x559bf0aa2a64]
5: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x5c0) [0x559bf0930580]
6: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x58) [0x559bf0930b78]
7: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x1bf) [0x559bf090b5df]
8: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x468) [0x7ff904ca71d8]
9: (DispatchQueue::entry()+0x5ef) [0x7ff904ca48df]
10: (DispatchQueue::DispatchThread::entry()+0xd) [0x7ff904d681cd]
11: /lib/x86_64-linux-gnu/libpthread.so.0(+0x7ea7) [0x7ff905680ea7]
12: clone()
-2> 2023-10-11T10:13:22.258-0300 7ff902f77700 10 monclient: get_auth_request con 0x559bf41e4c00 auth_method 0
-1> 2023-10-11T10:13:22.258-0300 7ff902f77700 10 monclient: get_auth_request con 0x559bf41e5400 auth_method 0
0> 2023-10-11T10:13:22.262-0300 7ff8ff770700 -1 *** Caught signal (Aborted) **
in thread 7ff8ff770700 thread_name:ms_dispatch
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x13140) [0x7ff90568c140]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x17e) [0x7ff904a5b2dc]
5: /usr/lib/ceph/libceph-common.so.2(+0x25b420) [0x7ff904a5b420]
6: (MDCache::handle_cache_rejoin_weak(boost::intrusive_ptr<MMDSCacheRejoin const> const&)+0x20de) [0x559bf0a9da6e]
7: (MDCache::dispatch(boost::intrusive_ptr<Message const> const&)+0x424) [0x559bf0aa2a64]
8: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x5c0) [0x559bf0930580]
9: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x58) [0x559bf0930b78]
10: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x1bf) [0x559bf090b5df]
11: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x468) [0x7ff904ca71d8]
12: (DispatchQueue::entry()+0x5ef) [0x7ff904ca48df]
13: (DispatchQueue::DispatchThread::entry()+0xd) [0x7ff904d681cd]
14: /lib/x86_64-linux-gnu/libpthread.so.0(+0x7ea7) [0x7ff905680ea7]
15: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Ceph Cluster status:
barril1:~# ceph status
cluster:
id: c30ecc8d-440e-4608-b3fe-5020337ae11d
health: HEALTH_ERR
2 filesystems are degraded
2 filesystems are offline
services:
mon: 5 daemons, quorum barril4,barril3,barril2,barril1,urquell (age 32h)
mgr: barril2(active, since 32h), standbys: barril3, barril4, urquell, barril1
mds: 0/10 daemons up (10 failed), 9 standby
osd: 48 osds: 48 up (since 32h), 48 in (since 2M); 22 remapped pgs
rgw: 4 daemons active (4 hosts, 1 zones)
data:
volumes: 0/2 healthy, 2 failed
pools: 12 pools, 1475 pgs
objects: 50.89M objects, 72 TiB
usage: 207 TiB used, 148 TiB / 355 TiB avail
pgs: 579358/152674596 objects misplaced (0.379%)
1449 active+clean
22 active+remapped+backfilling
4 active+clean+scrubbing+deep
io:
client: 7.2 MiB/s rd, 1.2 MiB/s wr, 342 op/s rd, 367 op/s wr
recovery: 26 MiB/s, 13 keys/s, 26 objects/s
progress:
Global Recovery Event (19h)
[===========================.] (remaining: 17m)
Ceph fs status:
barril1:~# ceph fs status
cephfs - 0 clients
======
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 failed
1 failed
2 failed
3 failed
4 failed
5 failed
6 failed
7 failed
8 failed
POOL TYPE USED AVAIL
cephfs_metadata metadata 1045G 35.6T
cephfs.c3sl.data data 114T 35.6T
c3sl - 0 clients
====
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 failed
POOL TYPE USED AVAIL
cephfs.c3sl.meta metadata 28.2G 35.6T
cephfs.c3sl.data data 114T 35.6T
STANDBY MDS
barril2
barril4
barril42
barril33
barril13
barril23
barril43
barril1
barril12
MDS version: ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
ceph health detail:
barril1:~# ceph health detail
HEALTH_ERR 2 filesystems are degraded; 2 filesystems are offline
[WRN] FS_DEGRADED: 2 filesystems are degraded
fs cephfs is degraded
fs c3sl is degraded
[ERR] MDS_ALL_DOWN: 2 filesystems are offline
fs cephfs is offline because no MDS is active for it.
fs c3sl is offline because no MDS is active for it.
[1]: https://docs.ceph.com/en/reef/cephfs/disaster-recovery-experts/#using-an-alternate-metadata-pool-for-recovery
Best regards,
Odair M. Ditkun Jr