Upgrade from 12.2.1 to 12.2.2 broke my CephFS

Hi there,

I'm running a Ceph cluster for some libvirt VMs and a CephFS providing /home to ~20 desktop machines. There are 4 hosts running 4 MONs, 4 MGRs, 3 MDSs (1 active, 2 standby) and 28 OSDs in total. This cluster has been up and running since the days of Bobtail (yes, including CephFS).

Now, with the update from 12.2.1 to 12.2.2 last Friday afternoon, I restarted MONs, MGRs and OSDs as usual. RBD is running just fine. But when I tried to restart the MDSs, they attempted to replay the journal, then fell back to standby, and the FS ended up in state "damaged". I finally got them working again after doing a good portion of what's described here:

http://docs.ceph.com/docs/master/cephfs/disaster-recovery/
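
For the record, the portion I ran was roughly the journal/session recovery sequence from that page (this is from memory, so take the exact invocations with a grain of salt):

cephfs-journal-tool journal export backup.bin        # back up the journal first
cephfs-journal-tool event recover_dentries summary   # salvage recoverable dentries
cephfs-journal-tool journal reset                    # truncate the damaged journal
cephfs-table-tool all reset session                  # drop stale client sessions
ceph mds repaired 0                                  # mark rank 0 as repaired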

Now, when all clients are shut down, I can start an MDS; it will replay and become active. I can then mount CephFS on a client and access my files and folders. But as I bring up more clients, the MDS first reports damaged metadata (probably due to some damaged paths; I could live with that) and then fails with this assert:

/build/ceph-12.2.2/src/mds/MDCache.cc: 258: FAILED assert(inode_map.count(in->vino()) == 0)

I tried doing an online CephFS scrub, like this:

ceph daemon mds.a scrub_path / recursive repair

This will run for a couple of hours, always finding exactly 10001 damages of type "backtrace" and reporting that it is fixing loads of erroneously free-marked inodes, until the MDS crashes. When I rerun the scrub after killing all clients and restarting the MDSs, things repeat: it finds exactly those 10001 damages and begins fixing exactly the same free-marked inodes all over again.
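
In case it helps, I can list those damage entries on the active MDS via the admin socket (same mds.a as above):

ceph daemon mds.a damage ls    # lists the current damage table entries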

Btw., CephFS has about 3 million objects in the metadata pool. The data pool is about 30 million objects with ~2.5TB * 3 replicas.
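
For reference, those figures are straight from the pool stats:

ceph df detail    # per-pool object counts and usage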

What I tried next was keeping the MDSs down and running:

cephfs-data-scan scan_extents <data pool>
cephfs-data-scan scan_inodes <data pool>
cephfs-data-scan scan_links

As this is described to take "a very long time", it's the part of the disaster-recovery tips I had initially skipped. Right now I'm still on the first step, with 6 workers on a single host busy doing cephfs-data-scan scan_extents (sharded invocation below). ceph -s shows me client io of 20kB/s (!!!). If that's the real scan speed, this is going to take ages.

Is there any way to tell how long this is going to take? Could I speed things up by running more workers on multiple hosts simultaneously? Or should I abort it, since I don't actually have the problem of lost files? Maybe running cephfs-data-scan scan_links alone would better suit my issue, or do scan_extents/scan_inodes HAVE to be run and finished first?
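
This is the sharded form I'm running, per the disaster-recovery doc; my assumption is that nothing ties the workers to one host, since each worker_n just takes a disjoint shard of the pool:

# worker 0 of 6; I start one of these per worker, with worker_n = 0..5
cephfs-data-scan scan_extents --worker_n 0 --worker_m 6 <data pool>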

I have to get this cluster up and running again as soon as possible, so any help is highly appreciated. If there is anything I can do to help, e.g. provide further information, feel free to ask. I'll try to hang around on #ceph (nick topro/topro_/topro__). FYI, I'm in the Central European Time zone (UTC+1).

Thank you so much!

Best regards,
Tobi

-- 
-----------------------------------------------------------
Dipl.-Inf. (FH) Tobias Prousa
Head of Data Logger Development

CAETEC GmbH
Industriestr. 1
D-82140 Olching
www.caetec.de

Limited liability company (GmbH)
Registered office: Olching
Commercial register: Amtsgericht München, HRB 183929
Managing directors: Stephan Bacher, Andreas Wocke

Tel.: +49 (0)8142 / 50 13 60
Fax.: +49 (0)8142 / 50 13 69

eMail: tobias.prousa@xxxxxxxxx
Web:   http://www.caetec.de
------------------------------------------------------------
