I executed the commands from above ("Recovery from missing metadata objects") again, and now the MDS daemons start. Still the same situation as before :(

On 14.01.20 at 22:36, Oskar Malnowicz wrote:
> i just restarted the mds daemons and now they crash during boot.
>
>    -36> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening inotable
>    -35> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening sessionmap
>    -34> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening mds log
>    -33> 2020-01-14 22:33:17.880 7fc9bbeaa700  5 mds.0.log open discovering log bounds
>    -32> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening purge queue (async)
>    -31> 2020-01-14 22:33:17.880 7fc9bbeaa700  4 mds.0.purge_queue open: opening
>    -30> 2020-01-14 22:33:17.880 7fc9bbeaa700  1 mds.0.journaler.pq(ro) recover start
>    -29> 2020-01-14 22:33:17.880 7fc9bb6a9700  4 mds.0.journalpointer Reading journal pointer '400.00000000'
>    -28> 2020-01-14 22:33:17.880 7fc9bbeaa700  1 mds.0.journaler.pq(ro) read_head
>    -27> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: loading open file table (async)
>    -26> 2020-01-14 22:33:17.880 7fc9c58a5700 10 monclient: get_auth_request con 0x55aa83436d80 auth_method 0
>    -25> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening snap table
>    -24> 2020-01-14 22:33:17.884 7fc9c58a5700 10 monclient: get_auth_request con 0x55aa83437680 auth_method 0
>    -23> 2020-01-14 22:33:17.884 7fc9c60a6700 10 monclient: get_auth_request con 0x55aa83437200 auth_method 0
>    -22> 2020-01-14 22:33:17.884 7fc9bceac700  1 mds.0.journaler.pq(ro) _finish_read_head loghead(trim 805306368, expire 807199928, write 807199928, stream_format 1). probing for end of log (from 807199928)...
>    -21> 2020-01-14 22:33:17.884 7fc9bceac700  1 mds.0.journaler.pq(ro) probing for end of the log
>    -20> 2020-01-14 22:33:17.884 7fc9bb6a9700  1 mds.0.journaler.mdlog(ro) recover start
>    -19> 2020-01-14 22:33:17.884 7fc9bb6a9700  1 mds.0.journaler.mdlog(ro) read_head
>    -18> 2020-01-14 22:33:17.884 7fc9bb6a9700  4 mds.0.log Waiting for journal 0x200 to recover...
>    -17> 2020-01-14 22:33:17.884 7fc9c68a7700 10 monclient: get_auth_request con 0x55aa83437f80 auth_method 0
>    -16> 2020-01-14 22:33:17.884 7fc9c60a6700 10 monclient: get_auth_request con 0x55aa83438400 auth_method 0
>    -15> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) _finish_read_head loghead(trim 98280931328, expire 98282151365, write 98282247624, stream_format 1). probing for end of log (from 98282247624)...
>    -14> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) probing for end of the log
>    -13> 2020-01-14 22:33:17.892 7fc9bceac700  1 mds.0.journaler.pq(ro) _finish_probe_end write_pos = 807199928 (header had 807199928). recovered.
>    -12> 2020-01-14 22:33:17.892 7fc9bceac700  4 mds.0.purge_queue operator(): open complete
>    -11> 2020-01-14 22:33:17.892 7fc9bceac700  1 mds.0.journaler.pq(ro) set_writeable
>    -10> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) _finish_probe_end write_pos = 98283021535 (header had 98282247624). recovered.
>     -9> 2020-01-14 22:33:17.892 7fc9bb6a9700  4 mds.0.log Journal 0x200 recovered.
>     -8> 2020-01-14 22:33:17.892 7fc9bb6a9700  4 mds.0.log Recovered journal 0x200 in format 1
>     -7> 2020-01-14 22:33:17.892 7fc9bb6a9700  2 mds.0.13470 Booting: 1: loading/discovering base inodes
>     -6> 2020-01-14 22:33:17.892 7fc9bb6a9700  0 mds.0.cache creating system inode with ino:0x100
>     -5> 2020-01-14 22:33:17.892 7fc9bb6a9700  0 mds.0.cache creating system inode with ino:0x1
>     -4> 2020-01-14 22:33:17.896 7fc9bbeaa700  2 mds.0.13470 Booting: 2: replaying mds log
>     -3> 2020-01-14 22:33:17.896 7fc9bbeaa700  2 mds.0.13470 Booting: 2: waiting for purge queue recovered
>     -2> 2020-01-14 22:33:17.908 7fc9ba6a7700 -1 log_channel(cluster) log [ERR] : ESession.replay sessionmap v 7561128 - 1 > table 0
>     -1> 2020-01-14 22:33:17.912 7fc9ba6a7700 -1 /build/ceph-14.2.5/src/mds/journal.cc: In function 'virtual void ESession::replay(MDSRank*)' thread 7fc9ba6a7700 time 2020-01-14 22:33:17.912135
> /build/ceph-14.2.5/src/mds/journal.cc: 1655: FAILED ceph_assert(g_conf()->mds_wipe_sessions)
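For what it's worth, the assert that fires here, ceph_assert(g_conf()->mds_wipe_sessions), means replay hit an ESession journal event whose version no longer matches the rebuilt session table. The usual way past this is the session reset from the "Recovery from missing metadata objects" procedure; the following is only a sketch, to be checked against the disaster-recovery docs for your exact release, since it discards client session state (the config override is my assumption about how to satisfy the assert, based on the option it names):

$ cephfs-table-tool all reset session
$ ceph config set mds mds_wipe_sessions true    # assumed temporary override: lets ESession::replay wipe the stale sessions
  (restart the MDS daemons and wait for them to come up)
$ ceph config rm mds mds_wipe_sessions          # drop the override again

If the journal itself is also suspect, the docs pair this with `cephfs-journal-tool --rank=<fs>:0 journal reset`, but that is more destructive and throws away any unflushed metadata updates.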
On 14.01.20 at 21:19, Oskar Malnowicz wrote:
>> this was the new state. the results are equal to Florian's
>>
>> $ time cephfs-data-scan scan_extents cephfs_data
>> cephfs-data-scan scan_extents cephfs_data  1.86s user 1.47s system 21% cpu 15.397 total
>>
>> $ time cephfs-data-scan scan_inodes cephfs_data
>> cephfs-data-scan scan_inodes cephfs_data  2.76s user 2.05s system 26% cpu 17.912 total
>>
>> $ time cephfs-data-scan scan_links
>> cephfs-data-scan scan_links  0.13s user 0.11s system 31% cpu 0.747 total
>>
>> $ time cephfs-data-scan scan_links
>> cephfs-data-scan scan_links  0.13s user 0.12s system 33% cpu 0.735 total
>>
>> $ time cephfs-data-scan cleanup cephfs_data
>> cephfs-data-scan cleanup cephfs_data  1.64s user 1.37s system 12% cpu 23.922 total
>>
>> mds / $ du -sh
>> 31G
>>
>> $ df -h
>> ip1,ip2,ip3:/   5.2T  2.1T  3.1T  41%  /storage/cephfs_test1
>>
>> $ ceph df detail
>> RAW STORAGE:
>>     CLASS     SIZE        AVAIL       USED        RAW USED    %RAW USED
>>     hdd       7.8 TiB     7.5 TiB     312 GiB     329 GiB     4.14
>>     TOTAL     7.8 TiB     7.5 TiB     312 GiB     329 GiB     4.14
>>
>> POOLS:
>>     POOL                ID    STORED     OBJECTS    USED       %USED    MAX AVAIL    QUOTA OBJECTS    QUOTA BYTES    DIRTY    USED COMPR    UNDER COMPR
>>     cephfs_data          6    2.1 TiB      2.48M    2.1 TiB    25.00      3.1 TiB    N/A              N/A            2.48M    0 B           0 B
>>     cephfs_metadata      7    7.3 MiB        379    7.3 MiB        0      3.1 TiB    N/A              N/A              379    0 B           0 B
>>
>> On 14.01.20 at 21:06, Patrick Donnelly wrote:
>>> I'm asking that you get the new state of the file system tree after recovering from the data pool. Florian wrote that before I asked you to do this...
>>>
>>> How long did it take to run the cephfs-data-scan commands?
>>>
>>> On Tue, Jan 14, 2020 at 11:58 AM Oskar Malnowicz <oskar.malnowicz@xxxxxxxxxxxxxx> wrote:
>>>> as Florian already wrote, `du -hc` shows a total usage of 31G, but `ceph df` shows a usage of 2.1 TiB
>>>>
>>>> </ mds># du -hs
>>>> 31G
>>>>
>>>> # ceph df
>>>> cephfs_data      6     2.1 TiB     2.48M     2.1 TiB     25.00     3.1 TiB
>>>>
>>>> On 14.01.20 at 20:44, Patrick Donnelly wrote:
>>>>> On Tue, Jan 14, 2020 at 11:40 AM Oskar Malnowicz <oskar.malnowicz@xxxxxxxxxxxxxx> wrote:
>>>>>> i ran these commands, but still the same problems
>>>>> Which problems?
>>>>>
>>>>>> $ cephfs-data-scan scan_extents cephfs_data
>>>>>>
>>>>>> $ cephfs-data-scan scan_inodes cephfs_data
>>>>>>
>>>>>> $ cephfs-data-scan scan_links
>>>>>> 2020-01-14 20:36:45.110 7ff24200ef80 -1 mds.0.snap updating last_snap 1 -> 27
>>>>>>
>>>>>> $ cephfs-data-scan cleanup cephfs_data
>>>>>>
>>>>>> do you have other ideas ?
>>>>> After you complete this, you should see the deleted files in your file system tree (if this is indeed the issue). What's the output of `du -hc`?
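On the du vs. `ceph df` gap (31G in the tree vs. 2.1 TiB STORED in cephfs_data): one rough way to check whether the space is held by orphaned data objects is to compare the inode prefixes of the pool's objects with the inodes actually linked in the tree. This is only a sketch and makes assumptions: the default data object naming <inode-hex>.<block-index> (e.g. 10000000001.00000000), GNU find, the mount point from the df output above, and illustrative /tmp file names:

$ rados -p cephfs_data ls | cut -d. -f1 | sort -u > /tmp/pool_inos
$ find /storage/cephfs_test1 -printf '%i\n' | awk '{ printf "%x\n", $1 }' | sort -u > /tmp/fs_inos
$ comm -23 /tmp/pool_inos /tmp/fs_inos | wc -l    # inode prefixes with objects in the pool but no file in the tree

A large count would point at objects the scan/cleanup run left behind (note that snapshots can also legitimately pin otherwise-unreferenced objects).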
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx