i just restarted the mds daemons and now they crash during boot:

   -36> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening inotable
   -35> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening sessionmap
   -34> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening mds log
   -33> 2020-01-14 22:33:17.880 7fc9bbeaa700  5 mds.0.log open discovering log bounds
   -32> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening purge queue (async)
   -31> 2020-01-14 22:33:17.880 7fc9bbeaa700  4 mds.0.purge_queue open: opening
   -30> 2020-01-14 22:33:17.880 7fc9bbeaa700  1 mds.0.journaler.pq(ro) recover start
   -29> 2020-01-14 22:33:17.880 7fc9bb6a9700  4 mds.0.journalpointer Reading journal pointer '400.00000000'
   -28> 2020-01-14 22:33:17.880 7fc9bbeaa700  1 mds.0.journaler.pq(ro) read_head
   -27> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: loading open file table (async)
   -26> 2020-01-14 22:33:17.880 7fc9c58a5700 10 monclient: get_auth_request con 0x55aa83436d80 auth_method 0
   -25> 2020-01-14 22:33:17.880 7fc9bbeaa700  2 mds.0.13470 Booting: 0: opening snap table
   -24> 2020-01-14 22:33:17.884 7fc9c58a5700 10 monclient: get_auth_request con 0x55aa83437680 auth_method 0
   -23> 2020-01-14 22:33:17.884 7fc9c60a6700 10 monclient: get_auth_request con 0x55aa83437200 auth_method 0
   -22> 2020-01-14 22:33:17.884 7fc9bceac700  1 mds.0.journaler.pq(ro) _finish_read_head loghead(trim 805306368, expire 807199928, write 807199928, stream_format 1). probing for end of log (from 807199928)...
   -21> 2020-01-14 22:33:17.884 7fc9bceac700  1 mds.0.journaler.pq(ro) probing for end of the log
   -20> 2020-01-14 22:33:17.884 7fc9bb6a9700  1 mds.0.journaler.mdlog(ro) recover start
   -19> 2020-01-14 22:33:17.884 7fc9bb6a9700  1 mds.0.journaler.mdlog(ro) read_head
   -18> 2020-01-14 22:33:17.884 7fc9bb6a9700  4 mds.0.log Waiting for journal 0x200 to recover...
   -17> 2020-01-14 22:33:17.884 7fc9c68a7700 10 monclient: get_auth_request con 0x55aa83437f80 auth_method 0
   -16> 2020-01-14 22:33:17.884 7fc9c60a6700 10 monclient: get_auth_request con 0x55aa83438400 auth_method 0
   -15> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) _finish_read_head loghead(trim 98280931328, expire 98282151365, write 98282247624, stream_format 1). probing for end of log (from 98282247624)...
   -14> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) probing for end of the log
   -13> 2020-01-14 22:33:17.892 7fc9bceac700  1 mds.0.journaler.pq(ro) _finish_probe_end write_pos = 807199928 (header had 807199928). recovered.
   -12> 2020-01-14 22:33:17.892 7fc9bceac700  4 mds.0.purge_queue operator(): open complete
   -11> 2020-01-14 22:33:17.892 7fc9bceac700  1 mds.0.journaler.pq(ro) set_writeable
   -10> 2020-01-14 22:33:17.892 7fc9bbeaa700  1 mds.0.journaler.mdlog(ro) _finish_probe_end write_pos = 98283021535 (header had 98282247624). recovered.
    -9> 2020-01-14 22:33:17.892 7fc9bb6a9700  4 mds.0.log Journal 0x200 recovered.
    -8> 2020-01-14 22:33:17.892 7fc9bb6a9700  4 mds.0.log Recovered journal 0x200 in format 1
    -7> 2020-01-14 22:33:17.892 7fc9bb6a9700  2 mds.0.13470 Booting: 1: loading/discovering base inodes
    -6> 2020-01-14 22:33:17.892 7fc9bb6a9700  0 mds.0.cache creating system inode with ino:0x100
    -5> 2020-01-14 22:33:17.892 7fc9bb6a9700  0 mds.0.cache creating system inode with ino:0x1
    -4> 2020-01-14 22:33:17.896 7fc9bbeaa700  2 mds.0.13470 Booting: 2: replaying mds log
    -3> 2020-01-14 22:33:17.896 7fc9bbeaa700  2 mds.0.13470 Booting: 2: waiting for purge queue recovered
    -2> 2020-01-14 22:33:17.908 7fc9ba6a7700 -1 log_channel(cluster) log [ERR] : ESession.replay sessionmap v 7561128 - 1 > table 0
    -1> 2020-01-14 22:33:17.912 7fc9ba6a7700 -1 /build/ceph-14.2.5/src/mds/journal.cc: In function 'virtual void ESession::replay(MDSRank*)' thread 7fc9ba6a7700 time 2020-01-14 22:33:17.912135
/build/ceph-14.2.5/src/mds/journal.cc: 1655: FAILED ceph_assert(g_conf()->mds_wipe_sessions)

On 14.01.20 at 21:19, Oskar Malnowicz wrote:
> this was the new state.
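The assertion above comes from `ESession::replay` in `src/mds/journal.cc`: replay of a journaled session event is refused when the journal's sessionmap version has run far ahead of the on-disk session table (here v 7561128 against a table at version 0, i.e. the table looks empty or freshly reset). A minimal sketch of that version check, with illustrative names paraphrasing the journal.cc logic (not the real Ceph API):

```python
# Paraphrase of the ESession::replay version check.
# Assumption: names are illustrative, not the actual Ceph C++ API.
def session_replay_ok(journal_cmapv, table_version, wipe_sessions=False):
    """An ESession event can only be replayed if the on-disk session
    table is at most one version behind the journaled sessionmap."""
    if journal_cmapv - 1 > table_version:
        # In the MDS this path is: ceph_assert(g_conf()->mds_wipe_sessions)
        return wipe_sessions
    return True

print(session_replay_ok(7561128, 0))        # values from the log: False, the assert fires
print(session_replay_ok(7561127, 7561127))  # a healthy table: True
```

If the session table really is unrecoverable, the CephFS disaster-recovery docs suggest resetting it (`cephfs-table-tool all reset session`) so replay can proceed, but that discards client session state, so export the journal first with `cephfs-journal-tool`.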
> the results are equal to Florian's
>
> $ time cephfs-data-scan scan_extents cephfs_data
> cephfs-data-scan scan_extents cephfs_data  1.86s user 1.47s system 21% cpu 15.397 total
>
> $ time cephfs-data-scan scan_inodes cephfs_data
> cephfs-data-scan scan_inodes cephfs_data  2.76s user 2.05s system 26% cpu 17.912 total
>
> $ time cephfs-data-scan scan_links
> cephfs-data-scan scan_links  0.13s user 0.11s system 31% cpu 0.747 total
>
> $ time cephfs-data-scan scan_links
> cephfs-data-scan scan_links  0.13s user 0.12s system 33% cpu 0.735 total
>
> $ time cephfs-data-scan cleanup cephfs_data
> cephfs-data-scan cleanup cephfs_data  1.64s user 1.37s system 12% cpu 23.922 total
>
> mds / $ du -sh
> 31G
>
> $ df -h
> ip1,ip2,ip3:/    5.2T    2.1T    3.1T    41%    /storage/cephfs_test1
>
> $ ceph df detail
> RAW STORAGE:
>     CLASS    SIZE       AVAIL      USED       RAW USED    %RAW USED
>     hdd      7.8 TiB    7.5 TiB    312 GiB    329 GiB     4.14
>     TOTAL    7.8 TiB    7.5 TiB    312 GiB    329 GiB     4.14
>
> POOLS:
>     POOL               ID    STORED     OBJECTS    USED       %USED    MAX AVAIL    QUOTA OBJECTS    QUOTA BYTES    DIRTY    USED COMPR    UNDER COMPR
>     cephfs_data         6    2.1 TiB    2.48M      2.1 TiB    25.00    3.1 TiB      N/A              N/A            2.48M    0 B           0 B
>     cephfs_metadata     7    7.3 MiB    379        7.3 MiB    0        3.1 TiB      N/A              N/A            379      0 B           0 B
>
> On 14.01.20 at 21:06, Patrick Donnelly wrote:
>> I'm asking that you get the new state of the file system tree after
>> recovering from the data pool. Florian wrote that before I asked you
>> to do this...
>>
>> How long did it take to run the cephfs-data-scan commands?
>>
>> On Tue, Jan 14, 2020 at 11:58 AM Oskar Malnowicz
>> <oskar.malnowicz@xxxxxxxxxxxxxx> wrote:
>>> as Florian already wrote, `du -hc` shows a total usage of 31G, but
>>> `ceph df` shows us a usage of 2.1 TiB
>>>
>>> </ mds># du -hs
>>> 31G
>>>
>>> # ceph df
>>> cephfs_data      6    2.1 TiB    2.48M    2.1 TiB    25.00    3.1 TiB
>>>
>>> On 14.01.20 at 20:44, Patrick Donnelly wrote:
>>>> On Tue, Jan 14, 2020 at 11:40 AM Oskar Malnowicz
>>>> <oskar.malnowicz@xxxxxxxxxxxxxx> wrote:
>>>>> i ran these commands, but still the same problems
>>>> Which problems?
>>>>
>>>>> $ cephfs-data-scan scan_extents cephfs_data
>>>>>
>>>>> $ cephfs-data-scan scan_inodes cephfs_data
>>>>>
>>>>> $ cephfs-data-scan scan_links
>>>>> 2020-01-14 20:36:45.110 7ff24200ef80 -1 mds.0.snap updating last_snap 1 -> 27
>>>>>
>>>>> $ cephfs-data-scan cleanup cephfs_data
>>>>>
>>>>> do you have other ideas?
>>>> After you complete this, you should see the deleted files in your file
>>>> system tree (if this is indeed the issue). What's the output of
>>>> `du -hc`?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
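A quick sanity check on the numbers quoted throughout this thread (31G visible via `du`, against 2.1 TiB STORED across 2.48M objects in cephfs_data). This is only back-of-the-envelope arithmetic from the figures reported above, but roughly 2 TiB of data unreferenced by the tree would be consistent with Patrick's theory that deleted files were never purged from the data pool:

```python
# Rough arithmetic using only the figures reported in the thread
# (assumption: not queried from a cluster, just the quoted values).
TIB = 1024 ** 4
GIB = 1024 ** 3

stored = 2.1 * TIB       # `ceph df detail`: STORED for cephfs_data
visible = 31 * GIB       # `du -sh` on the mounted tree
objects = 2_480_000      # `ceph df detail`: OBJECTS for cephfs_data

unaccounted = stored - visible
print(f"unaccounted: {unaccounted / TIB:.2f} TiB")         # ~2.07 TiB
print(f"avg object:  {stored / objects / 2**20:.2f} MiB")  # ~0.89 MiB
```

The sub-1-MiB average object size (against the 4 MiB CephFS default object size) also fits a pool holding many small leftover objects rather than the 31G of live file data alone.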