Thanks for the response. I was also hoping to be able to debug better once we got onto Mimic. We just finished that upgrade yesterday, and cephfs-journal-tool does find a corruption in the purge queue, though our MDS continues to start up and the filesystem appears to be functional as usual. How can I modify the purge queue to remove the damaged sections? And is there some way to scan known FS objects and remove any that are left orphaned once the damage is removed/repaired?

# cephfs-journal-tool --journal=purge_queue journal inspect
Overall journal integrity: DAMAGED
Corrupt regions:
  0x6819f8-681a55

# cephfs-journal-tool --journal=purge_queue header get
{
    "magic": "ceph fs volume v011",
    "write_pos": 203357732,
    "expire_pos": 6822392,
    "trimmed_pos": 4194304,
    "stream_format": 1,
    "layout": {
        "stripe_unit": 4194304,
        "stripe_count": 1,
        "object_size": 4194304,
        "pool_id": 64,
        "pool_ns": ""
    }
}
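For what it's worth, here is what I was considering trying first. I'm assuming that 'journal export' and 'journal reset' accept --journal=purge_queue the same way 'journal inspect' and 'header get' evidently do, and the backup path is just an example:

# take a backup of the purge queue journal before touching anything
# (assumes export honors --journal=purge_queue like inspect does)
cephfs-journal-tool --journal=purge_queue journal export /root/pq-backup.bin

# last resort: discard the damaged queue entirely, accepting that any
# pending purges in it are lost and become orphaned data objects
cephfs-journal-tool --journal=purge_queue journal reset

Is that the right tool for this, or is there a finer-grained way to splice out just the corrupt region?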
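For the orphan side of the question, I was thinking of something crude along these lines. It assumes the filesystem is quiesced while it runs, and 'cephfs_data' / /mnt/cephfs are stand-ins for our real data pool and mount point:

# data objects are named <inode-hex>.<block-index>, so collect the
# distinct inode prefixes present in the pool
rados -p cephfs_data ls | cut -d. -f1 | sort -u > /tmp/pool_inodes

# collect the inode numbers reachable through the directory tree,
# converted from decimal to hex to match the object names
find /mnt/cephfs -printf '%i\n' | xargs -n1 printf '%x\n' | sort -u > /tmp/fs_inodes

# inodes present in the pool but absent from the tree are orphan candidates
comm -23 /tmp/pool_inodes /tmp/fs_inodes

I'd want to sanity-check any candidates before removing anything (e.g. 'rados -p cephfs_data getxattr <object> parent' to look at the backtrace). Does that sound like a reasonable direction, or is there existing tooling for this?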
thanks,
Ben

On Fri, Jun 15, 2018 at 11:54 AM, John Spray <jspray@xxxxxxxxxx> wrote:
> On Fri, Jun 15, 2018 at 2:55 PM, Benjeman Meekhof <bmeekhof@xxxxxxxxx> wrote:
>> I've seen some posts and issue trackers related to this topic in the
>> past, but I haven't been able to put it together to resolve the issue
>> I'm having. All on Luminous 12.2.5 (upgraded over time from past
>> releases). We are going to upgrade to Mimic in the near future if that
>> would somehow resolve the issue.
>>
>> Summary:
>>
>> 1. We have a CephFS data pool which has steadily and slowly grown in
>> size without corresponding writes to the directory placed on it - a
>> plot of usage over a few hours shows a very regular upward rate of
>> increase. The pool is now at 300TB vs 16TB of actual space used in the
>> directory.
>>
>> 2. Reading through some email posts and issue trackers led me to
>> disable 'standby replay', though we do not use and have never used
>> snapshots. Disabling that feature on our 3 MDS daemons stopped the
>> steady climb. However, the pool remains with 300TB of unaccounted-for
>> space usage. http://tracker.ceph.com/issues/19593 and
>> http://tracker.ceph.com/issues/21551
>
> This is pretty strange -- if you were already on 12.2.5 then
> http://tracker.ceph.com/issues/19593 should have been fixed, and
> switching standby replays on/off shouldn't make a difference (unless
> some similar bug has crept back into Luminous).
>
>> 3. I've never had any issue starting the MDS or with filesystem
>> functionality, but looking through the MDS logs I see a single
>> 'journaler.pq(rw) _decode error from assimilate_prefetch' at every
>> startup. A log snippet with context is below, taken with debug_mds
>> and debug_journaler at 20.
>
> This message suggests that the purge queue has been corrupted, but the
> MDS is ignoring this -- something is wrong with the error handling.
> The MDS should be marked damaged when something like this happens, but
> in this case PurgeQueue is apparently dropping the error on the floor
> after it gets logged by Journaler. I've opened a ticket+PR for the
> error handling here: http://tracker.ceph.com/issues/24533 (however,
> the loading path in PurgeQueue::_recover *does* have error handling,
> so I'm not clear why that isn't happening in your case).
>
> I believe cephfs-journal-tool in Mimic was enhanced to optionally
> operate on the purge queue as well as the metadata journal (they use
> the same underlying format), so upgrading to Mimic would give you
> better tooling for debugging this.
>
> John
>
>> As noted, there is at least one past email thread on the topic, but
>> I'm not having quite the same issue as that person, and I couldn't
>> glean any information as to what I should do to repair this error and
>> get stale objects purged from this pool (if that is in fact the
>> issue):
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021379.html
>>
>> Any thoughts on troubleshooting steps I could try next?
>>
>> Here is the log snippet:
>>
>> 2018-06-15 09:14:50.746831 7fb47251b700 20 mds.0.journaler.pq(rw) write_buf_throttle get, delta 101
>> 2018-06-15 09:14:50.746835 7fb47251b700 10 mds.0.journaler.pq(rw) append_entry len 81 to 88121773~101
>> 2018-06-15 09:14:50.746838 7fb47251b700 10 mds.0.journaler.pq(rw) _prefetch
>> 2018-06-15 09:14:50.746863 7fb47251b700 20 mds.0.journaler.pq(rw) write_buf_throttle get, delta 101
>> 2018-06-15 09:14:50.746864 7fb47251b700 10 mds.0.journaler.pq(rw) append_entry len 81 to 88121874~101
>> 2018-06-15 09:14:50.746867 7fb47251b700 10 mds.0.journaler.pq(rw) _prefetch
>> 2018-06-15 09:14:50.746901 7fb46fd16700 10 mds.0.journaler.pq(rw) _finish_read got 6822392~1566216
>> 2018-06-15 09:14:50.746909 7fb46fd16700 10 mds.0.journaler.pq(rw) _assimilate_prefetch 6822392~1566216
>> 2018-06-15 09:14:50.746911 7fb46fd16700 10 mds.0.journaler.pq(rw) _assimilate_prefetch gap of 4194304 from received_pos 8388608 to first prefetched buffer 12582912
>> 2018-06-15 09:14:50.746913 7fb46fd16700 10 mds.0.journaler.pq(rw) _assimilate_prefetch read_buf now 6822392~1566216, read pointers 6822392/8388608/50331648
>>
>> === error here ===> 2018-06-15 09:14:50.746965 7fb46fd16700 -1 mds.0.journaler.pq(rw) _decode error from assimilate_prefetch
>>
>> 2018-06-15 09:14:50.746994 7fb47251b700 20 mds.0.journaler.pq(rw) write_buf_throttle get, delta 101
>> 2018-06-15 09:14:50.746998 7fb47251b700 10 mds.0.journaler.pq(rw) append_entry len 81 to 88121975~101
>> 2018-06-15 09:14:50.747007 7fb47251b700 10 mds.0.journaler.pq(rw) wait_for_readable at 6822392 onreadable 0x557ee0f58300
>> 2018-06-15 09:14:50.747042 7fb47251b700 20 mds.0.journaler.pq(rw) write_buf_throttle get, delta 101
>> 2018-06-15 09:14:50.747043 7fb47251b700 10 mds.0.journaler.pq(rw) append_entry len 81 to 88122076~101
>> 2018-06-15 09:14:50.747063 7fb47251b700 20 mds.0.journaler.pq(rw) write_buf_throttle get, delta 101
>> 2018-06-15 09:14:50.747064 7fb47251b700 10 mds.0.journaler.pq(rw) append_entry len 81 to 88122177~101
>> 2018-06-15 09:14:50.747113 7fb47251b700 20 mds.0.journaler.pq(rw) write_buf_throttle get, delta 101
>> 2018-06-15 09:14:50.747114 7fb47251b700 10 mds.0.journaler.pq(rw) append_entry len 81 to 88122278~101
>> 2018-06-15 09:14:50.747136 7fb47251b700 20 mds.0.journaler.pq(rw) write_buf_throttle get, delta 101
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com