On Fri, Jun 15, 2018 at 2:55 PM, Benjeman Meekhof <bmeekhof@xxxxxxxxx> wrote:
> Have seen some posts and issue trackers related to this topic in the
> past but haven't been able to put it together to resolve the issue I'm
> having. All on Luminous 12.2.5 (upgraded over time from past
> releases). We are going to upgrade to Mimic in the near future if that
> would somehow resolve the issue.
>
> Summary:
>
> 1. We have a CephFS data pool which has steadily and slowly grown in
> size without corresponding writes to the directory placed on it - a
> plot of usage over a few hours shows a very regular upward rate of
> increase. The pool is now 300TB vs 16TB of actual space used in the
> directory.
>
> 2. Reading through some email posts and issue trackers led me to
> disabling 'standby replay', though we are not and have not ever used
> snapshots. Disabling that feature on our 3 MDSs stopped the steady
> climb. However, the pool remains with 300TB of unaccounted-for space
> usage. http://tracker.ceph.com/issues/19593 and
> http://tracker.ceph.com/issues/21551

This is pretty strange -- if you were already on 12.2.5 then
http://tracker.ceph.com/issues/19593 should have been fixed, and
switching standby replays on/off shouldn't make a difference (unless
there's some similar bug that crept back into luminous).

> 3. I've never had any issue starting the MDS or with filesystem
> functionality, but looking through the MDS logs I see a single
> 'journaler.pq(rw) _decode error from assimilate_prefetch' at every
> startup. A log snippet with context is below, with debug_mds and
> debug_journaler at 20.

This message suggests that the purge queue has been corrupted, but the
MDS is ignoring this -- something is wrong with the error handling.
The MDS should be marked damaged when something like this happens, but
in this case PurgeQueue is apparently dropping the error on the floor
after it gets logged by Journaler.
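As a back-of-the-envelope check, the gap described in point 1 works out
like this (a sketch only; the 300TB/16TB figures come from the report
above, and binary TiB-style units are assumed -- in practice the pool
figure would come from 'ceph df' and the directory figure from the
recursive-stats xattr, e.g. 'getfattr -n ceph.dir.rbytes <dir>'):

```shell
#!/usr/bin/env bash
# Sketch: quantify the unaccounted space from the figures quoted above.
pool_bytes=$((300 * 1024 ** 4))   # ~300 TB reported for the data pool
dir_bytes=$((16 * 1024 ** 4))     # ~16 TB actually used in the directory

echo "unaccounted: $(((pool_bytes - dir_bytes) / 1024 ** 4)) TB"
# prints: unaccounted: 284 TB
```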
I've opened a ticket+PR for the error handling here:
http://tracker.ceph.com/issues/24533 (however, the loading path in
PurgeQueue::_recover *does* have error handling, so I'm not clear why
that isn't happening in your case).

I believe cephfs-journal-tool in mimic was enhanced to be able to
optionally operate on the purge queue as well as the metadata journal
(they use the same underlying format), so upgrading to mimic would
give you better tooling for debugging this.

John

> As noted, there is at least one past email thread on the topic, but I'm
> not quite having the same issue as this person, and I couldn't glean
> any information as to what I should do to repair this error and get
> stale objects purged from this pool (if that is in fact the issue):
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021379.html
>
> Any thoughts on troubleshooting steps I could try next?
>
> Here is the log snippet:
>
> 2018-06-15 09:14:50.746831 7fb47251b700 20 mds.0.journaler.pq(rw)
> write_buf_throttle get, delta 101
> 2018-06-15 09:14:50.746835 7fb47251b700 10 mds.0.journaler.pq(rw)
> append_entry len 81 to 88121773~101
> 2018-06-15 09:14:50.746838 7fb47251b700 10 mds.0.journaler.pq(rw) _prefetch
> 2018-06-15 09:14:50.746863 7fb47251b700 20 mds.0.journaler.pq(rw)
> write_buf_throttle get, delta 101
> 2018-06-15 09:14:50.746864 7fb47251b700 10 mds.0.journaler.pq(rw)
> append_entry len 81 to 88121874~101
> 2018-06-15 09:14:50.746867 7fb47251b700 10 mds.0.journaler.pq(rw) _prefetch
> 2018-06-15 09:14:50.746901 7fb46fd16700 10 mds.0.journaler.pq(rw)
> _finish_read got 6822392~1566216
> 2018-06-15 09:14:50.746909 7fb46fd16700 10 mds.0.journaler.pq(rw)
> _assimilate_prefetch 6822392~1566216
> 2018-06-15 09:14:50.746911 7fb46fd16700 10 mds.0.journaler.pq(rw)
> _assimilate_prefetch gap of 4194304 from received_pos 8388608 to first
> prefetched buffer 12582912
> 2018-06-15 09:14:50.746913 7fb46fd16700 10 mds.0.journaler.pq(rw)
> _assimilate_prefetch read_buf now
> 6822392~1566216, read pointers
> 6822392/8388608/50331648
>
> === error here ===> 2018-06-15 09:14:50.746965 7fb46fd16700 -1
> mds.0.journaler.pq(rw) _decode error from assimilate_prefetch
>
> 2018-06-15 09:14:50.746994 7fb47251b700 20 mds.0.journaler.pq(rw)
> write_buf_throttle get, delta 101
> 2018-06-15 09:14:50.746998 7fb47251b700 10 mds.0.journaler.pq(rw)
> append_entry len 81 to 88121975~101
> 2018-06-15 09:14:50.747007 7fb47251b700 10 mds.0.journaler.pq(rw)
> wait_for_readable at 6822392 onreadable 0x557ee0f58300
> 2018-06-15 09:14:50.747042 7fb47251b700 20 mds.0.journaler.pq(rw)
> write_buf_throttle get, delta 101
> 2018-06-15 09:14:50.747043 7fb47251b700 10 mds.0.journaler.pq(rw)
> append_entry len 81 to 88122076~101
> 2018-06-15 09:14:50.747063 7fb47251b700 20 mds.0.journaler.pq(rw)
> write_buf_throttle get, delta 101
> 2018-06-15 09:14:50.747064 7fb47251b700 10 mds.0.journaler.pq(rw)
> append_entry len 81 to 88122177~101
> 2018-06-15 09:14:50.747113 7fb47251b700 20 mds.0.journaler.pq(rw)
> write_buf_throttle get, delta 101
> 2018-06-15 09:14:50.747114 7fb47251b700 10 mds.0.journaler.pq(rw)
> append_entry len 81 to 88122278~101
> 2018-06-15 09:14:50.747136 7fb47251b700 20 mds.0.journaler.pq(rw)
> write_buf_throttle get, delta 101
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
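For the mimic tooling mentioned above, a hedged sketch of what the
purge-queue invocations look like. The filesystem name is a
placeholder for your setup, and exact flags should be confirmed
against 'cephfs-journal-tool --help' on the version actually
installed; these are read-only inspection commands, not repairs:

```shell
#!/usr/bin/env bash
# Sketch (assumes mimic's cephfs-journal-tool and a reachable cluster):
# inspect the purge queue of rank 0 rather than the metadata journal.
fs_name=cephfs            # placeholder: your filesystem's name
rank="${fs_name}:0"       # the purge queue belongs to rank 0

# Show the purge queue header (write/expire/trimmed positions):
cephfs-journal-tool --rank="$rank" --journal=purge_queue header get

# Walk the purge queue and report any gaps/corruption found:
cephfs-journal-tool --rank="$rank" --journal=purge_queue journal inspect
```

If inspect reports damage at the same offset as the _decode error in
the log above, that would confirm the purge queue itself is what's
corrupted.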