Hi Everyone,

I'm running Ceph Nautilus on CentOS 7, using NFS-Ganesha to serve CephFS to a couple of CentOS 6 clients. We have 180 OSDs, each a 12 TB disk, spread evenly across 6 servers.

Fairly often, I receive something like:

    OBJECT_UNFOUND 1/231940937 objects unfound (0.000%)
        pg 1.542 has 1 unfound objects

It's usually a very small number of unfound objects. I can't determine what causes this, but when it happens the NFS mounts hang, while the native CephFS mounts on other servers do not. This leads me to believe NFS is somehow the culprit.

When this happens, the way I recover the NFS service is to revert the unfound object:

    ceph pg 1.542 mark_unfound_lost revert

The Red Hat documentation <https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/troubleshooting_guide/troubleshooting-placement-groups#unfound-objects> states that this can happen if OSDs are going online and offline, which is not the case here. All the OSDs are stable.

This service had been VERY stable for weeks, until I upgraded from 14.2.20 to 14.2.22 last evening. Now it's happening frequently (4 times so far since last evening).

I have a couple of questions:

1. Is there a way to tell which files are associated with a PG in CephFS? My thinking is that knowing the file locations, and what was going on with those files at the time, might provide a clue. It would also tell me whether I'm losing data when performing a revert.
2. What does "revert" really do?
3. What troubleshooting might I do to pinpoint the cause?

Thanks for any pointers, I'm fairly new to Ceph.

Jeff Turmelle, Lead Systems Analyst
International Research Institute for Climate and Society <http://iri.columbia.edu/>
The Earth Institute <http://www.earthinstitute.columbia.edu/> at Columbia University <http://www.columbia.edu/>
cell: (845) 652-3461
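
P.S. On question 1, a rough sketch of the mapping I have in mind. It assumes that pool 1 is the CephFS data pool and that data objects there are named `<inode number in hex>.<object index in hex>`, so that the object name reported by `ceph pg 1.542 list_unfound` can be turned into an inode number and looked up with `find -inum` on a CephFS mount. The object name and the mount point `/mnt/cephfs` below are made up for illustration, and the helper names are hypothetical.

```python
def parse_cephfs_object_name(oid):
    """Split a CephFS data object name into (inode number, object index).

    Assumes the '<inode hex>.<index hex>' naming used for file data objects;
    objects in the metadata pool follow other conventions and are not handled.
    """
    ino_hex, sep, index_hex = oid.partition(".")
    if not sep or not ino_hex or not index_hex:
        raise ValueError("not a CephFS data object name: %r" % (oid,))
    # Both parts are hexadecimal strings without a 0x prefix.
    return int(ino_hex, 16), int(index_hex, 16)


def find_command(oid, mount_point="/mnt/cephfs"):
    """Build the find(1) command that locates the file owning this object."""
    ino, _ = parse_cephfs_object_name(oid)
    # find -inum expects the inode number in decimal.
    return "find %s -inum %d" % (mount_point, ino)


if __name__ == "__main__":
    example_oid = "10000000abc.00000002"  # made-up object name
    print(parse_cephfs_object_name(example_oid))
    print(find_command(example_oid))
```

Running the printed `find` command on a client with the filesystem mounted should, if the assumptions hold, show which file the unfound object belongs to. The object index only says which chunk of that file is affected.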