John, thanks for the tips. I ran a recursive long listing of the cephfs volume and didn't receive any errors, so I guess it wasn't serious.

I also tried running the following:

ceph tell mds.0 damage ls
2016-09-16 07:11:36.824330 7fc2ff00e700 0 client.224234 ms_handle_reset on 192.168.19.243:6804/3448
Error EPERM: problem getting command descriptions from mds.0

I'm guessing this output means something didn't work. Any other suggestions for this command?

Thanks,
Jim

Sent from Mail for Windows 10

From: John Spray <jspray@xxxxxxxxxx>
Sent: Friday, September 16, 2016 2:37 AM
To: Jim Kilborn <jim@xxxxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: mds damage detected - Jewel

On Thu, Sep 15, 2016 at 10:30 PM, Jim Kilborn <jim@xxxxxxxxxxxx> wrote:
> I have a replicated cache pool and metadata pool which reside on ssd drives, with a size of 2, backed by an erasure coded data pool.
> The cephfs filesystem was in a healthy state. I pulled an SSD drive, to perform an exercise in osd failure.
>
> The cluster recognized the ssd failure, and replicated back to a healthy state, but I got a message saying "mds0: Metadata damage detected".
>
>
>     cluster 62ed97d6-adf4-12e4-8fd5-3d9701b22b86
>      health HEALTH_ERR
>             mds0: Metadata damage detected
>             mds0: Client master01.div18.swri.org failing to respond to cache pressure
>      monmap e2: 3 mons at {ceph01=192.168.19.241:6789/0,ceph02=192.168.19.242:6789/0,ceph03=192.168.19.243:6789/0}
>             election epoch 24, quorum 0,1,2 ceph01,darkjedi-ceph02,darkjedi-ceph03
>       fsmap e25: 1/1/1 up {0=-ceph04=up:active}, 1 up:standby
>      osdmap e1327: 20 osds: 20 up, 20 in
>             flags sortbitwise
>       pgmap v11630: 1536 pgs, 3 pools, 100896 MB data, 442 kobjects
>             201 GB used, 62915 GB / 63116 GB avail
>                 1536 active+clean
>
> In the mds logs of the active mds, I see the following:
>
> 7fad0c4b2700 0 -- 192.168.19.244:6821/17777 >> 192.168.19.243:6805/5090 pipe(0x7fad25885400 sd=56 :33513 s=1 pgs=0 cs=0 l=1 c=0x7fad2585f980).fault
> 7fad14add700 0 mds.beacon.darkjedi-ceph04 handle_mds_beacon no longer laggy
> 7fad101d3700 0 mds.0.cache.dir(10000016c08) _fetched missing object for [dir 10000016c08 /usr/ [2,head] auth v=0 cv=0/0 ap=1+0+0 state=1073741952 f() n() hs=0+0,ss=0+0 | waiter=1 authpin=1 0x7fad25ced500]
> 7fad101d3700 -1 log_channel(cluster) log [ERR] : dir 10000016c08 object missing on disk; some files may be lost
> 7fad0f9d2700 0 -- 192.168.19.244:6821/17777 >> 192.168.19.242:6800/3746 pipe(0x7fad25a4e800 sd=42 :0 s=1 pgs=0 cs=0 l=1 c=0x7fad25bd5180).fault
> 7fad14add700 -1 log_channel(cluster) log [ERR] : unmatched fragstat size on single dirfrag 10000016c08, inode has f(v0 m2016-09-14 14:00:36.654244 13=1+12), dirfrag has f(v0 m2016-09-14 14:00:36.654244 1=0+1)
> 7fad14add700 -1 log_channel(cluster) log [ERR] : unmatched rstat rbytes on single dirfrag 10000016c08, inode has n(v77 rc2016-09-14 14:00:36.654244 b1533163206 48173=43133+5040), dirfrag has n(v77 rc2016-09-14 14:00:36.654244 1=0+1)
> 7fad101d3700 -1 log_channel(cluster) log [ERR] : unmatched rstat on 10000016c08, inode has n(v78 rc2016-09-14 14:00:36.656244 2=0+2), dirfrags have n(v0 rc2016-09-14 14:00:36.656244 3=0+3)
>
> I'm not sure why the metadata got damaged, since it's being replicated, but I want to fix the issue and test again. However, I can't figure out the steps to repair the metadata.
Losing an object like that is almost certainly a sign that you've hit a bug -- probably an OSD bug, if it was the OSDs being disrupted while the MDS daemons continued to run. The subsequent "unmatched fragstat" etc. messages are probably a red herring: the stats are only bad because the object is missing, not because of some other issue (http://tracker.ceph.com/issues/17284).

> I saw something about running a damage ls, but I can't seem to find a more detailed repair document. Any pointers to get the metadata fixed? Seems both my mds daemons are running correctly, but that error bothers me. Shouldn't happen, I think.

You can get the detail on what's damaged with "ceph tell mds.<id> damage ls" -- this spits out JSON that you may well want to parse with a tiny python script (see the example sketch appended at the end of this message).

> I tried the following command, but it doesn't understand it...
> ceph --admin-daemon /var/run/ceph/ceph-mds. ceph03.asok damage ls
>
> I then rebooted all 4 ceph servers simultaneously (another stress test), and the ceph cluster came back up healthy, and the mds damaged status has been cleared!! I then replaced the ssd, put it back into service, and let the backfill complete. The cluster was fully healthy. I pulled another ssd and repeated this process, yet I never got the damaged mds messages. Was this just random metadata damage due to yanking a drive out? Are there any lingering effects on the metadata that I need to address?

The MDS damage table is an ephemeral structure, so when you reboot, the MDS will forget about the damage. I would expect that doing an "ls -R" on your filesystem will cause the damage to be detected again as it traverses the filesystem, although if it doesn't, then that would be a sign that the "missing object" was actually a bug failing to find it one time, rather than a bug where the object has really been lost.

John

> - Jim
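
As a rough illustration of the kind of tiny python script John mentions for parsing the "damage ls" JSON, the sketch below simply runs the command and prints one line per damage entry. It assumes it is run on a node whose keyring has the caps to talk to the MDS (otherwise you get an EPERM error like the one quoted above), and the field names "id", "damage_type" and "ino" are guesses at what the JSON contains rather than anything confirmed in this thread, so the raw entry is printed as well.

    #!/usr/bin/env python
    # Sketch only: run "ceph tell mds.0 damage ls" and summarize whatever JSON comes back.
    # "mds.0" and the field names ("id", "damage_type", "ino") are assumptions.
    import json
    import subprocess

    raw = subprocess.check_output(["ceph", "tell", "mds.0", "damage", "ls"])
    entries = json.loads(raw.decode("utf-8"))

    # One summary line per damage entry, plus the raw record in case the assumed
    # field names are wrong.
    for entry in entries:
        print("id=%s type=%s ino=%s raw=%r" % (
            entry.get("id", "?"),
            entry.get("damage_type", "?"),
            entry.get("ino", "?"),
            entry))

If the list comes back empty, that would be consistent with John's point above that the damage table is ephemeral and was cleared when the daemons restarted.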