On Thu, Sep 18, 2014 at 3:09 AM, Marc <mail at shoowin.de> wrote:
> Hi,
>
> we did run a deep scrub on everything yesterday, and a repair
> afterwards. Then a new deep scrub today, which brought new scrub errors.
>
> I did check the osd config, they report "filestore_xfs_extsize": "false",
> as it should be if I understood things correctly.
>
> FTR the deep scrub has been initiated like this:
>
> for pgnum in `ceph pg dump|grep active|awk '{print $1}'`; do ceph pg
> deep-scrub $pgnum; done
>
> How do we proceed from here?

Did the deep scrubs all actually complete yesterday, so these are new
errors and not just scrubs which weren't finished until now? If so,
I'd start looking at the scrub errors and which OSDs are involved.
Hopefully they'll have one or a few OSDs in common that you can
examine more closely.

But like I said before, my money's on faulty hardware or local
filesystems. Depending on how you're set up it's probably a good idea
to just start checking dmesg for any indications of trouble before you
start tackling it from the RADOS side.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
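
To make the "which OSDs do the inconsistent PGs have in common" step concrete, here is a rough sketch in Python. It assumes the `ceph health detail` output contains lines of the form `pg <pgid> is <state>, acting [<osd>,<osd>,...]`; that layout may differ between Ceph releases, so treat the regex as an assumption to adjust. The function name `osds_by_inconsistent_pg_count` and the `SAMPLE` text are made up for illustration and are not output from the cluster discussed here.

```python
import re
from collections import Counter

# Assumed line layout, e.g.:
#   pg 0.6 is active+clean+inconsistent, acting [0,1,2]
# Only PGs whose state string contains "inconsistent" are counted.
PG_LINE = re.compile(r"^pg (\S+) is \S*inconsistent\S*, acting \[([\d,]+)\]")


def osds_by_inconsistent_pg_count(health_detail_text):
    """Return (osd_id, count) pairs, most frequently involved OSD first."""
    counts = Counter()
    for line in health_detail_text.splitlines():
        match = PG_LINE.match(line.strip())
        if match:
            # Count every OSD in the acting set of this inconsistent PG.
            counts.update(int(osd) for osd in match.group(2).split(","))
    return counts.most_common()


# Illustrative input only; not real output from any cluster.
SAMPLE = """\
HEALTH_ERR 3 pgs inconsistent; 5 scrub errors
pg 0.6 is active+clean+inconsistent, acting [0,1,2]
pg 1.2a is active+clean+inconsistent, acting [4,1,7]
pg 2.11 is active+clean+inconsistent, acting [1,5,3]
5 scrub errors
"""

if __name__ == "__main__":
    for osd, n in osds_by_inconsistent_pg_count(SAMPLE):
        print(f"osd.{osd}: {n} inconsistent PG(s)")
```

In practice you would feed it the saved text of `ceph health detail` instead of `SAMPLE`. An OSD that shows up in most or all of the inconsistent PGs is the first host whose dmesg and local filesystem are worth checking.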