CephFS in general has far fewer metadata structures than traditional
filesystems do; about the only things that could go wrong without users
noticing directly are:
1) The data gets corrupted
2) Files somehow get removed from folders

Data corruption is something RADOS is responsible for detecting through
its scrub processes. If CephFS actually dropped a file, yeah, that'd be
a problem, and we don't have other mechanisms for detecting it at this
time. But the more traditional sort of fsck activities, like looking for
doubly-linked blocks or multiply-allocated inodes, are more or less
impossible given our decentralized architecture and lack of stored
metadata (for instance, data "blocks" are just objects whose name is
calculated based on the inode number and the offset within the file).

If it makes you feel better, fsck is something I've been working on a
lot recently, based on the design discussions we had early last year.
The first pass is just a "scrubbing" mechanism to make sure that the
hierarchy is self-consistent and the referenced RADOS objects actually
exist; later we'll move on to checking that each RADOS object is
associated with a particular file.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Mon, Sep 15, 2014 at 4:15 PM, brandon li <brandon.li.1ca at gmail.com> wrote:
> Thanks for the reply, Greg.
>
> With traditional file system experience, I have to admit it will take me
> some time to get used to the way CephFS works. Considering it as part of my
> learning curve. :-)
>
> One of the concerns I have is that, without tools like fsck, how could we
> know the file system is still consistent?
>
> Even if RADOS doesn't report an error, could there be any miscommunication
> (e.g., due to a bug, networking issue, disk bit flip, ...) between the
> metadata operation and the stripe I/O?
>
> For example, the first stripe of a file is created but its inode id (on
> RADOS) is wrong for some reason, and thus RADOS doesn't think it belongs to
> the correct file. This may never happen and I just use it here to explain my
> concern.
>
> Thanks,
> Brandon
>
>
> On Mon, Sep 15, 2014 at 3:49 PM, Gregory Farnum <greg at inktank.com> wrote:
>>
>> On Mon, Sep 15, 2014 at 3:23 PM, brandon li <brandon.li.1ca at gmail.com>
>> wrote:
>> > If it's true, are there any other tools I can use to check and repair the
>> > file system?
>>
>> Not much, no. That said, you shouldn't really need an fsck unless the
>> underlying RADOS store went through some catastrophic event. Is there
>> anything in particular you're worried about?
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
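
To make concrete the point in the reply that data "blocks" are just objects
whose name is calculated from the inode number and the offset within the
file, here is a minimal Python sketch. It is an illustration only, not Ceph
code: it assumes the simplest file layout (one stripe per object, so the
object index is just offset // object_size), a 4 MiB object size, and the
"<inode in hex>.<object index as 8 hex digits>" naming convention. The
function name and parameters are hypothetical.

```python
# Illustrative sketch only; not taken from the Ceph source tree.
# Assumptions: stripe_count == 1 and stripe_unit == object_size, so each
# object holds one contiguous object_size-byte range of the file.

DEFAULT_OBJECT_SIZE = 4 * 1024 * 1024  # assumed 4 MiB object size


def data_object_name(ino: int, offset: int,
                     object_size: int = DEFAULT_OBJECT_SIZE) -> str:
    """Return the assumed RADOS object name holding `offset` of inode `ino`."""
    if ino < 0 or offset < 0:
        raise ValueError("ino and offset must be non-negative")
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    index = offset // object_size      # which object the byte falls into
    return f"{ino:x}.{index:08x}"      # hex inode, dot, zero-padded hex index


if __name__ == "__main__":
    ino = 0x10000000000
    for off in (0, 4 * 1024 * 1024, 9 * 1024 * 1024):
        print(off, data_object_name(ino, off))
```

Because the name is derived rather than stored, there is no per-block
allocation table for an fsck to cross-check; a checker can only compute the
names a file should map to and ask RADOS whether those objects exist.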