Re: Cannot mount CephFS after irreversible OSD lost

Dear Yan,

Thanks for your reply.

The problem is that I made the backup after the data corruption (but before any manipulation of the journal). Since the FS cannot be mounted via the in-kernel client, I tend to believe that corruption of cephfs_metadata is the cause.

Since I do have read-only access to the filesystem via ceph-fuse, I would prefer to repair it using the cephfs-data-scan tool.

I did an 'rsync --dry-run' of the whole FS and the MDS complained about a few missing objects. I am not sure whether this is a trustworthy method for identifying corruption, but if it is, the damage is marginal.
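
For reference, a metadata-only walk similar in spirit to the dry run can be sketched in Python. This is a minimal sketch, not a Ceph tool: the function name `scan_tree` is hypothetical, and the mount point passed to it is whatever path the ceph-fuse mount uses. It only calls lstat on each entry and never reads file contents, so a clean result says nothing about the data objects behind each file.

```python
import os


def scan_tree(root):
    """Walk root, lstat every entry, and collect (path, error) for failures.

    Returns a tuple (checked, failures), where checked is the number of
    entries that could be stat'ed and failures is a list of
    (path, error message) pairs.
    """
    failures = []
    checked = 0

    def on_walk_error(err):
        # os.walk calls this when listing a directory fails.
        failures.append((err.filename, err.strerror))

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                os.lstat(path)  # metadata only, no file data is read
                checked += 1
            except OSError as err:
                failures.append((path, err.strerror))
    return checked, failures
```

Calling `scan_tree` on the mount point and printing the returned failures gives a list of paths to compare against the missing objects reported in the MDS log.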

So the question is: is cephfs-data-scan designed to resolve problems with duplicated inodes?
 

On 19 November 2015 at 04:17, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
On Wed, Nov 18, 2015 at 5:21 PM, Mykola Dvornik <mykola.dvornik@xxxxxxxxx> wrote:
Hi John,

It turned out that the MDS triggers an assertion

mds/MDCache.cc: 269: FAILED assert(inode_map.count(in->vino()) == 0)

on any attempt to write data to the filesystem mounted via fuse.

Deleting data is still OK.

I cannot really follow why duplicated inodes appear.

Are there any ways to flush/reset the MDS cache?



This may be caused by a session/journal reset. Could you try restoring the backup of your metadata pool?

Yan, Zheng


 

On 17 November 2015 at 13:26, John Spray <jspray@xxxxxxxxxx> wrote:
On Tue, Nov 17, 2015 at 12:17 PM, Mykola Dvornik
<mykola.dvornik@xxxxxxxxx> wrote:
> Dear John,
>
> Thanks for such a prompt reply!
>
> Seems like something happens on the mon side, since there are no
> mount-specific requests logged on the mds side (see below).
> FYI, some hours ago I've disabled auth completely, but it didn't help.
>
> The serialized metadata pool is 9.7G. I can try to compress it with 7z, then
> set up an rssh account for you to scp/rsync it.
>
> debug mds = 20
> debug mon = 20

Don't worry about the mon logs.  That MDS log snippet appears to be
from several minutes earlier than the client's attempt to mount.

In these cases it's generally simpler if you truncate all the logs,
then attempt the mount, then send all the logs in full rather than
snippets, so that we can be sure nothing is missing.

Please also get the client log (use the fuse client with --debug-client=20).
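
The truncate, mount, collect sequence described above can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names `truncate_logs` and `bundle_logs` are hypothetical, the log directory is passed in by the caller (for example /var/log/ceph, if that is where the daemons on the node write their logs), and only files ending in ".log" are touched.

```python
import os
import tarfile


def truncate_logs(log_dir):
    """Empty every *.log file in log_dir in place; return the names truncated."""
    truncated = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if name.endswith(".log") and os.path.isfile(path):
            # Opening in "w" mode truncates the file to zero length
            # without removing it.
            with open(path, "w"):
                pass
            truncated.append(name)
    return truncated


def bundle_logs(log_dir, archive_path):
    """Pack every *.log file in log_dir, in full, into a gzip tarball."""
    names = [n for n in sorted(os.listdir(log_dir)) if n.endswith(".log")]
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in names:
            tar.add(os.path.join(log_dir, name), arcname=name)
    return names
```

The intended order is: call `truncate_logs` on each node, attempt the mount once, then call `bundle_logs` and send the resulting archive, so that the logs cover exactly the mount attempt and nothing is cut out.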

John



--
 Mykola 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





