Re: Why does ceph need a filesystem (was Simulating DiskFailure)

I have the same question. I know Ceph was based on a simple filesystem of its own years ago, and I'd like to hear more details.

------------------ Original ------------------
From:  "James Harper"<james.harper@xxxxxxxxxxxxxxxx>;
Date:  Sat, Jun 15, 2013 11:07 AM
To:  "Gregory Farnum"<greg@xxxxxxxxxxx>; "Craig Lewis"<clewis@xxxxxxxxxxxxxxxxxx>;
Cc:  "ceph-users@xxxxxxxx"<ceph-users@xxxxxxxx>;
Subject:  [ceph-users] Why does ceph need a filesystem (was Simulating DiskFailure)

>
> Yeah. You've picked up on some warty bits of Ceph's error handling here for
> sure, but it's exacerbated by the fact that you're not simulating what you
> think. In a real disk error situation the filesystem would be returning EIO or
> something, but here it's returning ENOENT. Since the OSD is authoritative for
> that key space and the filesystem says there is no such object, presto! It
> doesn't exist.
> If you restart the OSD it does a scan of the PGs on-disk as well as what it
> should have, and can pick up on the data not being there and recover. But
> "correctly" handling data that has been (from the local FS' perspective)
> properly deleted under a running process would require huge and expensive
> contortions on the part of the daemon (in any distributed system that I can
> think of).
> -Greg
>
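To make the distinction Greg describes concrete, here is a minimal sketch in Python of how a process reading its local store might tell "the filesystem says this object does not exist" (ENOENT) apart from "the disk failed while reading it" (EIO). The function name `classify_read` and its return labels are made up for illustration; this is not how the OSD is actually implemented.

```python
import errno


def classify_read(path):
    """Try to read a file and report how the attempt ended.

    Returns 'ok', 'missing' (ENOENT) or 'io_error' (EIO).
    Any other OSError is re-raised unchanged.
    """
    try:
        with open(path, "rb") as f:
            f.read()
        return "ok"
    except OSError as e:
        if e.errno == errno.ENOENT:
            # The filesystem is authoritative here: no such object.
            return "missing"
        if e.errno == errno.EIO:
            # The object should exist but the read itself failed,
            # which is what a real media error typically looks like.
            return "io_error"
        raise
```

Deleting a file out from under a running daemon produces the 'missing' case, which is indistinguishable from a legitimate delete. Only the 'io_error' case tells the reader that the data should be there and needs recovery.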

Why was the decision made for Ceph to require an underlying filesystem rather than accessing the disk directly (as DRBD does)?

All of my recent disk failures have been unrecoverable read errors (pending sectors in the SMART stats). These are easy enough to repair in the short term by rewriting the affected sectors with a known good copy of the data, assuming there is no other underlying cause and the error was just a power-off at the wrong moment. Unfortunately, because Ceph sits on top of a filesystem and has no knowledge of the LBAs involved, Ceph can't do this itself.
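The kind of repair I have in mind could be sketched roughly as follows in Python: copy one byte range from a known good replica over the same range on the damaged copy. The function name `rewrite_range` and both paths are hypothetical, and the sketch assumes the caller already knows the byte offset of the bad region, which is exactly the information that is lost when a filesystem sits in between. It relies on `os.pread` and `os.pwrite`, so it is Unix-only.

```python
import os


def rewrite_range(target_path, good_path, offset, length):
    """Overwrite bytes [offset, offset + length) of target_path with the
    same range read from good_path. Returns the number of bytes written.
    """
    good_fd = os.open(good_path, os.O_RDONLY)
    try:
        target_fd = os.open(target_path, os.O_WRONLY)
        try:
            # Read the known good data at the same offset.
            data = os.pread(good_fd, length, offset)
            written = 0
            # pwrite may write fewer bytes than asked, so loop.
            while written < len(data):
                written += os.pwrite(
                    target_fd, data[written:], offset + written
                )
            # Push the rewrite down to the device before returning.
            os.fsync(target_fd)
            return written
        finally:
            os.close(target_fd)
    finally:
        os.close(good_fd)
```

Against a raw block device the offset would be the LBA multiplied by the sector size; against ordinary files it is just a file offset, which is how the sketch can be tried out safely.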

Just curious...

Thanks

James
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
