Re: How's cephfs going?

On Mon, 2017-07-17 at 02:59 +0000, 许雪寒 wrote:
> Hi, everyone.
> 
> We intend to use cephfs of Jewel version, however, we don’t know its status.
> Is it production ready in Jewel? Does it still have lots of bugs? Is it a
> major effort of the current ceph development? And who are using cephfs now?

Hello,

I've been using CephFS in production for the past year in a small-scale, conservative
deployment: a small backing store, spanning two datacenters, for a highly available
web service that supplies host configuration data and APT package repositories to a
fleet of ~1000 Linux workstations.

The motivation for using CephFS was not performance, but availability and
correctness.  It replaced a fairly complicated stack involving corosync,
pacemaker, XFS, DRBD, and mdadm.

Previous production systems using DRBD had proven unreliable in practice: network
glitches between datacenters would cause DRBD to enter a split-brain state, which,
for this application, was tolerable.

What was not tolerable was DRBD failing to keep the two halves of the mirrored
filesystem in sync; at one point they had diverged by over a gigabyte of differences,
which caused XFS to force itself read-only when it detected internal inconsistencies.

CephFS, by contrast, has been remarkably solid, even in the face of networking
interruptions.  As well as remaining available, it has been free of data-integrity
problems.

(In addition to Ceph's own scrubbing capabilities, I've had automated monitoring
checks verifying the file sizes and checksums of the files in each APT repository
against those recorded in the repository indexes, with zero discrepancies in a year
of operation.)
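
For the curious, the check amounts to little more than the sketch below.  This is
only an illustration of the idea, not the exact script we run: the repository root
and the path to the Packages index are placeholders, and the real check is wired
into our monitoring system rather than run by hand.

    #!/usr/bin/env python3
    # Rough sketch: verify files in an APT repository against its Packages index.
    # REPO_ROOT and the Packages path are placeholders for this example.
    import hashlib
    import os
    import sys

    REPO_ROOT = "/srv/apt/repo"   # placeholder path on the CephFS mount
    PACKAGES = os.path.join(REPO_ROOT, "dists/stable/main/binary-amd64/Packages")

    def stanzas(path):
        """Yield (filename, size, sha256) for each stanza in a Packages index."""
        entry = {}
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.rstrip("\n")
                if not line:
                    # A blank line ends the current stanza.
                    if entry:
                        yield entry["Filename"], int(entry["Size"]), entry["SHA256"]
                    entry = {}
                elif not line.startswith(" ") and ":" in line:
                    # Field lines are "Key: value"; continuations start with a space.
                    key, _, value = line.partition(":")
                    entry[key] = value.strip()
        if entry:
            yield entry["Filename"], int(entry["Size"]), entry["SHA256"]

    def sha256_of(path):
        digest = hashlib.sha256()
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    mismatches = 0
    for filename, size, sha256 in stanzas(PACKAGES):
        path = os.path.join(REPO_ROOT, filename)
        if os.path.getsize(path) != size or sha256_of(path) != sha256:
            print("MISMATCH:", filename)
            mismatches += 1

    sys.exit(1 if mismatches else 0)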

Write performance wasn't terrific, but my deployment has been on constrained
hardware: running all of the Ceph daemons, the local kernel mount, and the nginx web
server on the same hosts, with quadruple replication and only a single gigabit link
per machine, is not a configuration chosen for write throughput.
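
(The quadruple replication is nothing more exotic than a pool size of 4.  The pool
names below are just the conventional ones from the documentation; yours may differ:

    ceph osd pool set cephfs_data size 4
    ceph osd pool set cephfs_metadata size 4
)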

The only issue I came across was a bug in the ceph.ko driver, which would
occasionally trigger a null-pointer dereference in the kernel.  It was possible to
avoid this bug by mounting with the noasyncreaddir option; the bug has long since
been fixed.
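
For reference, the workaround was just an extra mount option.  The monitor
addresses, mount point, and credentials here are placeholders:

    mount -t ceph mon1.example.com,mon2.example.com,mon3.example.com:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret,noasyncreaddir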

Kind regards,
David
-- 
David McBride <dwm37@xxxxxxxxx>
Unix Specialist, University Information Services


