Re: Deprecating ext4 support

On Sun, Apr 17, 2016 at 9:05 PM, Christian Balzer <chibi@xxxxxxx> wrote:
>
> Hello,
>
> On Fri, 15 Apr 2016 08:20:45 +0200 Michael Metz-Martini | SpeedPartner
> GmbH wrote:
>
>> Hi,
>>
>> Am 15.04.2016 um 07:43 schrieb Christian Balzer:
>> > On Fri, 15 Apr 2016 07:02:13 +0200 Michael Metz-Martini | SpeedPartner
>> > GmbH wrote:
>> >> Am 15.04.2016 um 03:07 schrieb Christian Balzer:
>> >>>> We thought this was a good idea so that we can change the
>> >>>> replication size different for doc_root and raw-data if we like.
>> >>>> Seems this was a bad idea for all objects.
>> [...]
>> >>> If nobody else has anything to say about this, I'd consider filing a
>> >>> bug report.
>> >> Im must admit that we're currently using 0.87 (Giant) and haven't
>> >> upgraded so far. Would be nice to know if upgrade would "clean" this
>> >> state or we should better start with a new cluster ... :(
>
> Actually, I ran some more tests, with larger and differing data sets.
>
> I can now replicate this behavior here, before:
> ---
>     NAME          ID     USED       %USED     MAX AVAIL     OBJECTS
>     data          0       6224M      0.11         1175G        1870
>     metadata      1      18996k         0         1175G          24
>     filegoats     10       468M         0         1175G        1346
> ---
>
> And after copying /usr/ from the client where that CephFS is mounted to the
> directory mapped to "filegoats":
> ---
>     data          0       6224M      0.11         1173G       47274
>     metadata      1      42311k         0         1173G        4057
>     filegoats     10      1642M      0.03         1173G       43496
> ---
>
> So not a "bug" per se, but not exactly elegant when considering the object
> overhead.
> This feels a lot like how cache-tiering is implemented as well (evicted
> objects get zero'd, not deleted).
>
> I guess the best strategy here is to have the vast majority of data in
> "data" and only special cases in other pools (like SSD based ones).
>
> It would be nice if somebody from the devs or RH could pipe up and the
> documentation could be updated to reflect this.
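
For reference, the "directory mapped to" a separate pool in the test above is
normally done with the CephFS layout virtual xattr. Below is a minimal Python
sketch of that setup, assuming a kernel client mounted at a hypothetical
/mnt/cephfs, a hypothetical directory name, and that the "filegoats" pool has
already been added as a data pool for the filesystem:
---
import os

# Hypothetical paths: the actual client mount point and directory name are
# not given in the thread above.
mountpoint = "/mnt/cephfs"
target_dir = os.path.join(mountpoint, "raw-data")

# Pin files created under target_dir to the "filegoats" pool via the CephFS
# layout virtual xattr (the pool must already be added as a data pool).
os.setxattr(target_dir, "ceph.dir.layout.pool", b"filegoats")

# Read the layout back to confirm the setting took effect.
print(os.getxattr(target_dir, "ceph.dir.layout.pool").decode())
---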

It's not really clear to me what test you're running here. But if
you're talking about lots of empty RADOS objects, you're probably
running into the backtraces. Objects store (often stale) backtraces of
their directory path in an xattr for disaster recovery and lookup. But
to facilitate that lookup, they need to be visible without knowing
anything about the data placement, so if you have a bunch of files
elsewhere it still puts a pointer backtrace in the default file data
pool.
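
A minimal sketch of how one could see these objects, assuming the
python-rados bindings, a keyring with read access, and the default "data"
pool from the output above (the backtrace is stored in each object's
"parent" xattr):
---
import rados

# Connect using the local ceph.conf; assumes the python-rados bindings and a
# client keyring with read access to the pool.
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    ioctx = cluster.open_ioctx("data")  # default CephFS data pool, per the output above
    try:
        empty = 0
        for obj in ioctx.list_objects():
            size, _mtime = ioctx.stat(obj.key)
            if size != 0:
                continue  # skip objects that actually hold file data
            empty += 1
            try:
                # The backtrace lives in the object's "parent" xattr.
                backtrace = ioctx.get_xattr(obj.key, "parent")
                print("%s: 0 bytes of data, %d-byte backtrace" % (obj.key, len(backtrace)))
            except rados.NoData:
                print("%s: 0 bytes of data, no backtrace xattr" % obj.key)
        print("zero-size objects found: %d" % empty)
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
---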
I think we've talked about ways to avoid those backtrace objects in the
default pool, and we may have done something to improve it by Jewel, but I
don't remember for certain.
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


