On Tue, May 10, 2016 at 10:40:08AM +0200, Yoann Moulin wrote:
:RadosGW (S3 and maybe Swift for Hadoop/Spark) will be the main usage. Most of
:the access will be in read-only mode. Write access will only be done by the
:admin to update the datasets.

No one seems to have pointed this out, but if your write workload isn't
performance sensitive, there's no point in using SSDs for journals.

Whether you can or should repurpose them as a cache tier is another issue. I
don't have any experience with that, so I can't comment. But I think you
should not use them as journals, because each SSD becomes a single point of
failure for multiple OSDs. I'm using mirrored 3600-series SSDs for
journaling, but they're the same generation and subject to identical write
loads, so I'm suspicious about whether the mirroring is actually useful or
just twice as expensive.

There's also additional complexity in deployment and management when you
split off the journals, simply because it's a more complex system. That part
isn't too bad and can mostly be automated away, but if you don't need the
performance, why pay for it?

I too work in an academic research lab, so if you need to keep the donor
happy, by all means pick whichever configuration makes better use of the
hardware. Leaving the SSDs as journals if a cache tier doesn't fit isn't
likely to cause much harm, so long as you're replicating your data and can
survive an SSD loss; you need that anyway to survive the loss of a spinning
disk or a whole storage node. But if I were you, my choice would be between
caching and moving them to a non-Ceph use.

-Jon
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
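As an aside on the replication point above: surviving the loss of a journal SSD, a spinning disk, or a whole node comes down to the pool replica count and the CRUSH failure domain. A minimal ceph.conf sketch, with illustrative values rather than anything taken from this thread:

```
[global]
# Keep 3 copies of each object; keep serving I/O as long as 2 survive.
osd pool default size = 3
osd pool default min size = 2

# Place replicas on distinct hosts (chooseleaf type 1 = host), so losing
# a storage node, or a journal SSD shared by several of its OSDs, costs
# at most one copy of any object.
osd crush chooseleaf type = 1
```

With a host-level failure domain like this, an SSD journal taking down all the OSDs behind it is equivalent to losing part of one host, which replication is already sized to tolerate.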