NVMe journals, like any Ceph filestore journal, are really only used for writes. If you watch the IO on your journals, you will see this. The only thing you gain there is a faster write ACK, and only if you have chosen the right NVMe device; we have tested some that were slower than a SATA SSD. In some cases it is also a waste of a lot of extra space.

With LVM cache, we leverage the NVMe devices to accelerate both our writes and our reads, and we get the benefit of the flush from journal to OSD being faster too. RGW indexes are one example of this. It is not uncommon for us to see both high-bandwidth and high-IOPS requests come in at the same time from the same Hadoop job. Fortunately, our big data team has been much happier since we moved to this sort of setup.

Warren

> On May 11, 2017, at 4:41 AM, Matthew Vernon <mv3@xxxxxxxxxxxx> wrote:
>
>> On 08/05/17 04:57, Warren Wang - ISD wrote:
>>
>> A little extra background here. If Ceph directly supported LVM
>> devices as OSDs, we probably wouldn’t have to do what we’re doing
>> now. We don’t know of a way to be able to use LVM cache device as an
>> OSD without this type of config. This is primarily to support big
>> data workloads that use object storage as the only backing storage.
>> So the type of IO that we see is highly irregular, compared to most
>> object storage workloads.
>
> This is probably a stupid question, but does using your NVMe as ceph journals (which is what we do) not do the same sort of job as LVM cache devices?
>
> Regards,
>
> Matthew
>
>
> --
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
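
For anyone who wants to check the "journals only see writes" pattern on their own nodes, here is a minimal sketch, assuming a Linux host where /proc/diskstats is available. It takes two snapshots of the kernel counters and reports what fraction of completed IOs on a device were writes. The device name "nvme0n1" and the helper names are placeholders for illustration, not anything from our setup or from Ceph itself.

```python
import time
from typing import Dict, NamedTuple


class IoCounts(NamedTuple):
    reads: int            # reads completed
    writes: int           # writes completed
    sectors_read: int     # 512-byte sectors read
    sectors_written: int  # 512-byte sectors written


def parse_diskstats(text: str) -> Dict[str, IoCounts]:
    """Parse /proc/diskstats content into per-device counters."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or unexpected lines
        # Layout: major minor name reads merged sectors ms writes merged sectors ms ...
        stats[fields[2]] = IoCounts(
            reads=int(fields[3]),
            writes=int(fields[7]),
            sectors_read=int(fields[5]),
            sectors_written=int(fields[9]),
        )
    return stats


def write_fraction(before: IoCounts, after: IoCounts) -> float:
    """Fraction of IOs completed between two snapshots that were writes."""
    reads = after.reads - before.reads
    writes = after.writes - before.writes
    total = reads + writes
    return writes / total if total else 0.0


def sample_device(device: str = "nvme0n1", interval: float = 5.0) -> float:
    """Snapshot a (hypothetical) journal device twice and compare."""
    with open("/proc/diskstats") as f:
        before = parse_diskstats(f.read())[device]
    time.sleep(interval)
    with open("/proc/diskstats") as f:
        after = parse_diskstats(f.read())[device]
    return write_fraction(before, after)
```

Calling sample_device() with your journal device while the cluster is under load should return a value close to 1.0 if the journal is seeing almost nothing but writes; running iostat against the same device tells the same story.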