> -----Original Message-----
> From: wr@xxxxxxxx [mailto:wr@xxxxxxxx]
> Sent: 21 July 2016 13:23
> To: nick@xxxxxxxxxx; 'Horace Ng' <horace@xxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re: Ceph + VMware + Single Thread Performance
>
> Okay, and what is your plan now to speed things up?

Now that I have come up with a lower-latency hardware design, there is
not much further improvement to be had until persistent RBD caching is
implemented, as that will move the SSD/NVMe closer to the client. But
I'm happy with what I can achieve at the moment. You could also
experiment with bcache on the RBD.

>
> Would it help to put multiple P3700s per OSD node to improve
> performance for a single thread (for example, Storage vMotion)?

Most likely not; it's all the other parts of the puzzle that are
causing the latency. ESXi was designed for storage arrays that service
IOs in the 100us-1ms range; Ceph is probably about 10x slower than
this, hence the problem. Disable the BBWC on a RAID controller or SAN
and you will see the same behaviour.

> Regards
>
>
> On 21.07.16 at 14:17, Nick Fisk wrote:
> >> -----Original Message-----
> >> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of wr@xxxxxxxx
> >> Sent: 21 July 2016 13:04
> >> To: nick@xxxxxxxxxx; 'Horace Ng' <horace@xxxxxxxxx>
> >> Cc: ceph-users@xxxxxxxxxxxxxx
> >> Subject: Re: Ceph + VMware + Single Thread Performance
> >>
> >> Hi,
> >>
> >> Hmm, I think 200 MByte/s is really bad. Is your cluster in
> >> production right now?
> > It's just been built, not running yet.
> >
> >> So if you start a storage migration you get only 200 MByte/s, right?
> > I wish. My current cluster (not this new one) would storage-migrate
> > at ~10-15MB/s. Serial latency is the problem: without being able to
> > buffer, ESXi waits on an ack for each IO before sending the next. It
> > also submits the migrations in 64kb chunks unless you get VAAI
> > working, and I think ESXi will try to do them in parallel, which
> > will help as well.
> >
> >> I think it would be awesome if you got 1000 MByte/s.
> >>
> >> Where is the bottleneck?
> > Latency serialisation: without a buffer, you can't drive the devices
> > to 100%. With buffered IO (or high queue depths) I can max out the
> > journals.
> >
> >> A fio test from Sebastien Han gives us 400 MByte/s raw performance
> >> from the P3700:
> >>
> >> https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
> >>
> >> How can it be that the RBD client performance is 50% slower?
> >>
> >> Regards
> >>
> >>
> >> On 21.07.16 at 12:15, Nick Fisk wrote:
> >>> I've had a lot of pain with this; smaller block sizes are even
> >>> worse. You want to try to minimise latency at every point, as
> >>> there is no buffering happening in the iSCSI stack. This means:
> >>>
> >>> 1. Fast journals (NVMe or NVRAM)
> >>> 2. 10Gb or better networking
> >>> 3. Fast CPUs (GHz)
> >>> 4. Fix CPU C-states to C1
> >>> 5. Fix CPU frequency to max
> >>>
> >>> Also, I can't be sure, but I think there is a metadata update
> >>> happening with VMFS, particularly if you are using thin VMDKs;
> >>> this can also be a major bottleneck. For my use case, I've
> >>> switched over to NFS, as it has given much more performance at
> >>> scale and less headache.
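As a rough guide to points 4 and 5 in the list above, C-states and
frequency can be pinned on a typical Linux box along these lines (a
minimal sketch; exact parameters and tooling vary by distro, kernel
and CPU):

# Kernel command line (e.g. GRUB_CMDLINE_LINUX in /etc/default/grub),
# keeping Intel cores out of deep idle states:
#   intel_idle.max_cstate=1 processor.max_cstate=1

# Pin the cpufreq scaling governor to maximum performance:
cpupower frequency-set -g performance

# Verify the governor and current frequencies:
cpupower frequency-info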
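And for anyone wanting to reproduce the journal test from Sebastien
Han's post linked above: it is essentially a synchronous, queue-depth-1
sequential write with fio, something along these lines (the device path
is a placeholder; this writes directly to the raw device and will
destroy any data on it):

# /dev/nvme0n1 is a placeholder - point this at a scratch device only
fio --filename=/dev/nvme0n1 --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based \
    --group_reporting --name=journal-test

The queue-depth-1 synchronous number is what bounds single-threaded
journal performance, which is part of why raw device bandwidth and RBD
client throughput can differ so much.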
> >>>
> >>> For the RADOS run, here you go (400GB P3700):
> >>>
> >>> Total time run:         60.026491
> >>> Total writes made:      3104
> >>> Write size:             4194304
> >>> Object size:            4194304
> >>> Bandwidth (MB/sec):     206.842
> >>> Stddev Bandwidth:       8.10412
> >>> Max bandwidth (MB/sec): 224
> >>> Min bandwidth (MB/sec): 180
> >>> Average IOPS:           51
> >>> Stddev IOPS:            2
> >>> Max IOPS:               56
> >>> Min IOPS:               45
> >>> Average Latency(s):     0.0193366
> >>> Stddev Latency(s):      0.00148039
> >>> Max latency(s):         0.0377946
> >>> Min latency(s):         0.015909
> >>>
> >>> Nick
> >>>
> >>>> -----Original Message-----
> >>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Horace
> >>>> Sent: 21 July 2016 10:26
> >>>> To: wr@xxxxxxxx
> >>>> Cc: ceph-users@xxxxxxxxxxxxxx
> >>>> Subject: Re: Ceph + VMware + Single Thread Performance
> >>>>
> >>>> Hi,
> >>>>
> >>>> Same here. I've read a blog post saying that VMware will
> >>>> frequently verify the locking on VMFS over iSCSI, hence it has
> >>>> much slower performance than NFS (which uses a different locking
> >>>> mechanism).
> >>>>
> >>>> Regards,
> >>>> Horace Ng
> >>>>
> >>>> ----- Original Message -----
> >>>> From: wr@xxxxxxxx
> >>>> To: ceph-users@xxxxxxxxxxxxxx
> >>>> Sent: Thursday, July 21, 2016 5:11:21 PM
> >>>> Subject: Ceph + VMware + Single Thread Performance
> >>>>
> >>>> Hi everyone,
> >>>>
> >>>> We are seeing relatively slow single-thread performance on the
> >>>> iSCSI nodes of our cluster.
> >>>>
> >>>> Our setup:
> >>>>
> >>>> 3 racks:
> >>>>
> >>>> 18x data nodes, 3 mon nodes, 3 iSCSI gateway nodes with tgt (rbd
> >>>> cache off).
> >>>>
> >>>> 2x Samsung SM863 enterprise SSDs per data node for journals (3
> >>>> OSDs per SSD) and 6x WD Red 1TB drives per data node as OSDs.
> >>>>
> >>>> Replication = 3
> >>>>
> >>>> chooseleaf type rack in the crush map
> >>>>
> >>>> We get only about 90 MByte/s on the iSCSI gateway servers with:
> >>>>
> >>>> rados bench -p rbd 60 write -b 4M -t 1
> >>>>
> >>>> If we test with:
> >>>>
> >>>> rados bench -p rbd 60 write -b 4M -t 32
> >>>>
> >>>> we get about 600-700 MByte/s.
> >>>>
> >>>> We plan to replace the Samsung SSDs with Intel DC P3700 PCIe NVMe
> >>>> drives for the journal to get better single-thread performance.
> >>>>
> >>>> Is there anyone out there who has an Intel P3700 journal and can
> >>>> share test results for:
> >>>>
> >>>> rados bench -p rbd 60 write -b 4M -t 1
> >>>>
> >>>> Thank you very much!!
> >>>>
> >>>> Kind regards!!

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
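A sanity check on the numbers in this thread: with -t 1, rados bench
issues one 4 MiB write at a time, so throughput is simply object size
divided by average round-trip latency:

    single-thread MB/s ~ object size / average latency
    P3700 journal: 4 MiB / 0.0193366 s ~ 207 MB/s  (reported: 206.842)
    SM863 setup:   90 MByte/s implies ~44 ms per 4 MiB object

The -t 32 run hides that per-object latency with parallelism, which is
why it reaches 600-700 MByte/s on the same hardware.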
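For anyone reconstructing the setup described in the original post: a
replicated pool that spreads its three copies across racks is usually
expressed with a crush rule along these lines, and disabling the RBD
cache on the gateways is a client-side ceph.conf setting. The rule
name and numbers below are placeholders, with syntax as of Jewel-era
Ceph:

# crush rule placing each replica under a different rack
rule replicated_rack {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type rack
        step emit
}

# ceph.conf on the iSCSI gateway nodes (rbd cache off, as above)
[client]
rbd cache = false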