Thanks! In jewel, as you mentioned, there will be --max-objects and
--object-size options for rados bench. That size-hint overhead should go
away, or at least be mitigated, with those options. Correct?

Are those options available in this version?

# ceph -v
ceph version 10.0.2 (86764eaebe1eda943c59d7d784b893ec8b0c6ff9)

Rgds,
Shinobu

----- Original Message -----
From: "Josh Durgin" <jdurgin@xxxxxxxxxx>
To: "Jan Schermer" <jan@xxxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Sent: Saturday, February 27, 2016 7:57:44 AM
Subject: Re: Observations with a SSD based pool under Hammer

On 02/26/2016 01:42 PM, Jan Schermer wrote:
> The RBD backend might be even worse, depending on how large a dataset
> you try. One 4KB block can end up creating a 4MB object, and depending
> on how well hole-punching and fallocate work on your system, you could
> in theory end up with a >1000x amplification if you always hit a
> different 4MB chunk (but that's not realistic).
> Is that right?

Yes, the size hints rbd sends with writes end up as an xfs ioctl asking
to allocate MIN(rbd object size, filestore_max_alloc_hint_size) (the max
is 1MB by default) for writes to new objects. Depending on how much the
benchmark fills the image, this could be a large or small overhead
compared to the amount of data written.

Josh

> Jan
>
>> On 26 Feb 2016, at 22:05, Josh Durgin <jdurgin@xxxxxxxxxx> wrote:
>>
>> On 02/24/2016 07:10 PM, Christian Balzer wrote:
>>> 10-second rados bench with 4KB blocks, 219MB written in total.
>>> NAND writes per SSD: 41*32MB = 1312MB.
>>> 10496MB total written to all SSDs.
>>> Amplification: 48!!!
>>>
>>> Le ouch.
>>> In my use case, with rbd cache on all VMs, I expect writes to be
>>> rather large for the most part, not like this extreme example.
>>> But as I wrote the last time I did this kind of testing, this is an
>>> area where caveat emptor most definitely applies when planning and
>>> buying SSDs. And where the Ceph code could probably do with some
>>> attention.
>>
>> In this case it's likely rados bench using tiny objects that's
>> causing the massive overhead. rados bench does each write to a new
>> object, which ends up as a new file beneath the osd, with its own
>> xattrs too. For 4KB writes, that's a ton of overhead.
>>
>> fio with the rbd backend will give you a more realistic picture.
>> In jewel there will be --max-objects and --object-size options for
>> rados bench to get closer to an rbd-like workload as well.
>>
>> Josh
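
To make the fio suggestion concrete, a minimal job file for fio's rbd
engine might look like the sketch below. The pool, image, and client
names are placeholders, and the image must already exist, so adjust them
to your cluster:

  # bench-rbd.fio: 4KB random writes against an existing rbd image
  [global]
  ioengine=rbd          # drive the image via librbd, no kernel mapping
  clientname=admin      # cephx user (placeholder)
  pool=rbd              # pool holding the test image (placeholder)
  rbdname=fio-test      # pre-created, e.g. "rbd create fio-test --size 2048"
  invalidate=0
  rw=randwrite
  bs=4k

  [rbd-iodepth32]
  iodepth=32

Run it with "fio bench-rbd.fio". Because the 4KB writes land inside one
existing image's objects rather than each creating a fresh file (plus
xattrs) under the OSD, the resulting amplification should be much closer
to what VMs will actually see.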
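
For the jewel rados bench options discussed above, the invocation might
look roughly like this once they land; the exact flag syntax here is an
assumption based on this thread, not confirmed against a release:

  # 60s of 4KB write ops packed into a bounded set of 4MB objects
  rados bench -p rbd 60 write -b 4096 --object-size 4194304 --max-objects 1024

The idea is that -b keeps the op size at 4KB while --object-size keeps
objects at rbd's default 4MB, and --max-objects bounds the working set,
so ops get packed into a limited set of rbd-sized objects instead of
each op creating a brand-new object, which is what inflates the numbers.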
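
On the size-hint side, the cap Josh mentions can be read back from a
running OSD through the admin socket (osd.0 below is a placeholder; any
OSD will do):

  # default is 1048576 bytes (1MB)
  ceph daemon osd.0 config get filestore_max_alloc_hint_size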
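
Finally, Christian's amplification figure checks out. A quick shell
sanity check (the 8-SSD count is inferred from 10496/1312; it isn't
stated in the thread):

  echo $((41 * 32))                           # 1312 MB of NAND writes per SSD
  echo $((1312 * 8))                          # 10496 MB across all 8 SSDs
  awk 'BEGIN { printf "%.1f\n", 10496/219 }'  # 47.9, i.e. the ~48x amplification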