On 06/20/2018 08:30 AM, Ilya Dryomov wrote:
On Wed, Jun 20, 2018 at 1:58 PM Mark Nelson <mark.a.nelson@xxxxxxxxx> wrote:
Hi Lisa,
On 06/20/2018 03:19 AM, xiaoyan li wrote:
Hi all,
I wrote a poc to split pglog from Rocksdb and store them into
standalone space in the block device.
Excellent! This is very exciting!
The updates are done in OSD and BlueStore:
OSD parts:
1. Split pglog entries and pglog info from omaps.
BlueStore:
1. Allocate 16M of space in the block device per PG for storing the pglog.
2. For every transaction from the OSD, combine the pglog entries and
pglog info and write them into a block. The block size is set to 4k at
this moment.
Currently, only the write workflow is implemented.
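As a rough sketch of the layout described above — a fixed 16M region per PG, carved into 4k blocks, one block written per OSD transaction. All names and structure here are illustrative, not code from the branch:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical per-PG pglog region: 16M of device space per PG,
// consumed one 4k block per transaction, wrapping like a ring.
constexpr uint64_t PG_REGION_SIZE = 16 * 1024 * 1024; // 16M per PG
constexpr uint64_t BLOCK_SIZE = 4096;                 // one block per txn

struct PGLogRegion {
  uint64_t base;     // offset of this PG's region on the block device
  uint64_t head = 0; // next block index to write

  // Device offset for the next log write; advances (and wraps) the head.
  uint64_t next_block_offset() {
    uint64_t off = base + head * BLOCK_SIZE;
    head = (head + 1) % (PG_REGION_SIZE / BLOCK_SIZE);
    return off;
  }
};

// Pad the combined entry+info payload (~130 + ~920 bytes per the mail)
// out to a full block for an aligned write.
std::vector<char> make_block(const std::vector<char>& payload) {
  assert(payload.size() <= BLOCK_SIZE);
  std::vector<char> block(BLOCK_SIZE, 0);
  std::memcpy(block.data(), payload.data(), payload.size());
  return block;
}
```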
With librbd+fio on a cluster with one OSD (on an Intel Optane 370G), I got
the following results for 4k random writes — a 13.87% improvement:
Master:
write: IOPS=48.3k, BW=189MiB/s (198MB/s)(55.3GiB/300009msec)
slat (nsec): min=1032, max=1683.2k, avg=4345.13, stdev=3988.69
clat (msec): min=3, max=123, avg=10.60, stdev= 8.31
lat (msec): min=3, max=123, avg=10.60, stdev= 8.31
Pgsplit branch:
write: IOPS=55.0k, BW=215MiB/s (225MB/s)(62.0GiB/300010msec)
slat (nsec): min=1068, max=1339.7k, avg=4360.58, stdev=3878.47
clat (msec): min=2, max=120, avg= 9.30, stdev= 6.92
lat (msec): min=2, max=120, avg= 9.31, stdev= 6.92
These are better numbers than I typically get! I'll play with your
branch, but in this workload I usually see us pegged in the
kv_sync_thread. Did you notice any significant change in CPU consumption?
Here is the POC: https://github.com/lixiaoy1/ceph/commits/pglog-split-fastinfo
The problem is that for every transaction I use a 4k block to store the
pglog entries and pglog info, which together are only 130 + 920 = 1050
bytes. This wastes a lot of space.
Any suggestions?
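For concreteness, the per-transaction waste works out as follows (a back-of-envelope sketch using the sizes from the mail; the packing function is a hypothetical alternative, not what the branch does):

```cpp
#include <cstdint>

// Sizes from the mail: each transaction writes a ~130-byte pglog entry
// plus ~920 bytes of pglog info into its own 4096-byte block.
constexpr uint64_t kBlockSize  = 4096;
constexpr uint64_t kEntryBytes = 130;
constexpr uint64_t kInfoBytes  = 920;

// Unused bytes per block: 4096 - 1050 = 3046, i.e. roughly three
// quarters of every block is padding.
constexpr uint64_t wasted_per_txn() {
  return kBlockSize - (kEntryBytes + kInfoBytes);
}

// How many such records would fit if consecutive transactions were
// batched into one block instead (hypothetical packing): 4096 / 1050 = 3.
constexpr uint64_t records_per_block() {
  return kBlockSize / (kEntryBytes + kInfoBytes);
}
```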
I guess 100*3000*4k = ~1.2GB?
What is 100 for? An estimate for a number of PGs on an OSD?
Yes, exactly. I think there's a reasonable argument, though, that 100 PGs
per OSD and a log length of 3000 aren't a satisfactory long-term target.
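The ~1.2GB estimate above follows directly from those assumed parameters (a worked check, with the names chosen here for illustration):

```cpp
#include <cstdint>

// Per-OSD pglog footprint under the scheme discussed above:
// (assumed) pgs = 100 PGs per OSD, log_len = 3000 entries per PG log,
// one 4 KiB block per entry.
constexpr uint64_t pglog_footprint_bytes(uint64_t pgs, uint64_t log_len,
                                         uint64_t block = 4096) {
  return pgs * log_len * block;
}

// pglog_footprint_bytes(100, 3000) = 1,228,800,000 bytes, ~1.14 GiB --
// consistent with the ~1.2GB figure in the thread. Note also that
// 3000 * 4 KiB = 12M per PG fits within the 16M per-PG allocation.
```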
Mark
Thanks,
Ilya