On 6/1/20 11:58 PM, Yiming Zhang wrote:
On Jun 1, 2020, at 5:10 PM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
Hi Yiming,
Are you changing the overall data set size when you change the image
size? I.e., in your 40GB image test, is your data set 40x larger than
in your 1GB image test?
I’m using the same workload:
rw=randwrite
bs=4096
time_based=1
runtime=300
direct=1
iodepth=48
Both runs have the same runtime of 300s.
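For reference, a complete fio job file built around these parameters would look something like the sketch below; the rbd engine settings and the pool/image names are placeholders on my side, not taken from the actual setup:

    [global]
    ioengine=rbd        # fio's librbd engine, bypasses the kernel RBD client
    clientname=admin    # cephx user; assumes the default admin keyring
    pool=rbd            # placeholder pool name
    rbdname=image0      # placeholder image name
    rw=randwrite
    bs=4096
    direct=1
    time_based=1
    runtime=300
    iodepth=48

    [rbd-randwrite]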
Ok, but are you doing the randwrite workload across the entire image in
both cases? If so, that will be many more objects you are spanning
writes across for the 40GB image vs the 1GB image.
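To put rough numbers on it, with the default 4MB RBD object size (order 22):

    # 1GB image  ->  1024MB / 4MB =   256 objects
    # 40GB image -> 40960MB / 4MB = 10240 objects
    # Confirm the object size of an image (pool/image names are placeholders):
    rbd info rbd/image0 | grep order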
That would have various effects, including changing the number of
onodes in the cache and the potential for cache misses hitting
rocksdb and eventually the disk. Having said that, with the default
4GB memory target I wouldn't expect you to have cache misses with
typical RBD workloads even with a 40GB dataset on a single OSD unless
you've tweaked the object size to be smaller or caused additional
metadata per object in some way (EC, etc).
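One way to sanity-check that is to watch the onode cache counters in a perf dump; a sketch, assuming Octopus-era counter names:

    ceph daemon osd.0 perf dump bluestore | grep onode
    # bluestore_onodes                     -> onodes currently held in the cache
    # bluestore_onode_hits / onode_misses  -> whether lookups fall through to rocksdb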
Theoretically you might be able to use lttng or jaeger tracepoints to track latency, or possibly look at the perf counters. Otherwise you might also be able to see something through wallclock profiling.
I tried the gdb wallclock profiling, but I can only see the fio- and OSD-related time; it doesn't cover the BlueStore internals. For details please see here <https://pastebin.com/6UiLRGvY>.
I added a bunch of perf counters in BlueStore to track the latencies, and I don't see any suspicious ones. As for locking behavior, what are the possible reasons for that? I'd really appreciate it if you could point me to which lock you mean in kv_sync_thread.
It looks to me like you ran it against the client fio process rather
than the OSD?
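If you rerun it, attach to the ceph-osd process instead. With the gdbpmp script (https://github.com/markhpc/gdbpmp) that's roughly the following, assuming a single OSD on the node:

    pid=$(pgrep -x ceph-osd)                     # assumes one ceph-osd on this node
    ./gdbpmp.py -p "$pid" -n 1000 -o osd.gdbpmp  # collect 1000 wallclock samples from the OSD
    ./gdbpmp.py -i osd.gdbpmp -t 1               # print the merged call tree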
Thanks,
Yiming
I would probably look carefully at things happening in the kv sync
thread since this is a random write workload and that's where I'd
expect to see blocking behavior that could cause latency spikes like
this.
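For example, the kv sync related latency counters can be pulled like this (a sketch; counter names are from the Octopus-era code and may differ in other releases):

    ceph daemon osd.0 perf dump bluestore | egrep 'kv_flush_lat|kv_sync_lat|kv_commit_lat|state_kv'
    # kv_flush_lat / kv_sync_lat / kv_commit_lat    -> time spent flushing/committing to rocksdb
    # state_kv_queued_lat / state_kv_committing_lat -> how long transactions wait on the kv queue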
Mark
On 6/1/20 1:50 PM, Yiming Zhang wrote:
Hi All,
I have noticed that different RBD image sizes can shape the BlueStore latency differently. Is there a baseline or guidance for choosing the image size?
Left: RBD image size is 1GB
Middle: RBD image size is 40GB
Right: RBD image size is 1GB, RocksDB write buffer 10X default
4K randwrite on SSD with fio. The SSD is preconditioned and the image is prefilled (20 mins).
Red dot is L1 compaction and green dot is L0 compaction.
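(A write-buffer bump like the one in the right-hand run would typically go through bluestore_rocksdb_options; a sketch below, where the option string is assumed from the defaults of that era with only write_buffer_size raised to 10x, i.e. 2684354560 bytes. OSDs need a restart for rocksdb options to take effect.)

    ceph config set osd bluestore_rocksdb_options \
      "compression=kNoCompression,max_write_buffer_number=4,min_write_buffer_number_to_merge=1,recycle_log_file_num=4,write_buffer_size=2684354560,writable_file_max_buffer_size=0,compaction_readahead_size=2097152"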
Let’s focus on the left graph. The smaller spikes are caused by compactions. The higher spikes seem to be caused by BlueStore itself.
I suspect this could be related to the RBD image size in some way.
Does anyone know what could be the cause of the higher spikes? And how to debug it?
Also, what is the proper RBD image size for my test?
Please advise.
Thanks,
Yiming
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx