On 3/4/24 08:40, Maged Mokhtar wrote:
On 04/03/2024 15:37, Frank Schilder wrote:
Fast write enabled would mean that the primary OSD sends #size copies to
the entire acting set (including itself) in parallel and sends an ACK to
the client as soon as min_size ACKs have been received from the peers
(including itself). In this way, one can tolerate (size - min_size)
slow(er) OSDs (slow for whatever reason) without suffering performance
penalties immediately (only after too many requests have piled up, which
will show up as a slow-requests warning).
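To make the acknowledgement policy concrete, here is a minimal sketch in
Python/asyncio (not actual OSD code; SIZE, MIN_SIZE, write_to_replica and
fast_write are made-up names for illustration): the primary dispatches the
write to the whole acting set in parallel and ACKs the client once
min_size replicas have committed.

import asyncio
import random

SIZE = 3        # replicas in the acting set, including the primary
MIN_SIZE = 2    # ACK the client once this many replicas have committed

async def write_to_replica(osd_id, data):
    # Stand-in for a replicated write; the randomized delay mimics one
    # OSD being temporarily slow (compaction, scrubbing, ...).
    await asyncio.sleep(random.uniform(0.001, 0.050))
    return osd_id

async def fast_write(data):
    # Dispatch to the whole acting set in parallel.
    pending = {asyncio.create_task(write_to_replica(osd, data))
               for osd in range(SIZE)}
    acked = 0
    while pending:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED)
        acked += len(done)
        if acked >= MIN_SIZE:
            print("client ACK after %d/%d replica commits" % (acked, SIZE))
            break
    # The remaining writes still have to complete (or be repaired later);
    # the client just no longer waits for the slowest ones.
    await asyncio.gather(*pending)

asyncio.run(fast_write(b"object payload"))

The point of the sketch is only the acknowledgement condition; how the
remaining in-flight replicas are tracked and repaired on failure is
exactly the open question below.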
What happens if an error occurs on the slowest OSD after the min_size
ACK has already been sent to the client?
This should not be different from what exists today... unless of course
the error happens on the local/primary OSD.
Can this be addressed with reasonable effort? I don't expect this to be
a quick fix, and it should be tested. However, beating the tail-latency
statistics with the extra redundancy should be worth it.
I observe latency fluctuations: OSDs become randomly slow for whatever
reason for short time intervals and then return to normal. A reason for
this could be DB compaction; I think latency tends to spike during
compaction.
A fast-write option would effectively remove the impact of this.
Best regards and thanks for considering this!
I think this is something the RADOS devs need to weigh in on. It does
sound worth investigating. It is not just for cases with DB compaction;
more importantly, the normal (happy) IO path is where it will have the
most impact.
Typically, an L0->L1 compaction will have two primary effects:
1) It will cause large read/write IO traffic to the disk, potentially
impacting other IO taking place if the disk is already saturated.
2) It will block memtable flushes until the compaction finishes. This
means that more and more data will accumulate in the memtables/WAL which
can trigger throttling and eventually stalls if you run out of buffer
space. By default, we allow up to 1GB of writes to WAL/memtables before
writes are fully stalled, but RocksDB will typically throttle writes
before you get to that point. It's possible a larger buffer may allow
you to absorb traffic spikes for longer at the expense of more disk and
memory usage. Ultimately though, if you are hitting throttling, it
means that the DB can't keep up with the WAL ingestion rate.
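As an aside, the 1GB figure is consistent with the stock BlueStore
RocksDB tuning, assuming write_buffer_size=256MiB and
max_write_buffer_number=4 in bluestore_rocksdb_options (check your own
release; the exact defaults can vary). A quick back-of-the-envelope
sketch in Python of that arithmetic, parsing an options-style string:

def memtable_budget(opts):
    # Roughly the total bytes RocksDB buffers in memtables/WAL before
    # writes stall outright, given a bluestore_rocksdb_options-style
    # string (a simplification: L0 file count and pending compaction
    # bytes also factor into real stall decisions).
    kv = dict(item.split("=", 1) for item in opts.split(",") if "=" in item)
    write_buffer_size = int(kv.get("write_buffer_size", 64 * 1024 * 1024))
    max_buffers = int(kv.get("max_write_buffer_number", 2))
    return write_buffer_size * max_buffers

default_opts = "write_buffer_size=268435456,max_write_buffer_number=4"
bigger_opts = "write_buffer_size=268435456,max_write_buffer_number=8"

print(memtable_budget(default_opts) / 2**30)  # 1.0 GiB
print(memtable_budget(bigger_opts) / 2**30)   # 2.0 GiB

Raising these only buys time during a burst: as noted above, if you are
hitting throttling persistently, the DB simply can't keep up with the
WAL ingestion rate.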
Mark
--
Best Regards,
Mark Nelson
Head of Research and Development
Clyso GmbH
p: +49 89 21552391 12 | a: Minnesota, USA
w: https://clyso.com | e: mark.nelson@xxxxxxxxx
We are hiring: https://www.clyso.com/jobs/
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx