Re: Pacific: parallel PG reads?

On Thu, Nov 11, 2021 at 4:54 AM Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:
>
> Hi,
>
> I'm still trying to combat really bad read performance from HDD-backed
> replicated pools, which is under 100 MB/s most of the time with 1 thread
> and QD=1. I don't quite understand why the reads are that slow, i.e. much
> slower than a single HDD,

That's not "much slower than a single HDD". A modern single HDD can do
streaming reads up to ~150MB/s (the extremely high-density ones can
reach 200MB/s I think now?). But they can only do ~100 independent
reads/s at arbitrarily high queue depths — if you have QD=1, you
probably won't see that many, because the drive relies on a lot of
head scheduling to try and pull average seek times down.
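
For reference, a quick way to check what one of your raw drives can
actually do is something like the fio runs below (reads only, so they're
non-destructive; /dev/sdX is a placeholder for an idle HDD of the same
model as your OSD drives):

  # QD=1 random 4k reads -- expect something on the order of 100 IOPS
  fio --name=hdd-randread --filename=/dev/sdX --rw=randread --bs=4k \
      --iodepth=1 --direct=1 --ioengine=libaio --runtime=60 --time_based

  # QD=1 streaming reads -- expect roughly 150-200 MB/s
  fio --name=hdd-seqread --filename=/dev/sdX --rw=read --bs=1M \
      --iodepth=1 --direct=1 --ioengine=libaio --runtime=60 --time_based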

If you're seeing much higher read rates than that with QD=1, you are
reading from a cache, whether that's the page cache in the Linux
kernel or the drive's own memory cache. Ceph can sometimes serve data
out of caches, but you mostly need to assume it won't do that on its
own.
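
If you want to be sure caches aren't inflating your local numbers, keep
direct=1 in the fio runs above and drop the kernel page cache between
runs (as root):

  sync; echo 3 > /proc/sys/vm/drop_caches

The drive's internal DRAM cache is harder to control from the host, but
with direct I/O and a working set much larger than a few hundred MB it
won't matter for random reads.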

So you are probably not running analogous tests in some way, and if
you evened them out the numbers would get a lot closer. (Ceph will
still not match a single local hard drive, because it adds CPU and
network overheads, but for a hard-drive-bound workload it will get
very close.) If you describe the tests you're running, somebody can
probably help you out. And if you have a specific use case that isn't
performing the way you want, others may have managed it before. :)
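
To make the Ceph side of the comparison apples-to-apples, the simplest
thing is to run the same fio job against a mapped RBD image (the pool
and image names below are placeholders; /dev/rbd0 is whatever device
"rbd map" gives you):

  rbd map testpool/testimg
  fio --name=rbd-randread --filename=/dev/rbd0 --rw=randread --bs=4k \
      --iodepth=1 --direct=1 --ioengine=libaio --runtime=60 --time_based

At QD=1 every read pays a network round trip plus an OSD seek, so don't
expect it to beat the raw drive; the point is that both sides of the
comparison then use the same block size, queue depth, and direct I/O.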

> but do understand that Ceph clients read a PG
> from primary OSD only.
>
> Since reads are immutable, is it possible to make Ceph clients read PG in a
> RAID1-like fashion, i.e. if a PG has a primary OSD and two replicas, is it
> possible to read all 3 OSDs in parallel for a 3x performance gain?

This is not built in to Ceph right now. It wouldn't be infeasible to
make clients do that, but I wouldn't recommend it for reasons others
have mentioned.
-Greg

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



