Re: [ceph-users] Ceph GET latency

On Tue, Feb 18, 2014 at 7:24 AM, Guang Yang <yguang11@xxxxxxxxx> wrote:
> Hi ceph-users,
> We are using Ceph (radosgw) to store user-generated images. As GET latency
> is critical for us, I recently did some investigation of the GET path to
> understand where the time is spent.
>
> I first confirmed that the latency comes from the OSD (read op), so we
> instrumented the code to trace the GET request (the read op on the OSD
> side, to be more specific:

How'd you instrument it? Are you aware of the OpTracker system that
records and can output important events?
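
(In case it helps: the OSD admin socket can dump OpTracker state directly.
A quick sketch, assuming the default socket path and that your release has
these commands:

    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_ops_in_flight
    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops

dump_historic_ops reports per-event timestamps for recently completed ops,
which covers much of what you'd otherwise have to instrument by hand.)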

> each object of size [512K + 4M * x] is split into [1 + x] chunks, and each
> chunk needs one read op; e.g., our sub-500KB images are each a single
> chunk). Each read op then goes through the following steps:
>     1. Dispatched and picked up by an op thread (processing not yet
> started).
>              0 - 20 ms:    94%
>             20 - 50 ms:     2%
>             50 - 100 ms:    2%
>             100 ms+:        2%
>          For ops with 20 ms+ latency, half of the time is spent waiting for
> the pg lock (https://github.com/ceph/ceph/blob/dumpling/src/osd/OSD.cc#L7089);
> the other half is yet to be investigated.

The PG lock conflict means that there's something else happening in the
PG at the same time; that's a logical contention issue. However, 20ms is
a long time, so it would be interesting to figure out what else the PG
is doing during that window.
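
(One way to see what else the PG is doing, assuming you can tolerate
verbose logging for a short window, is to bump the OSD debug level at
runtime, e.g.

    ceph tell osd.0 injectargs '--debug-osd 20'

and then grep the OSD log for the PG id; dump_ops_in_flight on the admin
socket should also show any other ops queued on the same PG.)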

>
>     2. Get the file xattr ('-'), which opens the file and populates the fd
> cache (https://github.com/ceph/ceph/blob/dumpling/src/os/FileStore.cc#L230).
>              0 - 20 ms:    80%
>             20 - 50 ms:     8%
>             50 - 100 ms:    7%
>             100 ms+:        5%
>          The latency comes from (in decreasing order): file path lookup
> (https://github.com/ceph/ceph/blob/dumpling/src/os/HashIndex.cc#L294), file
> open, and fd cache lookup/add.
>          Objects are currently stored in level-6 or level-7 folders (due to
> http://tracker.ceph.com/issues/7207, I stopped folder splitting).

FYI, there's been some community and Inktank work recently to try to
speed this up. None of it has been merged into master yet, but we'll
definitely have some improvements here post-Firefly.
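
(In the meantime, a couple of FileStore knobs might be worth experimenting
with; this is only a sketch, so check the option names and defaults against
your release:

    [osd]
    # a larger fd cache means fewer cold open()s and path lookups
    filestore fd cache size = 4096

If I remember correctly, filestore merge threshold and filestore split
multiple control the directory fan-out, though with splitting already
stopped they may not matter here.)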
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

>
>     3. Get more xattrs; this is fast thanks to the fd cache populated in
> step 2 (rarely > 1 ms).
>
>     4. Read the data.
>              0 - 20 ms:    84%
>             20 - 50 ms:    10%
>             50 - 100 ms:    4%
>             100 ms+:        2%
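
(For the 100 ms+ bucket here: cold reads on spinning disks are usually
seek-bound, so it may be worth watching the data disks with something like

    iostat -x 1

while serving GETs; high await/%util on the OSD disks would confirm the
latency is coming from the drives rather than from the OSD process.)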
>
> I decreased vfs_cache_pressure from its default of 100 down to 5 to make
> the VFS favor the dentry/inode cache over the page cache; unfortunately it
> did not help.
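
(A hypothetical explanation for why that didn't help: with this many small
files, the dentry/inode working set may simply not fit in RAM, in which
case no cache-pressure setting can avoid the cold lookups. Something like

    slabtop -o | egrep 'dentry|inode'
    grep -i Slab /proc/meminfo

should show how much memory dentries and inodes actually occupy.)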
>
> Long story short, most of the long-latency read ops come from file system
> calls (for cold data). Since our workload mainly stores objects smaller
> than 500KB, it generates a very large number of small files.
>
> I would like to ask whether people have experienced similar issues, and
> whether there are any suggestions I could try to boost GET performance. On
> the other hand, PUT latency could be sacrificed.
>
> Thanks,
> Guang