On Fri, Aug 24, 2018 at 1:20 AM Stefan Kooman <stefan@xxxxxx> wrote:
Hi Gregory,
Quoting Gregory Farnum (gfarnum@xxxxxxxxxx):
> This is quite strange. Given that you have a log, I think what you want to
> do is find one request in the log, trace it through its lifetime, and see
> where the time is spent. You may find a bifurcation, where some
> categories of requests happen instantly but other categories take a second
> or more; focus on the latter, obviously.
So that is what I did. It turns out it's not the (slow) cache at all, probably
not to your surprise. The reads are actually quite fast: compared to the
kernel client they are ~ 8 ms slower, or ~ 40%. It looks like a couple of
writes / updates, at least to a session file, are slow:
2018-08-23 16:40:25.631 7f79156a8700 10 client.15158830 put_inode on
0x10000693859.head(faked_ino=0 ref=5 ll_ref=1 cap_refs={} open={3=1}
mode=100600 size=0/4194304 nlink=1 btime=2018-08-23 16:40:25.632601
mtime=2018-08-23 16:40:25.632601 ctime=2018-08-23 16:40:25.632601
caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000693859 ts 0/0
objects 0 dirty_or_tx 0]
parents=0x1000068547c.head["sess_ia0agoj01ul4rob7ji55ouca41"]
0x5646ff0e8000)
2018-08-23 16:40:28.547 7f79156a8700 10 client.15158830
update_inode_file_time 0x10000693859.head(faked_ino=0 ref=4 ll_ref=1
cap_refs={} open={3=1} mode=100600 size=0/4194304 nlink=1
btime=2018-08-23 16:40:25.632601 mtime=2018-08-23 16:40:25.632601
ctime=2018-08-23 16:40:25.632601
caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000693859 ts 0/0
objects 0 dirty_or_tx 0]
parents=0x1000068547c.head["sess_ia0agoj01ul4rob7ji55ouca41"]
0x5646ff0e8000) pAsxLsXsxFsxcrwb ctime 2018-08-23 16:40:25.632601 mtime
2018-08-23 16:40:25.632601
Hmm, these aren't actually the start and end times of the same operation. put_inode() is literally adjusting a refcount, which can happen for reasons ranging from the VFS doing something that drops it, to an internal operation completing, to a response coming back from the MDS. You should be able to find requests coming in from the kernel and the responses going back out (the function names will be prefixed with "ll_", e.g. "ll_lookup").
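To pair those up you can usually just eyeball the log, but a quick throwaway script works too. This is only a sketch: it assumes the debug lines look roughly like the excerpts you pasted (date, time, thread, debug level, client.<id>, function name) and it merely reports large gaps between consecutive ll_* entries, which is a crude proxy for a slow request; adjust the regex to whatever your log actually contains.

#!/usr/bin/env python3
# Rough sketch: scan a ceph-fuse debug log for ll_* entries and report large
# gaps between consecutive ones, as a crude way to spot slow requests.
# Assumes lines start with "<date> <time> <thread> <level> client.<id> <func>",
# like the excerpts above; adjust the regex if your format differs.
import re
import sys
from datetime import datetime

LINE_RE = re.compile(
    r'^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+) '
    r'\S+\s+\d+\s+client\.\d+\s+(?P<func>ll_\w+)'
)

prev_ts = prev_func = None
with open(sys.argv[1], errors='replace') as log:
    for line in log:
        m = LINE_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m['date'] + ' ' + m['time'],
                               '%Y-%m-%d %H:%M:%S.%f')
        if prev_ts is not None:
            gap = (ts - prev_ts).total_seconds()
            if gap > 0.5:                      # only report gaps > 500 ms
                print(f'{gap:7.3f}s between {prev_func} and {m["func"]}')
        prev_ts, prev_func = ts, m['func']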
So, almost 3 seconds. The page is only served after this, and possibly after
some cache files have been written. Note though that ceph-fuse is running
with debug=20. Apparently the kernel client is _much_ faster at writing than
ceph-fuse: if I write a file with "dd" (from /dev/urandom) it's in the
tens-of-milliseconds range, not seconds, and atime / ctime changes are
handled in < 5 ms.
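Instead of dd, the comparison could also be scripted so both mounts get exactly
the same workload. A rough Python sketch (the mount paths are just placeholders,
and the 64 KB write size is an arbitrary example):

#!/usr/bin/env python3
# Sketch: run the same small synchronous write on each given mount point and
# print how long it took, e.g. one kernel-client mount and one ceph-fuse mount.
# Usage: write_test.py /mnt/cephfs-kernel /mnt/cephfs-fuse   (paths are examples)
import os
import sys
import time

def timed_write(mountpoint, size=64 * 1024):
    path = os.path.join(mountpoint, 'write-test.tmp')
    data = os.urandom(size)
    start = time.monotonic()
    with open(path, 'wb') as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())   # make sure the data actually leaves the client
    elapsed = time.monotonic() - start
    os.unlink(path)
    return elapsed

for mnt in sys.argv[1:]:
    print(f'{mnt}: {timed_write(mnt) * 1000:.1f} ms')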
I wonder if tuning file-striping [1] with stripe units of 4KB would be
beneficial in this case.
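If I go down that route I'd probably first experiment on a test directory via
the CephFS layout xattrs rather than touching the session directory. A rough
sketch; I haven't verified how small a stripe unit the MDS will actually
accept (4 KB may well be rejected), so 64 KB is used purely as an example:

#!/usr/bin/env python3
# Sketch: set a smaller stripe unit on an (ideally empty) test directory via
# the CephFS layout xattrs; files created underneath will inherit the layout.
# I'm not sure how small a stripe unit the MDS accepts (4 KB may be rejected),
# so 64 KB is used here purely as an example value.
import os
import sys

testdir = sys.argv[1]          # e.g. a fresh directory on the ceph-fuse mount
os.makedirs(testdir, exist_ok=True)

# Only affects files created after the xattr is set, not existing ones.
os.setxattr(testdir, 'ceph.dir.layout.stripe_unit', b'65536')
print(os.getxattr(testdir, 'ceph.dir.layout.stripe_unit'))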
Gr. Stefan
[1]: http://docs.ceph.com/docs/master/dev/file-striping/
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info@xxxxxx