Re: fsping, why you no work no mo?

Hi Dan,

I don't have a solution to the problem; I can only second that we've also been seeing strange behaviour whenever more than one node accesses the same file in CephFS and at least one of them opens it for writing.  With verbose logging on the client (ceph-fuse), it looks like the fuse client sometimes sends a cap request to the MDS and never gets a response.  The client appears to have a roughly 5-second polling interval, and that sometimes (but not always) saves the day, so the client continues after a ~5 second delay.

This does not happen when multiple processes open the file only for reading, but it does when one of them opens it for writing (even if it never writes to the file and only reads afterwards).  I posted some messages to this list a week or two ago describing what we see in more detail, including log output.  My impression is that the issue has something to do with cap requests being lost or miscommunicated between the client and the MDS.
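For reference, the kind of verbose client logging I mean can be enabled on a ceph-fuse mount roughly like this (the mount point, debug levels, and log path below are only illustrative, not necessarily what we used):

   ceph-fuse /mnt/cephfs --debug-client=20 --debug-ms=1 --log-file=/tmp/ceph-fuse.log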

Andras


On 04/13/2017 01:41 PM, Dan van der Ster wrote:
Dear ceph-*,

A couple weeks ago I wrote this simple tool to measure the round-trip latency of a shared filesystem.

   https://github.com/dvanders/fsping

In our case, the tool is meant to be run from two clients that mount the same CephFS.

First, start the server (a.k.a. the ping reflector) on one machine in a CephFS directory:

   ./fsping --server

Then, from another client machine and in the same directory, start the fsping client (a.k.a. the ping emitter):

    ./fsping --prefix <prefix from the server above>

The idea is that the "client" writes a syn file; the reflector notices it and writes an ack file. The time for the client to notice the ack file is what I call the rtt.
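Roughly, the exchange works like the following shell sketch (a hypothetical approximation, not the actual fsping code -- the file names here are made up, and fsping derives the real ones from its --prefix):

   # Reflector side: wait for a syn file to appear, then answer with an ack file.
   while true; do
       if [ -f fsping.syn ]; then rm -f fsping.syn; touch fsping.ack; fi
       sleep 0.01
   done

   # Emitter side (in a separate shell): write a syn file and time how long
   # it takes for the ack file to appear. Requires GNU date for %N.
   start=$(date +%s%N)
   touch fsping.syn
   while [ ! -f fsping.ack ]; do sleep 0.001; done
   echo "rtt: $(( ($(date +%s%N) - start) / 1000000 )) ms"
   rm -f fsping.ack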

And the output looks like normal ping, so that's neat. (The README.md shows a working example)


Anyway, two weeks ago when I wrote this, it was working very well on my CephFS clusters (running 10.2.5, IIRC). I was seeing ~20ms rtt for small files, which is more or less what I was expecting on my test cluster.

But when I run fsping today, it misbehaves in one of two ways:

  1. Most of the time it just hangs, both on the reflector and on the emitter. The fsping processes are stuck in some uninterruptible state -- only an MDS failover breaks them out. I tried with and without fuse_disable_pagecache -- no big difference.

  2. When I increase the fsping --size to 512kB, it works a bit more reliably. But the rtt distribution is strangely multimodal: most "packets" have a 20-30ms rtt, some ~20% take ~5-6 seconds, and some ~5% take ~10-11s. I suspected the mds_tick_interval -- but decreasing it (along the lines shown below) didn't help.
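(For reference, the tick-interval change was something like the following -- the value and the MDS id placeholder are only illustrative:)

   ceph tell mds.<id> injectargs '--mds_tick_interval 2'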


In summary, if someone is curious, please give this tool a try on your CephFS cluster -- let me know if it's working or not (and what rtt you can achieve with which configuration).
And perhaps a dev would understand why it is not working with the latest jewel ceph-fuse / Ceph MDSs?

Best Regards,

Dan




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

