Re: [ceph-users] fsping, why you no work no mo?

On Thu, Apr 13, 2017 at 6:41 PM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> Dear ceph-*,
>
> A couple of weeks ago I wrote this simple tool to measure the round-trip
> latency of a shared filesystem.
>
>    https://github.com/dvanders/fsping
>
> In our case, the tool is to be run from two clients who mount the same
> CephFS.
>
> First, start the server (a.k.a. the ping reflector) on one machine in a
> CephFS directory:
>
>    ./fsping --server
>
> Then, from another client machine and in the same directory, start the
> fsping client (a.k.a. the ping emitter):
>
>     ./fsping --prefix <prefix from the server above>
>
> The idea is that the "client" writes a syn file, the reflector notices it,
> and writes an ack file. The time for the client to notice the ack file is
> what I call the rtt.
>
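
A minimal sketch of the syn/ack file exchange described above, for readers
who want to see the mechanism spelled out. It is not the actual fsping code:
the function names, the file naming scheme (`<prefix>.syn.<seq>` and
`<prefix>.ack.<seq>`), and the polling approach are all assumptions made for
illustration.

```python
import os
import time


def wait_for(path, timeout, poll=0.001):
    """Poll until `path` exists; return True if it appeared within `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll)
    return os.path.exists(path)


def write_file(path, payload):
    """Write to a temporary name, then rename, so a poller never sees a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    os.rename(tmp, path)


def reflect_once(directory, prefix, seq, timeout=5.0):
    """Reflector side (hypothetical): wait for the syn file, then write the ack file."""
    syn = os.path.join(directory, f"{prefix}.syn.{seq}")
    ack = os.path.join(directory, f"{prefix}.ack.{seq}")
    if not wait_for(syn, timeout):
        return False
    write_file(ack, b"ack")
    return True


def ping_once(directory, prefix, seq, size=64, timeout=5.0):
    """Emitter side (hypothetical): write a syn file of `size` bytes and time the ack.

    Returns the round-trip time in seconds, or None if no ack appeared in time.
    """
    syn = os.path.join(directory, f"{prefix}.syn.{seq}")
    ack = os.path.join(directory, f"{prefix}.ack.{seq}")
    start = time.monotonic()
    write_file(syn, b"\0" * size)
    if not wait_for(ack, timeout):
        return None
    return time.monotonic() - start
```

To measure anything meaningful, `reflect_once` and `ping_once` would run on
two different machines in the same shared directory. Run on a single host
against a local filesystem, the sketch only exercises the logic.
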
> And the output looks like normal ping, so that's neat. (The README.md shows
> a working example)
>
>
> Anyway, two weeks ago when I wrote this, it was working very well on my
> CephFS clusters (running 10.2.5, IIRC). I was seeing ~20ms rtt for small
> files, which is more or less what I was expecting on my test cluster.
>
> But when I run fsping today, it misbehaves in one of two ways:
>
>   1. Most of the time it just hangs, both on the reflector and on the
> emitter. The fsping processes are stuck in some uninterruptible state --
> only an MDS failover breaks them out. I tried with and without
> fuse_disable_pagecache -- no big difference.
>
>   2. When I increase the fsping --size to 512kB, it works a bit more
> reliably. But there is a weird bimodal distribution with most "packets"
> having 20-30ms rtt, some ~20% having ~5-6 seconds rtt, and some ~5% taking
> ~10-11s. I suspected the mds_tick_interval -- but decreasing that didn't
> help.
>
>
> In summary, if someone is curious, please give this tool a try on your
> CephFS cluster -- let me know if it's working or not (and what rtt you can
> achieve with which configuration).
> And perhaps a dev would understand why it is not working with the latest
> jewel ceph-fuse / ceph MDSs?

Yes, this immediately seizes up on my development environment (i.e.
master) and shows up as two blocked requests on the MDS.  We have
broken something...

Opened ticket here: http://tracker.ceph.com/issues/19635

John

> Best Regards,
>
> Dan


