Re: 10/14/2014 Weekly Ceph Performance Meeting

On Thu, Oct 23, 2014 at 5:53 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Thu, 16 Oct 2014, Chen, Xiaoxi wrote:
>> >Hi, indeed qemu uses a single thread per queue.
>> >I think it's not a problem with common storage (nfs, scsi, ...) because they use less cpu resource than librbd, so you can
>> >easily reach > 100000 iops with 1 thread.
>>
>> I think it's a common problem shared by all storage backends (nfs, scsi),
>> but Ceph takes longer to send out an IO (30us), while NFS and scsi are
>> really simple (no crush, no striping, etc.) and may only take 3us to
>> send out an IO, so their upper bound is 10X Ceph's.
>
> I would be very interested in seeing where the CPU time is actually spent.
> I know there is some locking contention in librbd, but with the
> librados changes that layer at least should have a much lower
> overhead.  We also should be preserving mappings for PGs in most cases to
> avoid much time spent in CRUSH.  It would be very interesting to be
> proven wrong, though!

Measuring both should be simple. perf is really great at getting
these kinds of answers quickly. The CPU time part is pretty easy to
measure so I'll skip that. To measure contention you can have perf
watch the futex syscall tracepoint (syscalls:sys_exit_futex, used in
the commands below). That way you capture every time a lock was
contended... and not only which lock, but also which code path.
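
If you want to double-check the tracepoint name on your kernel first
(names occasionally differ between versions), something like this
should list it:

$ perf list 'syscalls:*futex*'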

Make sure you compile the application (or at the very least the
library) with frame pointers, i.e. without gcc's -fomit-frame-pointer
flag (it's the default on x86_64), so perf can walk the call stacks.
Then the perf commands to do that are:

$ perf record -g -e syscalls:sys_exit_futex ./my_appname
$ perf report
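
For reference, the kind of build invocation I mean (just a sketch;
substitute your real sources and build system):

$ gcc -O2 -g -fno-omit-frame-pointer -o my_appname my_appname.c

For autotools/cmake builds, adding -fno-omit-frame-pointer to
CFLAGS/CXXFLAGS accomplishes the same thing.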

You should have an answer to the contention points at the end of your
test. I would offer to help more, but I don't use RBD and have no
experience with those parts.

Make sure you're using a newish kernel / newish perf with support for
tracepoints, and make sure you have the right permissions to use
tracepoint events in perf.
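
On most distros that boils down to something like this (as root; the
exact knobs vary by kernel and distro):

$ sysctl kernel.perf_event_paranoid=-1
$ mount -t debugfs none /sys/kernel/debug    # if not already mounted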

>
> sage



-- 
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: milosz@xxxxxxxxx