Re: poor OSD performance using kernel 3.4 => problem found

Hi Stefan,

Please do share! I was planning on starting out on the wiki and eventually getting these kinds of things into the master docs. If you (and others) have already done testing it would be really interesting to compare experiences. So far I've been just kind of throwing stuff into:

http://ceph.com/wiki/Performance_analysis

In its current form it's pretty inadequate, but I'm hoping to eventually get back to it. A lot of the work I've been doing recently is looking at underlying FS write behavior (specifically seeks) and whether we can get any reasonable improvement through mkfs and mount options.

Mark

On 5/31/12 2:34 AM, Stefan Majer wrote:
Hi,

if Stefan confirms this as a solution, it might be a good idea to collect some performance optimization hints for OSDs at http://ceph.com/docs/master,
probably separated into:

Gigabit Ethernet based deployments
 with Jumbo Frames
 without Jumbo Frames
10 Gigabit Ethernet based deployments
 with Jumbo Frames
 without Jumbo Frames

I can share some of our configurations as well

Greetings
Stefan

On Thu, May 31, 2012 at 9:30 AM, Yehuda Sadeh <yehuda@xxxxxxxxxxx> wrote:

    On Thu, May 31, 2012 at 12:10 AM, Stefan Priebe - Profihost AG
    <s.priebe@xxxxxxxxxxxx> wrote:
    > Hi Mark, Hi Stefan,
    >
    > first thanks for all your help and time.
    >
    > I found the commit which results in this problem, and it is TCP
    > related, but I'm still wondering whether the behaviour of this commit
    > is intended.
    >
    > The commit in question is:
    > git show c43b874d5d714f271b80d4c3f49e05d0cbf51ed2
    > commit c43b874d5d714f271b80d4c3f49e05d0cbf51ed2
    > Author: Jason Wang <jasowang@xxxxxxxxxx>
    > Date:   Thu Feb 2 00:07:00 2012 +0000
    >
    >    tcp: properly initialize tcp memory limits
    >
    >    Commit 4acb4190 tries to fix the using uninitialized value
    >    introduced by commit 3dc43e3,  but it would make the
    >    per-socket memory limits too small.
    >
    >    This patch fixes this and also remove the redundant codes
    >    introduced in 4acb4190.
    >
    >    Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx>
    >    Acked-by: Glauber Costa <glommer@xxxxxxxxxxxxx>
    >    Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
    >
    > diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
    > index 4cb9cd2..7a7724d 100644
    > --- a/net/ipv4/sysctl_net_ipv4.c
    > +++ b/net/ipv4/sysctl_net_ipv4.c
    > @@ -778,7 +778,6 @@ EXPORT_SYMBOL_GPL(net_ipv4_ctl_path);
    >  static __net_init int ipv4_sysctl_init_net(struct net *net)
    >  {
    >        struct ctl_table *table;
    > -       unsigned long limit;
    >
    >        table = ipv4_net_table;
    >        if (!net_eq(net, &init_net)) {
    > @@ -815,11 +814,6 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
    >        net->ipv4.sysctl_rt_cache_rebuild_count = 4;
    >
    >        tcp_init_mem(net);
    > -       limit = nr_free_buffer_pages() / 8;
    > -       limit = max(limit, 128UL);
    > -       net->ipv4.sysctl_tcp_mem[0] = limit / 4 * 3;
    > -       net->ipv4.sysctl_tcp_mem[1] = limit;
    > -       net->ipv4.sysctl_tcp_mem[2] = net->ipv4.sysctl_tcp_mem[0] * 2;
    >
    >        net->ipv4.ipv4_hdr = register_net_sysctl_table(net,
    >                        net_ipv4_ctl_path, table);
    > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
    > index a34f5cf..37755cc 100644
    > --- a/net/ipv4/tcp.c
    > +++ b/net/ipv4/tcp.c
    > @@ -3229,7 +3229,6 @@ __setup("thash_entries=", set_thash_entries);
    >
    >  void tcp_init_mem(struct net *net)
    >  {
    > -       /* Set per-socket limits to no more than 1/128 the pressure threshold */
    >        unsigned long limit = nr_free_buffer_pages() / 8;
    >        limit = max(limit, 128UL);
    >        net->ipv4.sysctl_tcp_mem[0] = limit / 4 * 3;
    > @@ -3298,7 +3297,8 @@ void __init tcp_init(void)
    >        sysctl_max_syn_backlog = max(128, cnt / 256);
    >
    >        tcp_init_mem(&init_net);
    > -       limit = nr_free_buffer_pages() / 8;
    > +       /* Set per-socket limits to no more than 1/128 the pressure threshold */
    > +       limit = nr_free_buffer_pages() << (PAGE_SHIFT - 10);
    >        limit = max(limit, 128UL);
    >        max_share = min(4UL*1024*1024, limit);
    >
    Yeah, this might have affected the TCP performance. Looking at the
    current Linus tree, this function looks more like it did beforehand,
    so it was probably reverted one way or another.

    Yehuda
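
To see what the one-line change in tcp_init() does numerically, here is a rough user-space sketch in Python of the arithmetic from the quoted diff. It is not kernel code. PAGE_SHIFT = 12 (4 KiB pages) and the free-page count are assumptions, and the function names are made up for illustration. Whatever tcp_init() does with max_share afterwards is not shown in the diff, so it is left out here.

```python
# User-space model of the limit arithmetic in the quoted diff.
# Assumption: 4 KiB pages, i.e. PAGE_SHIFT = 12 (not stated in the thread).
PAGE_SHIFT = 12
FOUR_MIB = 4 * 1024 * 1024


def tcp_mem_limits(free_buffer_pages):
    """Model tcp_init_mem(): returns tcp_mem[0..2], in pages."""
    limit = max(free_buffer_pages // 8, 128)
    low = limit // 4 * 3
    return (low, limit, low * 2)


def max_share_before(free_buffer_pages):
    """tcp_init() before the commit: limit is a page count divided by 8."""
    limit = max(free_buffer_pages // 8, 128)
    return min(FOUR_MIB, limit)


def max_share_after(free_buffer_pages, page_shift=PAGE_SHIFT):
    """tcp_init() after the commit: pages << (PAGE_SHIFT - 10)."""
    limit = max(free_buffer_pages << (page_shift - 10), 128)
    return min(FOUR_MIB, limit)


if __name__ == "__main__":
    pages = 2 ** 20  # hypothetical value for nr_free_buffer_pages()
    print("tcp_mem (pages):  ", tcp_mem_limits(pages))
    print("max_share before: ", max_share_before(pages))
    print("max_share after:  ", max_share_after(pages))
```

With the hypothetical 2**20 free pages, the two variants produce very different max_share values. That is at least consistent with a TCP-related change in behaviour between kernels, though the sketch says nothing about which value suits OSD traffic better.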




--
Stefan Majer

