Fwd: connect() issues

[Forwarding to linux-sctp]

-------- Original Message --------
Subject: connect() issues
Date: Sun, 31 Aug 2014 12:10:58 -0400
From: Jamal Hadi Salim <jhs@xxxxxxxxxxxx>
To: lksctp-developers@xxxxxxxxxxxxxxxxxxxxx
CC: Vlad Yasevich <vyasevic@xxxxxxxxxx>, Michael Tuexen <Michael.Tuexen@xxxxxxxxxxxxxxxxx>

Folks,

I have attached a small program, written by Michael Tuexen and modified
slightly by me, that demonstrates a memory issue caused by connect().
Sorry, you will need libev to build it (I had to extract the test case
from a large, complex program).

Summary:
=======
There is a kernel issue where each connect() call results in a call to
sctp_association_new(), which allocates memory. An INIT goes out to the
remote and an ABORT comes back, but the allocated memory is never freed.
I thought that because I registered for association events I would get
these events delivered to me, but recvmsg() fails every time and the
socket is never marked readable.

If you run this long enough (24 hours or so), you will see the OOM
killer come in, upset about sctp_association_new():

---
Call Trace:
[<ffffffff80145508>] show_stack+0x68/0x80
[<ffffffff8061e9c8>] dump_header.isra.12+0x78/0x1ac
[<ffffffff801d2358>] oom_kill_process+0x2e8/0x440
[<ffffffff801d2998>] out_of_memory+0x2b8/0x2e8
[<ffffffff801d7084>] __alloc_pages_nodemask+0x774/0x788
[<ffffffff80210c60>] cache_alloc_refill+0x470/0x7b0
[<ffffffff802107c4>] kmem_cache_alloc+0xe4/0x110
[<ffffffffc008a214>] sctp_association_new+0x54/0x688 [sctp]
[<ffffffffc009c92c>] __sctp_connect+0x274/0x618 [sctp]
[<ffffffffc009ce84>] sctp_connect+0x7c/0xe8 [sctp]
[<ffffffff8053d030>] SyS_connect+0xd8/0xf8
[<ffffffff8014a0a4>] handle_sys64+0x44/0x68
-----

I am sorry I don't have time to chase the kernel code
(and will have to work around it in user space in our code).

Longer version:
==============

The attached program initially tries to connect to a server which is not
up yet. At some point the server comes up and all the issues I observe
go away, i.e. the resulting memory consumption goes to zero.

The issue I am about to describe happens on all kernel versions I have
tested (including the latest and all the way back to 2.6.32 running on
a MIPS board).

How to observe the issue:
on xterm 1:
sudo watch "cat /proc/slabinfo | grep -i ^kmalloc-"

on xterm 2:
run the attached program.

On my laptop the pages are 4K, so I see kmalloc-4096 consumption
going up.
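
For anyone who would rather log this from a program than eyeball the
watch output, here is a minimal sketch that pulls the kmalloc-* rows out
of /proc/slabinfo. It assumes the slabinfo 2.x column order (name,
active_objs, num_objs, objsize). The struct and function names are made
up for illustration and are not part of any kernel or library API. Unlike
the grep -i above, the match is case-sensitive.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the first four slabinfo columns. */
struct slab_stat {
	char name[64];
	unsigned long active_objs;
	unsigned long num_objs;
	unsigned long obj_size;
};

/*
 * Parse one /proc/slabinfo line.
 * Returns 1 and fills *out if the line describes a kmalloc-* cache,
 * 0 for header lines, other caches, or lines that do not parse.
 */
int parse_slab_line(const char *line, struct slab_stat *out)
{
	assert(line != NULL && out != NULL);

	if (strncmp(line, "kmalloc-", 8) != 0)
		return 0;
	if (sscanf(line, "%63s %lu %lu %lu", out->name,
		   &out->active_objs, &out->num_objs, &out->obj_size) != 4)
		return 0;
	return 1;
}

/*
 * Print one summary line per kmalloc-* cache found in the file at path.
 * Returns 0 on success, -1 if the file cannot be opened.
 */
int dump_kmalloc_caches(const char *path)
{
	char line[512];
	struct slab_stat s;
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return -1;
	while (fgets(line, sizeof(line), f) != NULL) {
		if (parse_slab_line(line, &s))
			printf("%s: %lu active objects, %lu bytes\n", s.name,
			       s.active_objs, s.active_objs * s.obj_size);
	}
	fclose(f);
	return 0;
}
```

Calling dump_kmalloc_caches("/proc/slabinfo") in a loop (as root, since
the file is normally not world-readable) gives the same picture as the
watch command.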

If you actually want to narrow this down, compile the kernel with
CONFIG_SCTP_DBG_OBJCNT (or you can believe what I am saying below)
and watch /proc/net/sctp/sctp_dbg_objcnt while the program runs:

----
Every 2.0s: sudo cat /proc/net/sctp/sctp_dbg_objcnt     Fri Aug 29 11:34:35 2014
sock: 5
ep: 5
assoc: 279
transport: 1
chunk: 0
bind_addr: 0
bind_bucket: 3
addr: 4
ssnmap: 0
datamsg: 0
------

When I start the server 3-4 minutes later and the two ends talk to each
other, the assoc counter goes back down:

---
Every 2.0s: sudo cat /proc/net/sctp/sctp_dbg_objcnt     Fri Aug 29 11:37:38 2014
sock: 12
ep: 12
assoc: 6
transport: 6
chunk: 0
bind_addr: 0
bind_bucket: 7
addr: 16
ssnmap: 6
datamsg: 0
-------------
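
The same check can be scripted. Below is a minimal sketch, assuming the
one "label: value" pair per line layout shown in the two dumps above.
objcnt_value() is a hypothetical helper written for this note, not an
existing API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up one counter in the text of /proc/net/sctp/sctp_dbg_objcnt.
 * The label must start a line and be followed directly by ':', so
 * asking for "addr" does not match the "bind_addr" line.
 * Returns the counter value, or -1 if the label is not present.
 */
long objcnt_value(const char *text, const char *label)
{
	const char *p = text;
	size_t len;

	assert(text != NULL && label != NULL);
	len = strlen(label);

	while (*p != '\0') {
		if (strncmp(p, label, len) == 0 && p[len] == ':')
			return strtol(p + len + 1, NULL, 10);
		/* skip to the start of the next line */
		p = strchr(p, '\n');
		if (p == NULL)
			break;
		p++;
	}
	return -1;
}
```

Polling this for "assoc" while the client spins on connect() and the
server is down should show the count climbing, as in the first dump.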

cheers,
jamal



/*
 * gcc connect_test.c -lev
*/

/*-
 * Copyright (c) 2014 Michael Tuexen
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
*/
 
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/sctp.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <ev.h>
#include <errno.h>

#define PORT 9
#define FEAPP_PORT 30330
#define ADDR "127.0.0.1"

struct sockaddr_in addr;
int fd;


/* the timer being fired seems to help trigger the issue */
void timeout_cb (EV_P_ ev_timer *w, int revents)
{
	int flags;

	/* make sure the socket is non-blocking before each attempt */
	flags = fcntl(fd, F_GETFL, 0);
	if (fcntl(fd, F_SETFL, flags  | O_NONBLOCK) < 0) {
		perror("fcntl");
	}

	/* each attempt ends up in sctp_association_new() in the kernel */
	if (connect(fd, (const struct sockaddr *)&addr, sizeof(struct sockaddr_in)) < 0) {
		perror("connect");
	}

	/* re-arm the timer so we try again in 0.1 seconds */
	w->repeat = 0.1;
	ev_timer_again(EV_A_ w);
}

int
main(void)
{
	int rc;
	int tr = 1;	/* option value: enable SO_REUSEADDR */
	struct sctp_event_subscribe event;
	ev_timer timeout_watcher;
	struct ev_loop *loop = EV_DEFAULT;

	if ((fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP)) < 0) {
		perror("socket");
		return -1;
	}

	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &tr, sizeof(int)) < 0) {
		perror("SO_REUSEADDR");
		return -1;
	}

	/* subscribe to association change notifications only */
	memset(&event, 0, sizeof(event));
	event.sctp_association_event = 1;

	/* XXX: if you don't subscribe to events all goes well... */
	rc = setsockopt(fd, IPPROTO_SCTP, SCTP_EVENTS, &event, sizeof(event));
	if (rc < 0) {
		perror("SCTP_EVENTS");
		return -1;
	}

	memset(&addr, 0, sizeof(struct sockaddr_in));
	addr.sin_family = AF_INET;
#if defined(__FreeBSD__) || defined(__APPLE__)
	addr.sin_len = sizeof(struct sockaddr_in);
#endif
	addr.sin_port = htons(PORT);
	addr.sin_addr.s_addr = inet_addr(ADDR);


	/* run the callback every 0.1 seconds; ev_timer_init sets both the
	 * initial delay and the repeat interval */
	ev_timer_init (&timeout_watcher, timeout_cb, 0.1, 0.1);
	ev_timer_start (loop, &timeout_watcher);

	ev_run (loop, 0);

	if (close(fd) < 0) {
		perror("close");
	}
	return (0);
}

