[PATCH 2.6.21.5-rt17] IPV6: estalished connections are not shown with "cat /proc/net/tcp6"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I found an issue regarding networking in the real-time patch (patch-2.6.21.5-rt17).
The issue happens only with the kernel, which the real-time patch was applied.
However, the latest stable main kernel (2.6.21.5) doesn't have the same issue.
Therefore, please don't transfer this report to netdev.

The detail of issue is below.
I ran my test program, which is a very simple IPv6 client-server program.
These programs establish a TCP/IPv6 connection between two hosts, and then sleep,
like following diagram.
And then, the problem appears with "cat /proc/net/tcp6".

    serverA          serverB
       |      SYN       |
       +--------------->+
       |    SYN/ACK     |
       +<---------------+
       |      ACK       |
       +--------------->+
       |                |
     sleep...         sleep...
       |                |

When I "cat /proc/net/tcp6" on serverA while establishing connection between serverA and B,
the established connections are not shown.
If you need my test program, please let me know. I can provide it to you.

However, in case of the main-line kernel, the established connections will be
shown appropriately with "cat /proc/net/tcp6". It's different because the
real-time patch has implemented a new socket lookup mechanism for a high-latency.
So, real-time patch has a different mechanism from main-line kernel.

The real-time patch, which implemented a new socket lookup mechanism is using
bitmap(ebitmask). When establishing TCP connection, it sets a flag bit into
the bitmap like followings.

[ebitmask in struct inet_hashinfo]
 Before connecting
  0000000000000000000000000000000000000000000000000000000000000000
 After connecting
  0000001000000000000000000000000000000000000000000000000000000000
        ^

And when reading "/proc/net/tcp and tcp6", the kernel searches the currently active
TCP connections with reference to the bitmap.

However, the kernel can't search the active TCP/IPv6 connection in established state.
It is because the kernel doesn't set a flag bit when establishing TCP/IPv6 connection.
In case of TCP/IPv4, __inet_hash() sets the flag bit properly with __inet_hash_setbit().
But, in case of TCP/IPv6, the setting the flag bit is missing in __inet6_hash().

[include/net/inet_hashtables.h]
static inline void __inet_hash(struct inet_hashinfo *hashinfo,
                               struct sock *sk, const int listen_possible)
{
        struct hlist_head *list;
        rwlock_t *lock;
        unsigned long *bitmask = NULL;
        unsigned int index = 0;

        BUG_TRAP(sk_unhashed(sk));
        if (listen_possible && sk->sk_state == TCP_LISTEN) {
                list = &hashinfo->listening_hash[inet_sk_listen_hashfn(sk)];
                lock = &hashinfo->lhash_lock;
                inet_listen_wlock(hashinfo);
        } else {
                struct inet_ehash_bucket *head;
                sk->sk_hash = inet_sk_ehashfn(sk);
                index = inet_ehash_index(hashinfo, sk->sk_hash);
                head = inet_ehash_bucket(hashinfo, sk->sk_hash);
                list = &head->chain;
                lock = &head->lock;
                bitmask = hashinfo->ebitmask;
                write_lock(lock);
        }
        __sk_add_node(sk, list);
        __inet_hash_setbit(bitmask, index);
        sock_prot_inc_use(sk->sk_prot);
        write_unlock(lock);
        if (listen_possible && sk->sk_state == TCP_LISTEN)
                wake_up(&hashinfo->lhash_wait);
}

[net/ipv6/inet6_hashtables.c]
void __inet6_hash(struct inet_hashinfo *hashinfo,
                                struct sock *sk)
{
        struct hlist_head *list;
        rwlock_t *lock;

        printk("__inet6_hash hit\n");

        BUG_TRAP(sk_unhashed(sk));

        if (sk->sk_state == TCP_LISTEN) {
                list = &hashinfo->listening_hash[inet_sk_listen_hashfn(sk)];
                lock = &hashinfo->lhash_lock;
                inet_listen_wlock(hashinfo);
        } else {
                unsigned int hash;
                sk->sk_hash = hash = inet6_sk_ehashfn(sk);
                hash &= (hashinfo->ehash_size - 1);
                list = &hashinfo->ehash[hash].chain;
                lock = &hashinfo->ehash[hash].lock;
                write_lock(lock);
        }

        __sk_add_node(sk, list);
        sock_prot_inc_use(sk->sk_prot);
        write_unlock(lock);
}

So, I suggest a following change.
The change is to set the flag bit appropriately in __inet6_hash().

Signed-off-by: Masayuki Nakagawa <nakagawa.msy@xxxxxxxxxxxxxx>


Index: linus-kernel.git/net/ipv6/inet6_hashtables.c
===================================================================
--- linus-kernel.git.orig/net/ipv6/inet6_hashtables.c
+++ linus-kernel.git/net/ipv6/inet6_hashtables.c
@@ -27,6 +27,8 @@ void __inet6_hash(struct inet_hashinfo *
 {
 	struct hlist_head *list;
 	rwlock_t *lock;
+	unsigned long *bitmask = NULL;
+	unsigned int index = 0;

 	BUG_TRAP(sk_unhashed(sk));

@@ -35,15 +37,16 @@ void __inet6_hash(struct inet_hashinfo *
 		lock = &hashinfo->lhash_lock;
 		inet_listen_wlock(hashinfo);
 	} else {
-		unsigned int hash;
-		sk->sk_hash = hash = inet6_sk_ehashfn(sk);
-		hash &= (hashinfo->ehash_size - 1);
-		list = &hashinfo->ehash[hash].chain;
-		lock = &hashinfo->ehash[hash].lock;
+		sk->sk_hash = inet6_sk_ehashfn(sk);
+		index = inet_ehash_index(hashinfo, sk->sk_hash);
+		list = &hashinfo->ehash[index].chain;
+		lock = &hashinfo->ehash[index].lock;
+		bitmask = hashinfo->ebitmask;
 		write_lock(lock);
 	}

 	__sk_add_node(sk, list);
+	__inet_hash_setbit(bitmask, index);
 	sock_prot_inc_use(sk->sk_prot);
 	write_unlock(lock);
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux