Re: [PATCH] netfilter: finer grained nf_conn locking

Stephen Hemminger <shemminger@xxxxxxxxxx> · Mon, 30 Mar 2009 13:05:32 -0700



On Mon, 30 Mar 2009 21:57:15 +0200
Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:

> Stephen Hemminger a écrit :
> > On Sat, 28 Mar 2009 17:55:38 +0100
> > Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:
> > 
> >> Eric Dumazet a écrit :
> >>> Patrick McHardy a écrit :
> >>>> Stephen Hemminger wrote:
> >>>>
> >>>>> @@ -50,6 +50,7 @@ struct ip_ct_tcp_state {
> >>>>>  
> >>>>>  struct ip_ct_tcp
> >>>>>  {
> >>>>> +    spinlock_t    lock;
> >>>>>      struct ip_ct_tcp_state seen[2];    /* connection parameters per
> >>>>> direction */
> >>>>>      u_int8_t    state;        /* state of the connection (enum
> >>>>> tcp_conntrack) */
> >>>>>      /* For detecting stale connections */
> >>>> Eric already posted a patch to use an array of locks, which is
> >>>> a better approach IMO since it keeps the size of the conntrack
> >>>> entries down.
> >>> Yes, we probably can use an array for short lived lock sections.
> > 
> > I am not a fan of the array of locks. Sizing it is awkward and
> > it is vulnerable to hash collisions. Let's see if there is another
> > better way.
> 
> On normal machines, (no debugging spinlocks), patch uses an embedded
> spinlock. We probably can use this even on 32bit kernels, considering
> previous patch removed the rcu_head (8 bytes on 32bit arches) from
> nf_conn :)
> 
> if LOCKDEP is on, size of a spinlock is 64 bytes on x86_64.
> Adding a spinlock on each nf_conn would be too expensive. In this
> case, an array of spinlock is a good compromise, as done in
> IP route cache, tcp ehash, ...
> 
> I agree sizing of this hash table is not pretty, and should be
> a generic kernel service (I wanted such service for futexes for example)
> 

IMO having different locking based on lockdep and architecture is an invitation
to future obscure problems. Perhaps some other locking method or shrinking
ct entry would be better.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html