On Mon, 2015-04-27 at 21:05 +0200, Mateusz Guzik wrote: > On Tue, Apr 21, 2015 at 09:59:28PM -0700, Eric Dumazet wrote: > > From: Eric Dumazet <edumazet@xxxxxxxxxx> > > > > Mateusz Guzik reported : > > > > Currently obtaining a new file descriptor results in locking fdtable > > twice - once in order to reserve a slot and second time to fill it. > > > > Holding the spinlock in __fd_install() is needed in case a resize is > > done, or to prevent a resize. > > > > Mateusz provided an RFC patch and a micro benchmark : > > http://people.redhat.com/~mguzik/pipebench.c > > > > A resize is an unlikely operation in a process lifetime, > > as table size is at least doubled at every resize. > > > > We can use RCU instead of the spinlock : > > > > __fd_install() must wait if a resize is in progress. > > > > The resize must block new __fd_install() callers from starting, > > and wait that ongoing install are finished (synchronize_sched()) > > > > resize should be attempted by a single thread to not waste resources. > > > > rcu_sched variant is used, as __fd_install() and expand_fdtable() run > > from process context. > > > > It gives us a ~30% speedup using pipebench with 16 threads, and a ~10% > > speedup with another program using TCP sockets. > > > > Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx> > > Reported-by: Mateusz Guzik <mguzik@xxxxxxxxxx> > > Reviewed-by: Mateusz Guzik <mguzik@xxxxxxxxxx> > Tested-by: Mateusz Guzik <mguzik@xxxxxxxxxx> > Thanks Mateusz I'll send a v2, replacing the READ_ONCE() by more appropriate rcu_dereference_sched() : > > new_fdt->max_fds = NR_OPEN_DEFAULT; > > @@ -553,11 +572,20 @@ void __fd_install(struct files_struct *files, unsigned int fd, > > struct file *file) > > { > > struct fdtable *fdt; > > - spin_lock(&files->file_lock); > > - fdt = files_fdtable(files); > > + > > + rcu_read_lock_sched(); > > + > > + while (unlikely(files->resize_in_progress)) { > > + rcu_read_unlock_sched(); > > + wait_event(files->resize_wait, !files->resize_in_progress); > > + rcu_read_lock_sched(); > > + } > > + /* coupled with smp_wmb() in expand_fdtable() */ > > + smp_rmb(); > > + fdt = READ_ONCE(files->fdt); This should be : fdt = rcu_dereference_sched(files->fdt); > > BUG_ON(fdt->fd[fd] != NULL); > > rcu_assign_pointer(fdt->fd[fd], file); > > - spin_unlock(&files->file_lock); > > + rcu_read_unlock_sched(); > > } > > -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html