> On Oct 7, 2015, at 10:39 AM, Sagi Grimberg <sagig@xxxxxxxxxxxxxxxxxx> wrote:
>
> On 10/6/2015 5:59 PM, Chuck Lever wrote:
>> The reply tasklet is fast, but it's single threaded. After reply
>> traffic saturates a single CPU, there's no more reply processing
>> capacity.
>>
>> Replace the tasklet with a workqueue to spread reply handling across
>> all CPUs. This also moves RPC/RDMA reply handling out of the soft
>> IRQ context and into a context that allows sleeps.
>
> Hi Chuck,
>
> I'm probably missing something here, but do you ever schedule in
> the workqueue context? Don't you need to explicitly schedule after
> a jiffie or so the code works also in a non fully preemptable kernel?

Each RPC reply gets its own work request. This is unlike
the tasklet, which continues to run as long as there are
items on xprtrdma’s global tasklet queue.

I can’t think of anything in the current reply handler
that would take longer than a few microseconds to run,
unless there is lock contention on the transport_lock.

wake_up_bit can also be slow sometimes, but it schedules
internally.

Eventually the reply handler will also synchronously
perform R_key invalidation. In that case, I think there
will be an implicit schedule while waiting for the
invalidation to finish.

--
Chuck Lever
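
For illustration only, here is a rough user-space model of the difference
described above. It is not kernel code and not the xprtrdma implementation:
handle_reply, tasklet_style and workqueue_style are hypothetical names, and a
thread pool merely stands in for a workqueue. The point is only that the
tasklet-like loop keeps one context busy until the shared queue is empty,
while the workqueue-like version hands each reply to the pool as its own
short work item.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def handle_reply(reply):
    """Hypothetical stand-in for the per-reply handler (assumed to be short)."""
    return reply * 2


def tasklet_style(queue):
    """One invocation keeps running for as long as the shared queue has items."""
    handled = []
    while queue:
        handled.append(handle_reply(queue.popleft()))
    return handled


def workqueue_style(replies, workers=4):
    """Each reply is submitted as its own work item; any worker may run it."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(handle_reply, r) for r in replies]
        # Collect results in submission order, regardless of which worker ran them.
        return [f.result() for f in futures]
```

In the second form no single work item loops over a backlog, so each one
finishes after handling a single reply, which is the property the answer
above relies on.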