RE: [PATCH for-next 15/16] RDMA/rxe: Add workqueue support for tasks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 19, 2022 12:18 AM Bob Pearson wrote:
> On 10/18/22 03:59, Leon Romanovsky wrote:
> > On Mon, Oct 17, 2022 at 11:33:46PM -0500, Bob Pearson wrote:
> >> Add a third task type RXE_TASK_TYPE_WORKQUEUE to rxe_task.c.
> >
> > Why do you need an extra type and not instead of RXE_TASK_TYPE_TASKLET?
> 
> It performs much better in some settings.
> >
> >>
> >> Signed-off-by: Ian Ziemba <ian.ziemba@xxxxxxx>
> >> Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
> >> ---
> >>  drivers/infiniband/sw/rxe/rxe.c      |  9 ++-
> >>  drivers/infiniband/sw/rxe/rxe_task.c | 84 ++++++++++++++++++++++++++++
> >>  drivers/infiniband/sw/rxe/rxe_task.h | 10 +++-
> >>  3 files changed, 101 insertions(+), 2 deletions(-)
> >
> > <...>
> >
> >> +static struct workqueue_struct *rxe_wq;
> >> +
> >> +int rxe_alloc_wq(void)
> >> +{
> >> +	rxe_wq = alloc_workqueue("rxe_wq", WQ_MEM_RECLAIM |
> >> +				WQ_HIGHPRI | WQ_CPU_INTENSIVE |
> >> +				WQ_SYSFS, WQ_MAX_ACTIVE);
> >
> > Are you sure that all these flags can be justified? WQ_MEM_RECLAIM?
> 
> Not really. CPU intensive is most likely correct. The rest not so much.

I doubt this workqueue executes works in the queued order. If the order is changed
unexpectedly on responder, that may cause a failure or unexpected results.
Perhaps, we should make it equivalent to alloc_ordered_workqueue() as shown below:
=== workqueue.h===
#define alloc_ordered_workqueue(fmt, flags, args...)                    \
        alloc_workqueue(fmt, WQ_UNBOUND | __WQ_ORDERED |                \
                        __WQ_ORDERED_EXPLICIT | (flags), 1, ##args)
===

Daisuke

> >
> >> +
> >> +	if (!rxe_wq)
> >> +		return -ENOMEM;
> >> +
> >> +	return 0;
> >> +}
> >
> > <...>
> >
> >> +static void work_sched(struct rxe_task *task)
> >> +{
> >> +	if (!task->valid)
> >> +		return;
> >> +
> >> +	queue_work(rxe_wq, &task->work);
> >> +}
> >> +
> >> +static void work_do_task(struct work_struct *work)
> >> +{
> >> +	struct rxe_task *task = container_of(work, typeof(*task), work);
> >> +
> >> +	if (!task->valid)
> >> +		return;
> >
> > How can it be that submitted task is not valid? Especially without any
> > locking.
> 
> This and a similar subroutine for tasklets are called deferred and can have a significant
> delay before being called. In the mean time someone could have tried to destroy the QP. The valid
> flag is only cleared by QP destroy code and is not turned back on. Perhaps a rmb().
> >
> >> +
> >> +	do_task(task);
> >> +}
> >
> > Thanks
> >
> >> +





[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux