Re: Re: [RFC v3 01/11] eventfd: track eventfd_signal() recursion depth separately in different cases

Yongji Xie <xieyongji@xxxxxxxxxxxxx> · Thu, 28 Jan 2021 13:12:29 +0800

On Thu, Jan 28, 2021 at 11:08 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> On 1/27/21 8:04 PM, Jason Wang wrote:
> >
> > On 2021/1/27 下午5:11, Yongji Xie wrote:
> >> On Wed, Jan 27, 2021 at 11:38 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>>
> >>> On 2021/1/20 下午2:52, Yongji Xie wrote:
> >>>> On Wed, Jan 20, 2021 at 12:24 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>>>> On 2021/1/19 下午12:59, Xie Yongji wrote:
> >>>>>> Now we have a global percpu counter to limit the recursion depth
> >>>>>> of eventfd_signal(). This can avoid deadlock or stack overflow.
> >>>>>> But in stack overflow case, it should be OK to increase the
> >>>>>> recursion depth if needed. So we add a percpu counter in eventfd_ctx
> >>>>>> to limit the recursion depth for deadlock case. Then it could be
> >>>>>> fine to increase the global percpu counter later.
> >>>>> I wonder whether or not it's worth to introduce percpu for each eventfd.
> >>>>>
> >>>>> How about simply check if eventfd_signal_count() is greater than 2?
> >>>>>
> >>>> It can't avoid deadlock in this way.
> >>>
> >>> I may miss something but the count is to avoid recursive eventfd call.
> >>> So for VDUSE what we suffers is e.g the interrupt injection path:
> >>>
> >>> userspace write IRQFD -> vq->cb() -> another IRQFD.
> >>>
> >>> It looks like increasing EVENTFD_WAKEUP_DEPTH should be sufficient?
> >>>
> >> Actually I mean the deadlock described in commit f0b493e ("io_uring:
> >> prevent potential eventfd recursion on poll"). It can break this bug
> >> fix if we just increase EVENTFD_WAKEUP_DEPTH.
> >
> >
> > Ok, so can wait do something similar in that commit? (using async stuffs
> > like wq).
>
> io_uring should be fine in current kernels, but aio would still be
> affected by this. But just in terms of recursion, bumping it one more
> should probably still be fine.
>

OK, I see. It should be easy to avoid the A-A deadlock during coding.

Thanks,
Yongji