Re: [PATCH 1/1] rpmsg: virtio_rpmsg_bus - prevent possible race condition

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 13, 2023 at 12:10:39PM +0200, Arnaud POULIQUEN wrote:
> 
> 
> On 9/4/23 22:43, Mathieu Poirier wrote:
> > On Mon, Sep 04, 2023 at 03:52:56PM +0200, Arnaud POULIQUEN wrote:
> >> Hello Tim,
> >>
> >> On 9/4/23 10:36, Tim Blechmann wrote:
> >>> when we cannot get a tx buffer (`get_a_tx_buf`) `rpmsg_upref_sleepers`
> >>> enables tx-complete interrupt.
> >>> however if the interrupt is executed after `get_a_tx_buf` and before
> >>> `rpmsg_upref_sleepers` we may mis the tx-complete interrupt and sleep
> >>> for the full 15 seconds.
> >>
> >>
> >> Is there any reason why your co-processor is unable to release the TX RPMSG
> >> buffers for 15 seconds? If not, you should first determine the reason why it is
> >> stalled.
> > 
> > Arnaud's concern is valid.  If the remote processor can't consume a buffer
> > within 15 seconds, something is probably wrong.
> > 
> > That said, I believe your assesment of the situation is correct.  *If* the TX
> > callback is disabled and there is no buffer available, there is a window of
> > opportunity between calls to get_a_tx_buf() and rpmsg_upref_sleepers() for an
> > interrupt to arrive in function rpmsg_send_offchannel_raw().  
> > 
> > From here three things need to happen:
> > 
> > 1) You send another version of this patch with a changelong that uses proper
> > english, i.e capital letters when they are needed and no spelling mistake.
> > 
> > 2) Arnaud confirms our suspicions.
> 
> Seems to me that this patch is useless
> - wait_event_interruptible_timeout() function already seems
> to test the condition (so call get_a_tx_buf()) before entering in sleep[1].
> - ftraces show that vq interrupt is not called during the 15-second period.
>   So it is a normal behavior that the vrp->sendq is never waked-up.
> 

Thanks for looking further into this issue.  I will wait for you guys to get to
the root of the problems.

> Tim needs to analyze the reason why no mailbox interrupt occurs.
> 
> [1]https://elixir.bootlin.com/linux/latest/source/include/linux/wait.h#L534
> 
> 
> > 
> > 3) This patch gets applied when rc1 comes out so that it has 6 or 7 weeks to
> > soak.  No error are locks are reported due to this patch during that time. 
> > 
> >>
> >> Regards,
> >> Arnaud
> >>
> >>>
> >>> in this case, so we re-try once before we really start to sleep
> >>>
> >>> Signed-off-by: Tim Blechmann <tim@xxxxxxxxxx>
> >>> ---
> >>>  drivers/rpmsg/virtio_rpmsg_bus.c | 24 +++++++++++++++---------
> >>>  1 file changed, 15 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/drivers/rpmsg/virtio_rpmsg_bus.c b/drivers/rpmsg/virtio_rpmsg_bus.c
> >>> index 905ac7910c98..2a9d42225e60 100644
> >>> --- a/drivers/rpmsg/virtio_rpmsg_bus.c
> >>> +++ b/drivers/rpmsg/virtio_rpmsg_bus.c
> >>> @@ -587,21 +587,27 @@ static int rpmsg_send_offchannel_raw(struct rpmsg_device *rpdev,
> >>>  
> >>>  	/* no free buffer ? wait for one (but bail after 15 seconds) */
> >>>  	while (!msg) {
> >>>  		/* enable "tx-complete" interrupts, if not already enabled */
> >>>  		rpmsg_upref_sleepers(vrp);
> >>>  
> >>> -		/*
> >>> -		 * sleep until a free buffer is available or 15 secs elapse.
> >>> -		 * the timeout period is not configurable because there's
> >>> -		 * little point in asking drivers to specify that.
> >>> -		 * if later this happens to be required, it'd be easy to add.
> >>> -		 */
> >>> -		err = wait_event_interruptible_timeout(vrp->sendq,
> >>> -					(msg = get_a_tx_buf(vrp)),
> >>> -					msecs_to_jiffies(15000));
> >>> +		/* make sure to retry to grab tx buffer before we start waiting */
> >>> +		msg = get_a_tx_buf(vrp);
> >>> +		if (msg) {
> >>> +			err = 0;
> >>> +		} else {
> >>> +			/*
> >>> +			 * sleep until a free buffer is available or 15 secs elapse.
> >>> +			 * the timeout period is not configurable because there's
> >>> +			 * little point in asking drivers to specify that.
> >>> +			 * if later this happens to be required, it'd be easy to add.
> >>> +			 */
> >>> +			err = wait_event_interruptible_timeout(vrp->sendq,
> >>> +						(msg = get_a_tx_buf(vrp)),
> >>> +						msecs_to_jiffies(15000));
> >>> +		}
> >>>  
> >>>  		/* disable "tx-complete" interrupts if we're the last sleeper */
> >>>  		rpmsg_downref_sleepers(vrp);
> >>>  
> >>>  		/* timeout ? */
> >>>  		if (!err) {



[Index of Archives]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Photo Sharing]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux