Re: Question regarding CDC NCM and VNC performance issue

Maciej Żenczykowski <maze@xxxxxxxxxx> · Tue, 12 Dec 2023 17:11:49 -0800

On Tue, Dec 12, 2023 at 1:21 PM Oliver Neukum <oneukum@xxxxxxxx> wrote:
>
> On 12.12.23 21:32, Hiago De Franco wrote:
>
> Hi,
>
> > On Mon, Dec 11, 2023 at 12:44:42PM -0800, Maciej Żenczykowski wrote:
> >> On Mon, Dec 11, 2023 at 12:29 PM Hiago De Franco <hiagofranco@xxxxxxxxx> wrote:
>
> >> Hiago, could you try lowering CDC_NCM_TIMER_PENDING_CNT, if need be all the way to 1?
> >> It is defined in include/linux/usb/cdc_ncm.h as 3 currently
> >> This applies to the host side.
> >
> > On my side CDC_NCM_TIMER_PENDING_CNT is set to 2 by default, did you
> > mean CDC_NCM_RESTART_TIMER_DATAGRAM_CNT?
>
> Yes, I meant that. Sorry.
>
> > Despite of that, I tried to lower both CDC_NCM_TIMER_PENDING_CNT and
> > CDC_NCM_RESTART_TIMER_DATAGRAM_CNT all the way down to 1, first the
> > CDC_NCM_TIMER_PENDING_CNT, then CDC_NCM_RESTART_TIMER_DATAGRAM_CNT and
> > finally both at the same time, but it didn't help.
> >
> > I've also put some printks as following:
> >
> >       ctx->tx_curr_frame_num = n;
> >       printk("hfranco: tx_curr_frame_num = %d", n);
> >
> >       if (n == 0) {
> >               printk("hfranco: n == 0");
> >               /* wait for more frames */
> >               /* push variables */
> >               ctx->tx_curr_skb = skb_out;
> >               goto exit_no_skb;
> >
> >       } else if ((n < ctx->tx_max_datagrams) && (ready2send == 0) && (ctx->timer_interval > 0)) {
> >               printk("hfranco: tx_max_datagrams = %d", ctx->tx_max_datagrams);
> >               printk("hfranco: timer_interval = %d", ctx->timer_interval);
> >               printk("hfranco: n inside else if = %d", n);
> >               /* wait for more frames */
> >               /* push variables */
> >               ctx->tx_curr_skb = skb_out;
> >               /* set the pending count */
> >               if (n < CDC_NCM_RESTART_TIMER_DATAGRAM_CNT)
> >                       ctx->tx_timer_pending = CDC_NCM_TIMER_PENDING_CNT;
> >               goto exit_no_skb;
> >
> >       } else {
> >               printk("hfranco: n inside else = %d", n);
> >               if (n == ctx->tx_max_datagrams)
> >                       ctx->tx_reason_max_datagram++;  /* count reason for transmitting */
> >
> > I ran it on my host PC, compiled it as module for my Debian dekstop, and

Shouldn't you be doing this on the gadget side?
I thought we were thinking it was the gadget transmit timer having issues.

> > this is the dmesg:
> >
> > [ 9663.478807] hfranco: tx_curr_frame_num = 1
> > [ 9663.478816] hfranco: tx_max_datagrams = 40
> > [ 9663.478818] hfranco: timer_interval = 400000
> > [ 9663.478820] hfranco: n inside else if = 1
> > [ 9663.478822] hfranco: timer started
> > [ 9663.479645] hfranco: tx_curr_frame_num = 1
> > [ 9663.479652] hfranco: n inside else = 1
> >
> > And then it basically repeats. Looks like 'n' never passes the 1 value.
> > By tweaking the flags mentioned before, 'n' got a value of 4, but that
> > was the maximum value. I was wondering, why do you think this code looks
> > suspicious? I basically just inserted some printks on the tx side, I
> > will see if I can get something from the rx as well.
>
> If we look at the statistics you initially gathered, we can see that all transmissions
> on the host side happen because the timeout elapses. That, however, does
> _not_ tell us that the host is to blame. We could look at two possible scenarios
>
> A - the gadget is bundling up the packets with too much delay and the host
> just answers to the megatransmissions with one packet and the delay on the host
> is inconsequential
>
> B - the timer on the host runs for too long or sometimes not at all. If that were
> the case that code I pointed out would be most likely to blame
>
> Could I suggest we try to localize the issue? Can you ping the host from the device?
>
>         Regards
>                 Oliver

--
Maciej Żenczykowski, Kernel Networking Developer @ Google