RE: [PATCH] staging: dwc2: add microframe scheduler from downstream Pi kernel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> From: Matthijs Kooijman [mailto:matthijs@xxxxxxxx]
> Sent: Thursday, July 18, 2013 10:19 AM
> 
> This seems related to a patch I made last year for the dwc_otg driver.
> On RT3052, we only have 4 host channels, so it was easy to run out of
> them using a 3G stick and a hub. The 3G sticks hog up 2 host channels
> because they keep NAKing their bulk in transfers and 2 for intr
> endpoints, leaving none for any other transfers. I created a patch to
> allow canceling a NAKing host channel, but I suspect that this
> microframe scheduler might help in this case as well, because it allows
> sharing a single host channel for all periodic transfers, I think,
> leaving the other three for bulk transfers. It might still be useful to
> forward port my patch at some point, since that would in theory allow
> multiple blocking endpoints to be used at the same time.

Sure.

> It seems that this patch doesn't include just the microframe scheduling,
> but also the NAK holdoff patch (which was a separate patch in the
> downstream kernel IIRC and seems like a separate feature in any case)
> and other unrelated and unused bits.

Yep. I thought it was part of the microframe scheduler, but I see
now it's not. I will split that into a separate patch.

> Furthermore, I wonder about how this scheduler works exactly. What I see
> is:
>  - the number usecs needed for a single transfer in a periodic qh is
>    calculated
>  - When the qh is scheduled, the first of the 8 microframes with enough
>    usecs available is picked for this qh (disregarding FS transfers that
>    don't fit in 1 microframe for now)
> 
> However, this seems correct only for endpoints with an interval of 8
> microframes? If an isoc endpoint has an interval of 1, it should be
> scheduled in _every_ microframe, right?
> 
> The old code was pessimistic (the dwc2_check_periodic_bandwidth
> essentially assumes an interval of 1 for everything) but the new code is
> actually too optimistic (as it essentially assumes an interval of 8 for
> everything).
> 
> This leads me to believe that this code works because it makes the
> scheduler way to optimistic and because it removes the "one channel per
> periodic endpoint" policy (which is way too optimistic), but it would
> break when adding too much (periodic) devices (in nasty ways, no nice
> "not enough bandwidth" messages).

I can't speak to exactly how it works, because I am not familiar with
the code. I just know it fixes some things on the Raspberry Pi. And it
has been in the released Pi kernel for quite some time.

> Overall, I don't think this patch is not even nearly ready to be
> merged...

Well, I disagree :)

> > @@ -74,6 +74,9 @@ enum dwc2_lx_state {
> >   *                       0 - HNP and SRP capable (default)
> >   *                       1 - SRP Only capable
> >   *                       2 - No HNP/SRP capable
> > + * @otg_ver:            OTG version supported
> > + *                       0 - 1.3
> > + *                       1 - 2.0
> >   * @dma_enable:         Specifies whether to use slave or DMA mode for accessing
> >   *                      the data FIFOs. The driver will automatically detect the
> >   *                      value for this parameter if none is specified.
> > @@ -90,20 +93,10 @@ enum dwc2_lx_state {
> >   *                      the attached device and the value of phy_type.
> >   *                       0 - High Speed (default)
> >   *                       1 - Full Speed
> > - * @host_support_fs_ls_low_power: Specifies whether low power mode is supported
> > - *                      when attached to a Full Speed or Low Speed device in
> > - *                      host mode.
> > - *                       0 - Don't support low power mode (default)
> > - *                       1 - Support low power mode
> > - * @host_ls_low_power_phy_clk: Specifies the PHY clock rate in low power mode
> > - *                      when connected to a Low Speed device in host mode. This
> > - *                      parameter is applicable only if
> > - *                      host_support_fs_ls_low_power is enabled. If phy_type is
> > - *                      set to FS then defaults to 6 MHZ otherwise 48 MHZ.
> > - *                       0 - 48 MHz
> > - *                       1 - 6 MHz
> >   * @enable_dynamic_fifo: 0 - Use coreConsultant-specified FIFO size parameters
> >   *                       1 - Allow dynamic FIFO sizing (default)
> > + * @en_multiple_tx_fifo: Specifies whether dedicated per-endpoint transmit FIFOs
> > + *                      are enabled
> >   * @host_rx_fifo_size:  Number of 4-byte words in the Rx FIFO in host mode when
> >   *                      dynamic FIFO sizing is enabled
> >   *                       16 to 32768 (default 1024)
> > @@ -145,9 +138,19 @@ enum dwc2_lx_state {
> >   *                       0 - No (default)
> >   *                       1 - Yes
> >   * @ulpi_fs_ls:         True to make ULPI phy operate in FS/LS mode only
> > + * @host_support_fs_ls_low_power: Specifies whether low power mode is supported
> > + *                      when attached to a Full Speed or Low Speed device in
> > + *                      host mode.
> > + *                       0 - Don't support low power mode (default)
> > + *                       1 - Support low power mode
> > + * @host_ls_low_power_phy_clk: Specifies the PHY clock rate in low power mode
> > + *                      when connected to a Low Speed device in host mode. This
> > + *                      parameter is applicable only if
> > + *                      host_support_fs_ls_low_power is enabled. If phy_type is
> > + *                      set to FS then defaults to 6 MHZ otherwise 48 MHZ.
> > + *                       0 - 48 MHz
> > + *                       1 - 6 MHz
> >   * @ts_dline:           True to enable Term Select Dline pulsing
> > - * @en_multiple_tx_fifo: Specifies whether dedicated per-endpoint transmit FIFOs
> > - *                      are enabled
> >   * @reload_ctl:         True to allow dynamic reloading of HFIR register during
> >   *                      runtime
> >   * @ahbcfg:             This field allows the default value of the GAHBCFG
> > @@ -155,9 +158,7 @@ enum dwc2_lx_state {
> >   *                       -1         - GAHBCFG value will not be overridden
> >   *                       all others - GAHBCFG value will be overridden with
> >   *                                    this value
> > - * @otg_ver:            OTG version supported
> > - *                       0 - 1.3
> > - *                       1 - 2.0
> 
> The above changes look unrelated to the subject of this patch?

Yes, those are some cleanups to make the ordering of the kernel doc
match the ordering of the struct members. It makes it easier to follow
the code. I will make a separate patch for that.

> > + * @uframe_sched:       True to enable the microframe scheduler
> >   *
> >   * The following parameters may be specified when starting the module. These
> >   * parameters define how the DWC_otg controller should be configured.
> > @@ -191,6 +192,7 @@ struct dwc2_core_params {
> >  	int ts_dline;
> >  	int reload_ctl;
> >  	int ahbcfg;
> > +	int uframe_sched;
> >  };
> >
> >  /**
> > @@ -266,6 +268,7 @@ struct dwc2_core_params {
> >   *                      This value is in microseconds per (micro)frame. The
> >   *                      assumption is that all periodic transfers may occur in
> >   *                      the same (micro)frame.
> > + * @frame_usecs:        Internal variable used by the microframe scheduler
> This could use a real explanation, like:
>  * @frame_usecs:        Number of usecs per microframe that are still
>                         available to schedule periodic transfers in.
> >   * @frame_number:       Frame number read from the core at SOF. The value ranges
> >   *                      from 0 to HFNUM_MAX_FRNUM.
> >   * @periodic_qh_count:  Count of periodic QHs, if using several eps. Used for
> > @@ -278,6 +281,8 @@ struct dwc2_core_params {
> >   *                      host channel is available for non-periodic transactions.
> >   * @non_periodic_channels: Number of host channels assigned to non-periodic
> >   *                      transfers
> > + * @available_host_channels Number of host channels available for the microframe
> > + *                      scheduler to use
> This description looks weird. The scheduler doesn't use host channels,
> it uses frames and usecs. It seems that this really is the number of
> host channels not in use, but I wonder if one can't just check the
> number of channels in the free_hc_list (given the only checks done are <
> 1 and <= 1, checking against the list shouldn't be so hard?)

I don't know about this (not my code).

> >   * @hc_ptr_array:       Array of pointers to the host channel descriptors.
> >   *                      Allows accessing a host channel descriptor given the
> >   *                      host channel number. This is useful in interrupt
> > @@ -292,6 +297,9 @@ struct dwc2_core_params {
> >   * @otg_port:           OTG port number
> >   * @frame_list:         Frame list
> >   * @frame_list_dma:     Frame list DMA address
> > + * @next_sched_frame:   Internal variable used by the microframe scheduler
> > + * @np_count:           Internal variable used by the microframe scheduler
> > + * @np_sent:            Internal variable used by the microframe scheduler
> 
> The next_sched_frame, np_count and np_sent seem to be unused. There is a
> lot of code to set them and keep them correct, but they are never
> actually used for anything. I remember there was some other patch (or
> part of this patch?) which did something with higher priority interrupts
> on ARM, perhaps that's what this was for? Alternatively, I think these
> could be used to disable the SOF interrupt when nothing needs to be
> scheduled soon, but I think that should be a separate patch from this
> one?

next_sched_frame is used, see dwc2_hcd_qh_deactivate(). Looks like
you are right about np_count and np_sent, I will remove those two.

> > @@ -729,22 +735,36 @@ static void dwc2_assign_and_init_hc(struct dwc2_hsotg *hsotg,
> >
> >  	if (list_empty(&qh->qtd_list)) {
> >  		dev_dbg(hsotg->dev, "No QTDs in QH list\n");
> > -		return;
> > +		return -ENOMEM;
> >  	}
> >
> > -	if (list_empty(&hsotg->free_hc_list)) {
> > -		dev_dbg(hsotg->dev, "No free channel to assign\n");
> > -		return;
> > +	/* Find a free host channel with both CHENA and CHDIS clear */
> > +	list_for_each_safe(hc_item, hc_tmp, &hsotg->free_hc_list) {
> > +		chan = list_entry(hc_item, struct dwc2_host_chan,
> > +				  hc_list_entry);
> > +		hcchar = readl(hsotg->regs + HCCHAR(chan->hc_num));
> > +		if ((hcchar & (HCCHAR_CHENA | HCCHAR_CHDIS)) == 0)
> > +			break;
> > +		/* Move channel to end of free list */
> > +		list_del(&chan->hc_list_entry);
> > +		list_add_tail(&chan->hc_list_entry, &hsotg->free_hc_list);
> > +		chan = NULL;
> >  	}
> 
> I think this list could end up in an infinite loop if the free_hc_list
> contains 2 or more channels and all of those have CHENA or CHDIS set.
> Not sure if that is ever possible, but perhaps some kind of guard
> (saving the pointer to the first channel moved to the end of the list
> and breaking on that?) might be useful just in case?

Yes, good idea, I will do that.

> Could you expand on what it means for a channel to have CHENA or CHDIS
> set and why it could even happen for a channel that is in the free list?

I don't know, sorry.

> > -	chan = list_first_entry(&hsotg->free_hc_list, struct dwc2_host_chan,
> > -				hc_list_entry);
> > +	if (!chan) {
> > +		dev_dbg(hsotg->dev, "No free channel to assign\n");
> > +		return -ENOMEM;
> > +	}
> As above, I don't think this can ever happen.

Sure it can. If hsotg->free_hc_list is empty on entry to this
function, for example.

> > -	/* Remove the host channel from the free list */
> > +	/* Remove host channel from free list */
> This looks like a useless change, but ok :-)
> 
> >  	list_del_init(&chan->hc_list_entry);
> >
> >  	qtd = list_first_entry(&qh->qtd_list, struct dwc2_qtd, qtd_list_entry);
> >  	urb = qtd->urb;
> > +	if ((urb->actual_length < 0 || urb->actual_length > urb->length) &&
> > +	    !dwc2_hcd_is_pipe_in(&urb->pipe_info))
> > +		urb->actual_length = urb->length;
> > +
> Hmm, in what cases can this happen? Is this change really related to the
> scheduler?

Don't know.

> > @@ -266,21 +268,23 @@ struct dwc2_qh {
> >  	u8 data_toggle;
> >  	u8 ping_state;
> >  	u8 do_split;
> > -	struct list_head qtd_list;
> > -	struct dwc2_host_chan *channel;
> > +	u8 td_first;
> > +	u8 td_last;
> >  	u16 usecs;
> >  	u16 interval;
> >  	u16 sched_frame;
> > +	u16 nak_frame;
> > +	u16 frame_usecs[8];
> >  	u16 start_split_frame;
> > +	u16 ntd;
> >  	u8 *dw_align_buf;
> >  	dma_addr_t dw_align_buf_dma;
> > +	struct list_head qtd_list;
> > +	struct dwc2_host_chan *channel;
> >  	struct list_head qh_list_entry;
> >  	struct dwc2_hcd_dma_desc *desc_list;
> >  	dma_addr_t desc_list_dma;
> >  	u32 *n_bytes;
> > -	u16 ntd;
> > -	u8 td_first;
> > -	u8 td_last;
> >  	unsigned tt_buffer_dirty:1;
> >  };
> Only the addition of nak_frame and frame_usecs seem relevant to this
> patch, the rest seems noise. And those variables could use some actual
> documentation. I propose:
> 
>  * @nak_frame:          (Micro)frame number in which this qh was
>                         re-queued because a NAK was received. Is 0xffff
> 			when no NAK was received.
>  * @frame_usecs:        Internal variable used by the microframe scheduler

No, its not noise. I took the opportunity while adding the new
members to reorder the existing ones from smallest to largest, to
reduce the amount of wasted space due to structure padding. I think
that's a legitimate thing to do in this patch.

For the rest of your questions/comments, I don't think I have enough
knowledge of the code to answer.

Unless you find some actual problems when running with this patch,
I suggest we merge the rest of this as-is (modulo the things I
agreed to change above). As I said, this has been in the downstream
Pi kernel for quite some time, and it sure seems to improve things
on the Pi. And the driver is in the staging tree, after all.

Plus Gordon from raspberrypi.org has offered their help with
integrating the downstream driver with this one, so once that
begins they should be more willing to respond to questions on the
linux-usb list.

-- 
Paul

_______________________________________________
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxx
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel




[Index of Archives]     [Linux Driver Backports]     [DMA Engine]     [Linux GPIO]     [Linux SPI]     [Video for Linux]     [Linux USB Devel]     [Linux Coverity]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Yosemite Backpacking]
  Powered by Linux