Hmm, I wonder if I/we wrongly assumed that the dma_dev used for the hw_max_sectors clamping in __scsi_init_queue() is wrong. So instead of adding a fallback else-clause here or using "sysdev" as dma_dev like in the current upstream code, maybe we should actually do a three-way min: the "changed" hw_max_sectors, dma_max_mapping_size of dma_dev("dev") and dma_max_mapping_size of sysdev...? On Mon, 30 Nov 2020 at 17:48, Hans de Goede <hdegoede@xxxxxxxxxx> wrote: > > Hi, > > On 11/28/20 4:48 PM, Tom Yan wrote: > > Apparently the former (with the chosen dma_dev) may cause problem in certain > > case (e.g. where thunderbolt dock and intel iommu are involved). The error > > observed was: > > > > XHCI swiotlb buffer is full / DMAR: Device bounce map failed > > > > For now we retain the clamp for hw_max_sectors against the dma_max_mapping_size. > > Since the device/size for the clamp that is applied when the scsi request queue > > is initialized/allocated is different than the one used here, we invalidate the > > early clamping by making a fallback blk_queue_max_hw_sectors() call. > > > > Signed-off-by: Tom Yan <tom.ty89@xxxxxxxxx> > > I can confirm that this fixes the network performance on a Lenovo Thunderbolt > dock generation 2, which uses an USB attach NIC. > > With this patch added on top of 5.10-rc5 scp performance to another machine > on the local gbit LAN goes back from the regressed 1 MB/s to its original 100MB/s > as it should be: > > Tested-by: Hans de Goede <hdegoede@xxxxxxxxxx> > > Regards, > > Hans > > > > --- > > drivers/usb/storage/uas.c | 11 +++++++---- > > 1 file changed, 7 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c > > index c8a577309e8f..5db1325cea20 100644 > > --- a/drivers/usb/storage/uas.c > > +++ b/drivers/usb/storage/uas.c > > @@ -843,18 +843,21 @@ static int uas_slave_alloc(struct scsi_device *sdev) > > static int uas_slave_configure(struct scsi_device *sdev) > > { > > struct uas_dev_info *devinfo = sdev->hostdata; > > - struct device *dev = sdev->host->dma_dev; > > + struct usb_device *udev = devinfo->udev; > > > > if (devinfo->flags & US_FL_MAX_SECTORS_64) > > blk_queue_max_hw_sectors(sdev->request_queue, 64); > > else if (devinfo->flags & US_FL_MAX_SECTORS_240) > > blk_queue_max_hw_sectors(sdev->request_queue, 240); > > - else if (devinfo->udev->speed >= USB_SPEED_SUPER) > > + else if (udev->speed >= USB_SPEED_SUPER) > > blk_queue_max_hw_sectors(sdev->request_queue, 2048); > > + else > > + blk_queue_max_hw_sectors(sdev->request_queue, > > + SCSI_DEFAULT_MAX_SECTORS); > > > > blk_queue_max_hw_sectors(sdev->request_queue, > > min_t(size_t, queue_max_hw_sectors(sdev->request_queue), > > - dma_max_mapping_size(dev) >> SECTOR_SHIFT)); > > + dma_max_mapping_size(udev->bus->sysdev) >> SECTOR_SHIFT)); > > > > if (devinfo->flags & US_FL_NO_REPORT_OPCODES) > > sdev->no_report_opcodes = 1; > > @@ -1040,7 +1043,7 @@ static int uas_probe(struct usb_interface *intf, const struct usb_device_id *id) > > shost->can_queue = devinfo->qdepth - 2; > > > > usb_set_intfdata(intf, shost); > > - result = scsi_add_host_with_dma(shost, &intf->dev, udev->bus->sysdev); > > + result = scsi_add_host(shost, &intf->dev); > > if (result) > > goto free_streams; > > > > >