>>>>>> Since ~4.20, when the functionfs gadget enabled scatter-gather >>>>>> support, we have seen problems with adb connections stalling and >>>>>> stopping to function on hardware with dwc3 usb controllers. >>>>>> Specifically, HiKey960, Dragonboard 845c, and Pixel3 devices. >>>>> >>>>> Any chance this: >>>>> >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git/commi >>>>> t/ >>>>> ?h=testing/next&id=f63333e8e4fd63d8d8ae83b89d2c38cf21d64801 >>>> This is a different issue. I have tried initializing num_sgs when debugging this adb stall problem, but it didn't help. >>> >>> So multiple folks have run through this problem, but not *one* has tracepoints collected from the issue? C'mon guys. >>> Can someone, please, collect tracepoints so we can figure out what's actually going on? >>> >>> I'm pretty sure this should be solved at the DMA API level, just want to confirm. >> >> I have sent you the tracepoints long time ago. Also my analysis of the >> problem (BTW, I don't think the tracepoints helped much). It's >> basically a logic problem in function dwc3_gadget_ep_reclaim_trb_sg(). > > AFAICT, this is caused by DMA API merging pages together when map an sglist for DMA. While doing that, > it does *not* move the SG_END flag which sg_is_last() checks. > > I consider that an overlook on the DMA API, wouldn't you? Why should DMA API users care if pages were merged or not while mapping the sglist? > We have for_each_sg() and sg_is_last() for a reason. Oops, my bad. Actually, I was talking about the other patch, not the one setting num_sgs = 0; I don't know if this patch is really needed, but from what I remember the DMA API is setting up the num_sgs properly. I agree even if there is a problem initializing num_sgs, it should be fixed in DMA API. > I can try dig into my old emails and resend, but that is a bit hard to find. > > Don't bother, I'm still not convinced we should fix at the driver level when sg_is_last() should be working here, > unless we should iterate over num_sgs instead of num_mapped_sgs, though I don't think that's the case since > in that case we would have to chain buffers of size zero. > -- > balbi