Hi Vinod, On 28/01/2020 14.37, Peter Ujfalusi wrote: > Hi Vinod, > > On 28/01/2020 13.50, Vinod Koul wrote: >> On 28-01-20, 12:15, Peter Ujfalusi wrote: >>> Vinod, >>> >>> On 27/01/2020 15.21, Peter Ujfalusi wrote: >>>> Hi Vinod, >>>> >>>> Based on customer reports we have identified two issues with the UDMA driver: >>>> >>>> TX completion (1st patch): >>>> The scheduled work based workaround for checking for completion worked well for >>>> UART, but it had significant impact on SPI performance. >>>> The underlying issue is coming from the fact that we have split data movement >>>> architecture. >>>> In order to know that the transfer is really done we need to check the remote >>>> end's (PDMA) byte counter. >>>> >>>> RX channel teardown with stale data in PDMA (2nd patch): >>>> If we try to stop the RX DMA channel (teardown) then PDMA is trying to flush the >>>> data is might received from a peripheral, but if UDMA does not have a packet to >>>> use for this draining than it is going to push back on the PDMA and the flush >>>> will never completes. >>>> The workaround is to use a dummy descriptor for flush purposes when the channel >>>> is terminated and we did not have active transfer (no descriptor for UDMA). >>>> This allows UDMA to drain the data and the teardown can complete. >>>> >>>> The last two patch is to use common code to set up the TR parameters for >>>> slave_sg, cyclic and memcpy. The setup code is the same as we used for memcpy >>>> with the change we can handle 4.2GB sg elements and periods in case of cyclic. >>>> It is also nice that we have single function to do the configuration. >>> >>> I have marked these patches as for-next as 5.5 was not released yet. >>> Would it be possible to have these as fixes for 5.6? >> >> Sure but are they really fixes, why cant they go for next release :) >> >> They seem to improve things for sure, but do we want to call them as >> fixes..? > > I would say that the first two patch is a fix: > TX completion check is fixing the performance hit by the early TX > completion workaround which used jiffies+work. > > The second patch is fixing a case when we have stale data during RX and > no active transfer. For example when UART reads 1000 bytes, but the > other end is 'streaming' the data and after the 1000 bytes the UART+PDMA > receives data. > Recovering from this state is not easy and it might not even succeed in > HW level. > > The last two is I agree, it is not fixing much, it does corrects the > slave_sg TR setup (and improves the cyclic as well). > With that I could send the ASoC platform wrapper for UDMA with > period_bytes_max = 4.2GB ;) > I have SZ_512K in there atm, with the old calculation SZ_64K is the > maximum, not a big issue. Actually this also fixes a real bug in the driver for the slave_sg_tr case: if the sg_dma_len(sgent) is not multiple of (burst * dev_width) then we end up with missing bits as the counters are not set up correctly. The client driver which we tested the slave_sg_tr was always giving sg_len == 1 and the buffer was aligned, but when I tuned the client to pass a list, things got broken. > > I think the first two patch is a fix candidate as they fix regression > (albeit regression between the series's) and a real world channel lockup > discovered too late for the initial driver. > > - Péter > > Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. > Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki > - Péter Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki