On 05.11.2018 12:03, Ben Dooks wrote: > > > On 2018-11-03 12:24, Dmitry Osipenko wrote: >> On 31.10.2018 19:03, Ben Dooks wrote: >>> The tx_status callback does not report the state of the transfer >>> beyond complete segments. This causes problems with users such as >>> ALSA when applications want to know accurately how much data has >>> been moved. >>> >>> This patch addes a function tegra_dma_update_residual() to query >>> the hardware and modify the residual information accordinly. It >>> takes into account any hardware issues when trying to read the >>> state, such as delays between finishing a buffer and signalling >>> the interrupt. >>> >>> Signed-off-by: Ben Dooks <ben.dooks@xxxxxxxxxxxxxxx> >> >> Hello Ben, >> >> Thank you very much for the patch! It fixes "farting sound" for >> chromium-browser and applications that use chromium-engine (tested on >> Tegra20) because apparently it tries to use low latency for everything >> and audio buffer is constantly underflowing without more detailed >> DMA-progress reporting. See couple more comments below. >> >>> --- >>> drivers/dma/tegra20-apb-dma.c | 94 ++++++++++++++++++++++++++++++++--- >>> 1 file changed, 87 insertions(+), 7 deletions(-) >>> >>> diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c >>> index 4f7d1e576d03..3fa3a1ac4f57 100644 >>> --- a/drivers/dma/tegra20-apb-dma.c >>> +++ b/drivers/dma/tegra20-apb-dma.c >>> @@ -802,12 +802,96 @@ static int tegra_dma_terminate_all(struct dma_chan *dc) >>> return 0; >>> } >>> >>> +static unsigned int tegra_dma_update_residual(struct tegra_dma_channel *tdc, >>> + struct tegra_dma_sg_req *sg_req, >>> + struct tegra_dma_desc *dma_desc, >>> + unsigned int residual) >>> +{ >>> + unsigned long status = 0x0; >> >> There is no need to initialize "status" variable. > > ok, will check this. > >>> + unsigned long wcount; >>> + unsigned long ahbptr; >>> + unsigned long tmp = 0x0; >>> + unsigned int result; >>> + int retries = TEGRA_APBDMA_BURST_COMPLETE_TIME * 10; >>> + int done; >>> + >>> + /* if we're not the current request, then don't alter the residual */ >>> + if (sg_req != list_first_entry(&tdc->pending_sg_req, >>> + struct tegra_dma_sg_req, node)) { >>> + result = residual; >>> + ahbptr = 0xffffffff; >>> + goto done; >>> + } >>> + >>> + /* loop until we have a reliable result for residual */ >>> + do { >>> + ahbptr = tdc_read(tdc, TEGRA_APBDMA_CHAN_AHBPTR); >>> + status = tdc_read(tdc, TEGRA_APBDMA_CHAN_STATUS); >> >>> + tmp = tdc_read(tdc, 0x08); /* total count for debug */ >> >> Register 0x08 (DMA_BYTE_STA) doesn't present on Tegra20 and "tmp" >> isn't used anywhere in the code. Please remove it entirely. > > ok, fixed > >>> + >>> + /* check status, if channel isn't busy then skip */ >>> + if (!(status & TEGRA_APBDMA_STATUS_BUSY)) { >>> + result = residual; >>> + break; >>> + } >> >> If "BUSY" is unset, doesn't this mean that transaction could be >> completed already? I don't quite understand why you want to skip here. > > I can't remember what the reasoning behind this was, this work was > originally done over a year ago and I am not sure if I can find any > of the notes from this. > Looks like it should be safe to remove this hunk. Please consider the removal or provide reasoning in the comment to the code if you'll recall it. >>> + >>> + /* if we've got an interrupt pending on the channel, don't >>> + * try and deal with the residue as the hardware has likely >>> + * moved on to the next buffer. return all data moved. >>> + */ >>> + if (status & TEGRA_APBDMA_STATUS_ISE_EOC) { >>> + result = residual - sg_req->req_len; >>> + break; >>> + } >>> + >>> + if (tdc->tdma->chip_data->support_separate_wcount_reg) >>> + wcount = tdc_read(tdc, TEGRA_APBDMA_CHAN_WORD_TRANSFER); >>> + else >>> + wcount = status; >>> + >>> + /* If the request is at the full point, then there is a >>> + * chance that we have read the status register in the >>> + * middle of the hardware reloading the next buffer. >>> + * >>> + * The sequence seems to be at the end of the buffer, to >>> + * load the new word count before raising the EOC flag (or >>> + * changing the ping-pong flag which could have also been >>> + * used to determine a new buffer). This means there is a >> two whitespaces here-----| >> >>> + * small window where we cannot determine zero-done for the >>> + * current buffer, or moved to next buffer. >>> + * >> >>> + * If done shows 0, then retry the load, as it may hit the >>> + * above hardware race. We will either get a new value which >>> + * is from the first buffer, or we get an EOC (new buffer) >>> + * or both a new value and an EOC... >> >> I think we just need to wait 20usec after reading out "words count" >> and then re-check interrupt status, so transfer is done if interrupt >> is set and otherwise "words count" value is actual and reliable. > > At the moment I have no way of going back and re-testing this code > against the problem it originally fixed, so would rather not change > the algorithm in this. Please tell what was the problem. >> >>> + */ >>> + done = get_current_xferred_count(tdc, sg_req, wcount); >>> + if (done != 0) { >>> + result = residual - done; >>> + break; >>> + } >>> + >>> + ndelay(100); >> >> There is no ndelay() on ARM, hence your 20usec timeout is 200usec. >> Please use udelay(). > > I thought there was one based on the lpj calculations, I will check later. > >>> + } while (--retries > 0); >>> + >>> + if (retries <= 0) { >>> + dev_err(tdc2dev(tdc), "timeout waiting for dma load\n"); >>> + result = residual; >>> + } >>> + >>> +done: >> >> Please rename goto label as it duplicates local variable name. > > ok, fixed. > >>> + dev_dbg(tdc2dev(tdc), "residual: req %08lx, ahb@%08lx, wcount %08lx, done %d\n", >>> + sg_req->ch_regs.ahb_ptr, ahbptr, wcount, done); >> >> Whitespace just after tab not needed. >> >>> + >>> + return result; >>> +} >>> + >>> static enum dma_status tegra_dma_tx_status(struct dma_chan *dc, >>> dma_cookie_t cookie, struct dma_tx_state *txstate) >>> { >>> struct tegra_dma_channel *tdc = to_tegra_dma_chan(dc); >>> struct tegra_dma_desc *dma_desc; >>> - struct tegra_dma_sg_req *sg_req; >>> + struct tegra_dma_sg_req *sg_req = NULL; >>> enum dma_status ret; >>> unsigned long flags; >>> unsigned int residual; >>> @@ -843,6 +927,7 @@ static enum dma_status tegra_dma_tx_status(struct dma_chan *dc, >>> residual = dma_desc->bytes_requested - >>> (dma_desc->bytes_transferred % >>> dma_desc->bytes_requested); >>> + residual = tegra_dma_update_residual(tdc, sg_req, dma_desc, residual); Line over 80 characters. Please make sure that "scripts/checkpatch.pl" doesn't have valid complains about the patches. [snip]