On 2020-09-16 5:44 PM, Andy Shevchenko wrote: > On Tue, Sep 15, 2020 at 06:29:33PM +0200, Cezary Rojewski wrote: >> Implement dsp lifecycle functions such as core RESET and STALL, >> SRAM power control and LP clock selection. This also adds functions for >> handling transport over DW DMA controller. > > Some nit-picks below. FWIW, > Reviewed-by: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx> > ... >> + >> +int catpt_dmac_probe(struct catpt_dev *cdev) >> +{ >> + struct dw_dma_chip *dmac; >> + int ret; >> + >> + dmac = devm_kzalloc(cdev->dev, sizeof(*dmac), GFP_KERNEL); >> + if (!dmac) >> + return -ENOMEM; > >> + dmac->regs = cdev->lpe_ba + >> + cdev->spec->host_dma_offset[CATPT_DMA_DEVID]; > > One line? > It's exactly 81 after one-lining - that's why I kept it as is. >> + dmac->dev = cdev->dev; >> + dmac->irq = cdev->irq; >> + >> + ret = dma_coerce_mask_and_coherent(cdev->dev, DMA_BIT_MASK(31)); >> + if (ret) >> + return ret; >> + /* >> + * Caller is responsible for putting device in D0 to allow >> + * for I/O and memory access before probing DW. >> + */ >> + ret = dw_dma_probe(dmac); >> + if (ret) >> + return ret; >> + >> + cdev->dmac = dmac; >> + return 0; >> +} >> + >> +void catpt_dmac_remove(struct catpt_dev *cdev) >> +{ >> + /* >> + * As do_dma_remove() juggles with pm_runtime_get_xxx() and >> + * pm_runtime_put_xxx() while both ADSP and DW 'devices' are part of >> + * the same module, caller makes sure pm_runtime_disable() is invoked >> + * before removing DW to prevent postmortem resume and suspend. >> + */ >> + dw_dma_remove(cdev->dmac); >> +} >> + >> +static void catpt_dsp_set_srampge(struct catpt_dev *cdev, struct resource *sram, >> + unsigned long mask, unsigned long new) >> +{ >> + unsigned long old; >> + u32 off = sram->start; >> + u32 b = __ffs(mask); >> + >> + old = catpt_readl_pci(cdev, VDRTCTL0) & mask; > >> + dev_dbg(cdev->dev, "SRAMPGE [0x%08lx] 0x%08lx -> 0x%08lx", >> + mask, old, new); > > I saw use of trace points, this looks like non-production leftover. > In case of all dev_dbg(s) you had mentioned here, I've added these intentionally - this is similar to ASoC core logging: entire DAI and board connection process is logged via dev_dbg as there is no need to show these to user by default. Logging SRAMPGE/LPCS and block sanitization is useful for developers. I've given thought to traces on several occasions and ultimately decided to leave this only for IPC communication with ADSP. >> + if (old == new) >> + return; >> + >> + catpt_updatel_pci(cdev, VDRTCTL0, mask, new); >> + /* wait for SRAM power gating to propagate */ >> + udelay(60); >> + >> + /* >> + * Dummy read as the very first access after block enable >> + * to prevent byte loss in future operations. >> + */ >> + for_each_clear_bit_from(b, &new, fls_long(mask)) { >> + u8 buf[4]; >> + >> + /* newly enabled: new bit=0 while old bit=1 */ >> + if (test_bit(b, &old)) { > >> + dev_dbg(cdev->dev, "sanitize block %ld: off 0x%08x\n", >> + b - __ffs(mask), off); > > So does this. > Please see above. >> + memcpy_fromio(buf, cdev->lpe_ba + off, sizeof(buf)); >> + } >> + off += CATPT_MEMBLOCK_SIZE; >> + } >> +} >> + >> +void catpt_dsp_update_srampge(struct catpt_dev *cdev, struct resource *sram, >> + unsigned long mask) >> +{ >> + struct resource *res; >> + unsigned long new = 0; >> + >> + /* flag all busy blocks */ >> + for (res = sram->child; res; res = res->sibling) { >> + u32 h, l; >> + >> + h = (res->end - sram->start) / CATPT_MEMBLOCK_SIZE; >> + l = (res->start - sram->start) / CATPT_MEMBLOCK_SIZE; >> + new |= GENMASK(h, l); > > I think better assembly will be generated with > > (BIT(h - l + 1) - 1) << l > > Looking at the above calculus it seems (needs to be carefully checked!) can be > > u32 bits = DIV_ROUND_UP(resource_size(res), CATPT_MEMBLOCK_SIZE); > u32 shift = (res->start - sram->start) / CATPT_MEMBLOCK_SIZE; > > new |= (BIT(bits) - 1) << shift; > > Note, your approach is also good from readability point of view, so just weight > pros and cons and choose best one. > I'm up for performance change later on - I'm still yet to complete all the missing pieces for FW and SW flow documentation. Code as it is written now helps me during that process. Once docs are done, performance could overtake readability here and there. >> + } >> + >> + /* offset value given mask's start and invert it as ON=b0 */ >> + new = ~(new << __ffs(mask)) & mask; >> + >> + /* disable core clock gating */ >> + catpt_updatel_pci(cdev, VDRTCTL2, CATPT_VDRTCTL2_DCLCGE, 0); >> + >> + catpt_dsp_set_srampge(cdev, sram, mask, new); >> + >> + /* enable core clock gating */ >> + catpt_updatel_pci(cdev, VDRTCTL2, CATPT_VDRTCTL2_DCLCGE, >> + CATPT_VDRTCTL2_DCLCGE); >> +} >> + >> +int catpt_dsp_stall(struct catpt_dev *cdev, bool stall) >> +{ >> + u32 reg, val; >> + >> + val = stall ? CATPT_CS_STALL : 0; >> + catpt_updatel_shim(cdev, CS1, CATPT_CS_STALL, val); >> + >> + return catpt_readl_poll_shim(cdev, CS1, >> + reg, (reg & CATPT_CS_STALL) == val, >> + 500, 10000); >> +} >> + >> +static int catpt_dsp_reset(struct catpt_dev *cdev, bool reset) >> +{ >> + u32 reg, val; >> + >> + val = reset ? CATPT_CS_RST : 0; >> + catpt_updatel_shim(cdev, CS1, CATPT_CS_RST, val); >> + >> + return catpt_readl_poll_shim(cdev, CS1, >> + reg, (reg & CATPT_CS_RST) == val, >> + 500, 10000); >> +} >> + >> +void lpt_dsp_pll_shutdown(struct catpt_dev *cdev, bool enable) >> +{ >> + u32 val; >> + >> + val = enable ? LPT_VDRTCTL0_APLLSE : 0; >> + catpt_updatel_pci(cdev, VDRTCTL0, LPT_VDRTCTL0_APLLSE, val); >> +} >> + >> +void wpt_dsp_pll_shutdown(struct catpt_dev *cdev, bool enable) >> +{ >> + u32 val; >> + >> + val = enable ? WPT_VDRTCTL2_APLLSE : 0; >> + catpt_updatel_pci(cdev, VDRTCTL2, WPT_VDRTCTL2_APLLSE, val); >> +} >> + >> +static int catpt_dsp_select_lpclock(struct catpt_dev *cdev, bool lp, bool waiti) >> +{ >> + u32 mask, reg, val; >> + int ret; >> + >> + mutex_lock(&cdev->clk_mutex); >> + >> + val = lp ? CATPT_CS_LPCS : 0; >> + reg = catpt_readl_shim(cdev, CS1) & CATPT_CS_LPCS; > >> + dev_dbg(cdev->dev, "LPCS [0x%08lx] 0x%08x -> 0x%08x", >> + CATPT_CS_LPCS, reg, val); > > Leftover? > Please see the above (dev_dbg chapter).