On 13.02.2018 14:24, Thierry Reding wrote: > On Mon, Feb 12, 2018 at 08:06:31PM +0300, Dmitry Osipenko wrote: >> In order to reset busy HW properly, memory controller needs to be >> involved, otherwise it possible to get corrupted memory if HW was reset >> during DMA. Introduce memory client 'hot reset' API that will be used >> for resetting busy HW. The primary users are memory clients related to >> video (decoder/encoder/camera) and graphics (2d/3d). >> >> Signed-off-by: Dmitry Osipenko <digetx@xxxxxxxxx> >> --- >> drivers/memory/tegra/mc.c | 249 ++++++++++++++++++++++++++++++++++++++++ >> drivers/memory/tegra/tegra114.c | 25 ++++ >> drivers/memory/tegra/tegra124.c | 32 ++++++ >> drivers/memory/tegra/tegra20.c | 23 ++++ >> drivers/memory/tegra/tegra210.c | 27 +++++ >> drivers/memory/tegra/tegra30.c | 25 ++++ >> include/soc/tegra/mc.h | 77 +++++++++++++ >> 7 files changed, 458 insertions(+) > > As discussed on IRC, I typed up a variant of this in an attempt to fix > an unrelated bug report. The code is here: > > https://github.com/thierryreding/linux/commit/e104e703cb75dfe742b2e051194a0af8ddefcd03 > > I think we can make this without adding a custom API. The reset control > API should work just fine. The above version doesn't take into account > some of Tegra20's quirks, but I think it should still work for Tegra20 > with just slightly different implementations for ->assert() and > ->deassert(). > > A couple of more specific comments below. > >> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c >> index 187a9005351b..9838f588d64d 100644 >> --- a/drivers/memory/tegra/mc.c >> +++ b/drivers/memory/tegra/mc.c >> @@ -7,11 +7,13 @@ >> */ >> >> #include <linux/clk.h> >> +#include <linux/delay.h> >> #include <linux/interrupt.h> >> #include <linux/kernel.h> >> #include <linux/module.h> >> #include <linux/of.h> >> #include <linux/platform_device.h> >> +#include <linux/reset.h> >> #include <linux/slab.h> >> #include <linux/sort.h> >> >> @@ -81,6 +83,172 @@ static const struct of_device_id tegra_mc_of_match[] = { >> }; >> MODULE_DEVICE_TABLE(of, tegra_mc_of_match); >> >> +static int terga_mc_flush_dma(struct tegra_mc *mc, unsigned int id) >> +{ >> + unsigned int hw_id = mc->soc->modules[id].hw_id; >> + u32 value, reg_poll = mc->soc->reg_client_flush_status; >> + int retries = 3; >> + >> + value = mc_readl(mc, mc->soc->reg_client_ctrl); >> + >> + if (mc->soc->tegra20) >> + value &= ~BIT(hw_id); >> + else >> + value |= BIT(hw_id); >> + >> + /* block clients DMA requests */ >> + mc_writel(mc, value, mc->soc->reg_client_ctrl); >> + >> + /* wait for completion of the outstanding DMA requests */ >> + if (mc->soc->tegra20) { >> + while (mc_readl(mc, reg_poll + hw_id * sizeof(u32)) != 0) { >> + if (!retries--) >> + return -EBUSY; >> + >> + usleep_range(1000, 2000); >> + } >> + } else { >> + while ((mc_readl(mc, reg_poll) & BIT(hw_id)) == 0) { >> + if (!retries--) >> + return -EBUSY; >> + >> + usleep_range(1000, 2000); >> + } >> + } >> + >> + return 0; >> +} > > I think this suffers from too much unification. The programming model is > too different to stash this into a single function implementation and as > a result this becomes very difficult to read. In my experience it's more > readable to split this into separate implementations and pass around > pointers to them. > >> + >> +static int terga_mc_unblock_dma(struct tegra_mc *mc, unsigned int id) >> +{ >> + unsigned int hw_id = mc->soc->modules[id].hw_id; >> + u32 value; >> + >> + value = mc_readl(mc, mc->soc->reg_client_ctrl); >> + >> + if (mc->soc->tegra20) >> + value |= BIT(hw_id); >> + else >> + value &= ~BIT(hw_id); >> + >> + mc_writel(mc, value, mc->soc->reg_client_ctrl); >> + >> + return 0; >> +} >> + >> +static int terga_mc_hotreset_assert(struct tegra_mc *mc, unsigned int id) >> +{ >> + unsigned int hw_id = mc->soc->modules[id].hw_id; >> + u32 value; >> + >> + if (mc->soc->tegra20) { >> + value = mc_readl(mc, mc->soc->reg_client_hotresetn); >> + >> + mc_writel(mc, value & ~BIT(hw_id), >> + mc->soc->reg_client_hotresetn); >> + } >> + >> + return 0; >> +} >> + >> +static int terga_mc_hotreset_deassert(struct tegra_mc *mc, unsigned int id) >> +{ >> + unsigned int hw_id = mc->soc->modules[id].hw_id; >> + u32 value; >> + >> + if (mc->soc->tegra20) { >> + value = mc_readl(mc, mc->soc->reg_client_hotresetn); >> + >> + mc_writel(mc, value | BIT(hw_id), >> + mc->soc->reg_client_hotresetn); >> + } >> + >> + return 0; >> +} > > The same goes for these. I think we can do this much more easily by > providing reset controller API ->assert() and ->deassert() > implementations for Tegra20 and Tegra30+, and then register the reset > controller device using the ops stored in the MC SoC structure. > >> +static int tegra_mc_hot_reset_assert(struct tegra_mc *mc, unsigned int id, >> + struct reset_control *rst) >> +{ >> + int err; >> + >> + /* >> + * Block clients DMA requests and wait for completion of the >> + * outstanding requests. >> + */ >> + err = terga_mc_flush_dma(mc, id); >> + if (err) { >> + dev_err(mc->dev, "Failed to flush DMA: %d\n", err); >> + return err; >> + } >> + >> + /* put in reset HW that corresponds to the memory client */ >> + err = reset_control_assert(rst); >> + if (err) { >> + dev_err(mc->dev, "Failed to assert HW reset: %d\n", err); >> + return err; >> + } >> + >> + /* clear the client requests sitting before arbitration */ >> + err = terga_mc_hotreset_assert(mc, id); >> + if (err) { >> + dev_err(mc->dev, "Failed to hot reset client: %d\n", err); >> + return err; >> + } >> + >> + return 0; >> +} > > This is very strictly according to the TRM, but I don't see a reason why > you couldn't stash the DMA blocking and the hot reset into a ->assert() > implementation... > >> + >> +static int tegra_mc_hot_reset_deassert(struct tegra_mc *mc, unsigned int id, >> + struct reset_control *rst) >> +{ >> + int err; >> + >> + /* take out client from hot reset */ >> + err = terga_mc_hotreset_deassert(mc, id); >> + if (err) { >> + dev_err(mc->dev, "Failed to deassert hot reset: %d\n", err); >> + return err; >> + } >> + >> + /* take out from reset corresponding clients HW */ >> + err = reset_control_deassert(rst); >> + if (err) { >> + dev_err(mc->dev, "Failed to deassert HW reset: %d\n", err); >> + return err; >> + } >> + >> + /* allow new DMA requests to proceed to arbitration */ >> + err = terga_mc_unblock_dma(mc, id); >> + if (err) { >> + dev_err(mc->dev, "Failed to unblock client: %d\n", err); >> + return err; >> + } >> + >> + return 0; >> +} > > ... and the hot reset deassertion and the DMA unblocking into a > ->deassert() implementation. > > I think the important part is that DMA is blocked and the requests are > cleared before the module reset, and similarily that the hot reset is > released and DMA is unblocked after the module reset. Okay, let's try the way you're suggesting and see if anything breaks.. >> + >> +static int tegra_mc_hot_reset(struct tegra_mc *mc, unsigned int id, >> + struct reset_control *rst, unsigned long usecs) >> +{ >> + int err; >> + >> + err = tegra_mc_hot_reset_assert(mc, id, rst); >> + if (err) >> + return err; >> + >> + /* make sure that reset is propagated */ >> + if (usecs < 15) >> + udelay(usecs); >> + else >> + usleep_range(usecs, usecs + 500); >> + >> + err = tegra_mc_hot_reset_deassert(mc, id, rst); >> + if (err) >> + return err; >> + >> + return 0; >> +} > > Do you really need this helper? It seems like a marginal gain in terms > of boilerplate while obviously some (or maybe even most?) drivers can't > use this because they need more explicit control over the sequence. > > The only case where I could see this be useful is during some error > recovery mechanism, in which case perhaps a runtime suspend/resume might > be more useful. > >> + >> static int tegra_mc_setup_latency_allowance(struct tegra_mc *mc) >> { >> unsigned long long tick; >> @@ -416,6 +584,7 @@ static int tegra_mc_probe(struct platform_device *pdev) >> return -ENOMEM; >> >> platform_set_drvdata(pdev, mc); >> + mutex_init(&mc->lock); >> mc->soc = match->data; >> mc->dev = &pdev->dev; >> >> @@ -499,6 +668,86 @@ static struct platform_driver tegra_mc_driver = { >> .probe = tegra_mc_probe, >> }; >> >> +static int tegra_mc_match(struct device *dev, void *data) >> +{ >> + return of_match_node(tegra_mc_of_match, dev->of_node) != NULL; >> +} >> + >> +static struct tegra_mc *tegra_mc_find_device(void) >> +{ >> + struct device *dev; >> + >> + dev = driver_find_device(&tegra_mc_driver.driver, NULL, NULL, >> + tegra_mc_match); >> + if (!dev) >> + return NULL; >> + >> + return dev_get_drvdata(dev); >> +} > > Another benefit of the reset controller API is that you don't have to > look up the MC device like this. Instead you can just upcast the pointer > to the reset controller device. > >> + >> +int tegra_memory_client_hot_reset(unsigned int id, struct reset_control *rst, >> + unsigned long usecs) >> +{ >> + struct tegra_mc *mc; >> + int ret; >> + >> + mc = tegra_mc_find_device(); >> + if (!mc) >> + return -ENODEV; >> + >> + if (id >= mc->soc->num_modules || !mc->soc->modules[id].valid) >> + return -EINVAL; > > One of the problems with sparse arrays is that you need to explicitly > mark each of the entries as valid. This is error prone and tedious in > my opinion. > >> + >> + mutex_lock(&mc->lock); >> + ret = tegra_mc_hot_reset(mc, id, rst, usecs); >> + mutex_unlock(&mc->lock); >> + >> + return ret; >> +} >> +EXPORT_SYMBOL_GPL(tegra_memory_client_hot_reset); >> + >> +int tegra_memory_client_hot_reset_assert(unsigned int id, >> + struct reset_control *rst) >> +{ >> + struct tegra_mc *mc; >> + int ret; >> + >> + mc = tegra_mc_find_device(); >> + if (!mc) >> + return -ENODEV; >> + >> + if (id >= mc->soc->num_modules || !mc->soc->modules[id].valid) >> + return -EINVAL; >> + >> + mutex_lock(&mc->lock); >> + ret = tegra_mc_hot_reset_assert(mc, id, rst); >> + mutex_unlock(&mc->lock); >> + >> + return ret; >> +} >> +EXPORT_SYMBOL_GPL(tegra_memory_client_hot_reset_assert); >> + >> +int tegra_memory_client_hot_reset_deassert(unsigned int id, >> + struct reset_control *rst) >> +{ >> + struct tegra_mc *mc; >> + int ret; >> + >> + mc = tegra_mc_find_device(); >> + if (!mc) >> + return -ENODEV; >> + >> + if (id >= mc->soc->num_modules || !mc->soc->modules[id].valid) >> + return -EINVAL; > > There's a lot of repetition in the code here. If you look at my > prototype, I think this is simpler to deal with if you get a reference > to the reset and then just use it. All of the special case handling is > done in the lookup function, and then you get back NULL or a valid > pointer that you can immediately use. > >> + >> + mutex_lock(&mc->lock); >> + ret = tegra_mc_hot_reset_deassert(mc, id, rst); >> + mutex_unlock(&mc->lock); >> + >> + return ret; >> +} >> +EXPORT_SYMBOL_GPL(tegra_memory_client_hot_reset_deassert); >> + >> static int tegra_mc_init(void) >> { >> return platform_driver_register(&tegra_mc_driver); > [...] >> diff --git a/drivers/memory/tegra/tegra124.c b/drivers/memory/tegra/tegra124.c >> index 8b6360eabb8a..135012c74358 100644 >> --- a/drivers/memory/tegra/tegra124.c >> +++ b/drivers/memory/tegra/tegra124.c >> @@ -1012,6 +1012,30 @@ static const struct tegra_smmu_group_soc tegra124_groups[] = { >> }, >> }; >> >> +static const struct tegra_mc_module tegra124_mc_modules[] = { >> + [TEGRA_MEMORY_CLIENT_AFI] = { .hw_id = 0, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_AVP] = { .hw_id = 1, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_DC] = { .hw_id = 2, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_DCB] = { .hw_id = 3, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_HOST1X] = { .hw_id = 6, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_HDA] = { .hw_id = 7, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_ISP2] = { .hw_id = 8, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_MPCORE] = { .hw_id = 9, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_MPCORELP] = { .hw_id = 10, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_MSENC] = { .hw_id = 11, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_PPCS] = { .hw_id = 14, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_SATA] = { .hw_id = 15, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_VDE] = { .hw_id = 16, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_VI] = { .hw_id = 17, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_VIC] = { .hw_id = 18, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_XUSB_HOST] = { .hw_id = 19, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_XUSB_DEV] = { .hw_id = 20, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_TSEC] = { .hw_id = 22, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_SDMMC1] = { .hw_id = 29, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_SDMMC2] = { .hw_id = 30, .valid = true }, >> + [TEGRA_MEMORY_CLIENT_SDMMC3] = { .hw_id = 31, .valid = true }, >> +}; > > This list is incomplete. The same is true for any later generation. > There are also quite a few holes in them. I think a better use of this > space is to make the array compact and instead have more explicit > information in the array. > >> diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h >> index 6cfc1dfa3a40..2d36db3ac659 100644 >> --- a/include/soc/tegra/mc.h >> +++ b/include/soc/tegra/mc.h >> @@ -9,11 +9,13 @@ >> #ifndef __SOC_TEGRA_MC_H__ >> #define __SOC_TEGRA_MC_H__ >> >> +#include <linux/mutex.h> >> #include <linux/types.h> >> >> struct clk; >> struct device; >> struct page; >> +struct reset_control; >> >> struct tegra_smmu_enable { >> unsigned int reg; >> @@ -95,6 +97,11 @@ static inline void tegra_smmu_remove(struct tegra_smmu *smmu) >> } >> #endif >> >> +struct tegra_mc_module { >> + unsigned int hw_id; >> + bool valid; >> +}; >> + >> struct tegra_mc_soc { >> const struct tegra_mc_client *clients; >> unsigned int num_clients; >> @@ -110,6 +117,13 @@ struct tegra_mc_soc { >> const struct tegra_smmu_soc *smmu; >> >> bool tegra20; >> + >> + const struct tegra_mc_module *modules; >> + unsigned int num_modules; >> + >> + u32 reg_client_ctrl; >> + u32 reg_client_hotresetn; >> + u32 reg_client_flush_status; > > That's not enough to cover all clients on Tegra124 and later. We need at > least two registers. I'd also prefer to move away from the assumption > that the ID is somehow linked to the bit position. The ID is in fact > completely arbitrary and in this case only chosen to match the bit > position. > > This has multiple disadvantages, some of which I've already listed. One > of them is that we inevitably make arrays sparse as the SoC evolves, so > we need to work around this in code and data tables using a valid flag. > Checking for validity also becomes non-trivial and we move part of that > burden into drivers. > > I think a much better and robust solution is to completely separate the > ID from the hardware registers. This is implemented in my prototype on > github. The ID is essentially only used as a way to identify the reset > via device tree. The MC driver will use the ID to look up the reset in > its per-SoC table and will only work with the reset object after that. > The object itself contains all the information needed to program the > registers. > > Currently the tables in that implementation don't take into account the > per-client "outstanding requests" registers, but they could be easily > extended to do so. > >> }; >> >> struct tegra_mc { >> @@ -124,9 +138,72 @@ struct tegra_mc { >> >> struct tegra_mc_timing *timings; >> unsigned int num_timings; >> + >> + struct mutex lock; >> }; >> >> void tegra_mc_write_emem_configuration(struct tegra_mc *mc, unsigned long rate); >> unsigned int tegra_mc_get_emem_device_count(struct tegra_mc *mc); >> >> +#define TEGRA_MEMORY_CLIENT_AVP 0 >> +#define TEGRA_MEMORY_CLIENT_DC 1 >> +#define TEGRA_MEMORY_CLIENT_DCB 2 >> +#define TEGRA_MEMORY_CLIENT_EPP 3 >> +#define TEGRA_MEMORY_CLIENT_2D 4 >> +#define TEGRA_MEMORY_CLIENT_HOST1X 5 >> +#define TEGRA_MEMORY_CLIENT_ISP 6 >> +#define TEGRA_MEMORY_CLIENT_MPCORE 7 >> +#define TEGRA_MEMORY_CLIENT_MPCORELP 8 >> +#define TEGRA_MEMORY_CLIENT_MPEA 9 >> +#define TEGRA_MEMORY_CLIENT_MPEB 10 >> +#define TEGRA_MEMORY_CLIENT_MPEC 11 >> +#define TEGRA_MEMORY_CLIENT_3D 12 >> +#define TEGRA_MEMORY_CLIENT_3D1 13 >> +#define TEGRA_MEMORY_CLIENT_PPCS 14 >> +#define TEGRA_MEMORY_CLIENT_VDE 15 >> +#define TEGRA_MEMORY_CLIENT_VI 16 >> +#define TEGRA_MEMORY_CLIENT_AFI 17 >> +#define TEGRA_MEMORY_CLIENT_HDA 18 >> +#define TEGRA_MEMORY_CLIENT_SATA 19 >> +#define TEGRA_MEMORY_CLIENT_MSENC 20 >> +#define TEGRA_MEMORY_CLIENT_VIC 21 >> +#define TEGRA_MEMORY_CLIENT_XUSB_HOST 22 >> +#define TEGRA_MEMORY_CLIENT_XUSB_DEV 23 >> +#define TEGRA_MEMORY_CLIENT_TSEC 24 >> +#define TEGRA_MEMORY_CLIENT_SDMMC1 25 >> +#define TEGRA_MEMORY_CLIENT_SDMMC2 26 >> +#define TEGRA_MEMORY_CLIENT_SDMMC3 27 >> +#define TEGRA_MEMORY_CLIENT_MAX TEGRA_MEMORY_CLIENT_SDMMC3 >> + >> +#define TEGRA_MEMORY_CLIENT_3D0 TEGRA_MEMORY_CLIENT_3D >> +#define TEGRA_MEMORY_CLIENT_MPE TEGRA_MEMORY_CLIENT_MPEA >> +#define TEGRA_MEMORY_CLIENT_NVENC TEGRA_MEMORY_CLIENT_MSENC >> +#define TEGRA_MEMORY_CLIENT_ISP2 TEGRA_MEMORY_CLIENT_ISP >> + >> +#ifdef CONFIG_ARCH_TEGRA >> +int tegra_memory_client_hot_reset(unsigned int id, struct reset_control *rst, >> + unsigned long usecs); >> +int tegra_memory_client_hot_reset_assert(unsigned int id, >> + struct reset_control *rst); >> +int tegra_memory_client_hot_reset_deassert(unsigned int id, >> + struct reset_control *rst); >> +#else > > I *really* don't want yet another custom API. We've had a number of > these in the past and it's always been extremely painful to get rid of > them. And we probably won't ever be able to completely get rid of the > powergate API, which means additional headaches everytime we need to > touch code in that area. I don't want to repeat that mistake. -- To unsubscribe from this list: send the line "unsubscribe linux-tegra" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html