[also added Jens Axboe] On 12.06.2018 10:27, Boris Brezillon wrote: > On Tue, 12 Jun 2018 10:06:42 +0200 > Stefan Agner <stefan@xxxxxxxx> wrote: > >> On 12.06.2018 02:03, Dmitry Osipenko wrote: >> > On Monday, 11 June 2018 23:52:22 MSK Stefan Agner wrote: >> >> Add support for the NAND flash controller found on NVIDIA >> >> Tegra 2 SoCs. This implementation does not make use of the >> >> command queue feature. Regular operations/data transfers are >> >> done in PIO mode. Page read/writes with hardware ECC make >> >> use of the DMA for data transfer. >> >> >> >> Signed-off-by: Lucas Stach <dev@xxxxxxxxxx> >> >> Signed-off-by: Stefan Agner <stefan@xxxxxxxx> >> >> --- >> >> MAINTAINERS | 7 + >> >> drivers/mtd/nand/raw/Kconfig | 6 + >> >> drivers/mtd/nand/raw/Makefile | 1 + >> >> drivers/mtd/nand/raw/tegra_nand.c | 1248 +++++++++++++++++++++++++++++ >> >> 4 files changed, 1262 insertions(+) >> >> create mode 100644 drivers/mtd/nand/raw/tegra_nand.c >> >> >> [snip] >> >> +static int tegra_nand_cmd(struct nand_chip *chip, >> >> + const struct nand_subop *subop) >> >> +{ >> >> + const struct nand_op_instr *instr; >> >> + const struct nand_op_instr *instr_data_in = NULL; >> >> + struct tegra_nand_controller *ctrl = to_tegra_ctrl(chip->controller); >> >> + unsigned int op_id, size = 0, offset = 0; >> >> + bool first_cmd = true; >> >> + u32 reg, cmd = 0; >> >> + int ret; >> >> + >> >> + for (op_id = 0; op_id < subop->ninstrs; op_id++) { >> >> + unsigned int naddrs, i; >> >> + const u8 *addrs; >> >> + u32 addr1 = 0, addr2 = 0; >> >> + >> >> + instr = &subop->instrs[op_id]; >> >> + >> >> + switch (instr->type) { >> >> + case NAND_OP_CMD_INSTR: >> >> + if (first_cmd) { >> >> + cmd |= COMMAND_CLE; >> >> + writel_relaxed(instr->ctx.cmd.opcode, >> >> + ctrl->regs + CMD_REG1); >> >> + } else { >> >> + cmd |= COMMAND_SEC_CMD; >> >> + writel_relaxed(instr->ctx.cmd.opcode, >> >> + ctrl->regs + CMD_REG2); >> >> + } >> >> + first_cmd = false; >> >> + break; >> >> + case NAND_OP_ADDR_INSTR: >> >> + offset = nand_subop_get_addr_start_off(subop, op_id); >> >> + naddrs = nand_subop_get_num_addr_cyc(subop, op_id); >> >> + addrs = &instr->ctx.addr.addrs[offset]; >> >> + >> >> + cmd |= COMMAND_ALE | COMMAND_ALE_SIZE(naddrs); >> >> + for (i = 0; i < min_t(unsigned int, 4, naddrs); i++) >> >> + addr1 |= *addrs++ << (BITS_PER_BYTE * i); >> >> + naddrs -= i; >> >> + for (i = 0; i < min_t(unsigned int, 4, naddrs); i++) >> >> + addr2 |= *addrs++ << (BITS_PER_BYTE * i); >> >> + writel_relaxed(addr1, ctrl->regs + ADDR_REG1); >> >> + writel_relaxed(addr2, ctrl->regs + ADDR_REG2); >> >> + break; >> >> + >> >> + case NAND_OP_DATA_IN_INSTR: >> >> + size = nand_subop_get_data_len(subop, op_id); >> >> + offset = nand_subop_get_data_start_off(subop, op_id); >> >> + >> >> + cmd |= COMMAND_TRANS_SIZE(size) | COMMAND_PIO | >> >> + COMMAND_RX | COMMAND_A_VALID; >> >> + >> >> + instr_data_in = instr; >> >> + break; >> >> + >> >> + case NAND_OP_DATA_OUT_INSTR: >> >> + size = nand_subop_get_data_len(subop, op_id); >> >> + offset = nand_subop_get_data_start_off(subop, op_id); >> >> + >> >> + cmd |= COMMAND_TRANS_SIZE(size) | COMMAND_PIO | >> >> + COMMAND_TX | COMMAND_A_VALID; >> >> + >> >> + memcpy(®, instr->ctx.data.buf.out + offset, size); >> >> + writel_relaxed(reg, ctrl->regs + RESP); >> >> + >> >> + break; >> >> + case NAND_OP_WAITRDY_INSTR: >> >> + cmd |= COMMAND_RBSY_CHK; >> >> + break; >> >> + >> >> + } >> >> + } >> >> + >> >> + cmd |= COMMAND_GO | COMMAND_CE(ctrl->cur_cs); >> >> + writel_relaxed(cmd, ctrl->regs + COMMAND); >> >> + ret = wait_for_completion_io_timeout(&ctrl->command_complete, >> >> + msecs_to_jiffies(500)); >> > >> > It's not obvious to me whether _io_ variant is appropriate to use here, would >> > be nice if somebody could clarify that. Maybe block/ already does the IO >> > accounting itself and hence the IO time would be counted twice in that case. >> >> Good that you bring this up. >> >> I don't think that there is any higher layer which could take care of >> accounting. Usually, with raw nand there is no block layer involved >> anyway. >> >> In a quick test it seems that only when using wait_for_completion_io I/O >> is properly accounted in the "wait" section of top. >> >> So far only a single driver (omap2) used the _io variant, but I think it >> is the right thing to do! After all, it is I/O... >> >> Boris or any other MTD maintainer, any comment on this? > > Given this definition of io_schedule_timeout() [1] (which is used when > you call wait_for_completion_io_timeout()), I'd say it's not useful to > use the _io_ version, simply because MTD devs are not exposed as blk > devices, and thus don't need the blk_schedule_flush_plug() that is done > is io_schedule_prepare(). But that also means MTD I/Os are not > accounted as I/Os :-(. Documentation of wait_for_completion_io says: "The caller is accounted as waiting for IO (which traditionally means blkio only)." Which sounds as if it using _io is only an accounting thing... The hint about blkio might suggest that there is more to it. Is calling blk_schedule_flush_plug a problem for MTD? https://elixir.bootlin.com/linux/v4.17.1/source/include/linux/blkdev.h#L1355 > > Let's go for the non-io version for now, since all drivers except omap2 > seem to use this function. > I still think it would be nice and "the right thing" from a user perspective... -- Stefan > [1]https://elixir.bootlin.com/linux/v4.17.1/source/kernel/sched/core.c#L5164 -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html