On 26/09/17 00:15, Shawn N wrote: > On Wed, Sep 20, 2017 at 1:22 PM, Shawn N <shawnn@xxxxxxxxxx> wrote: >> On Tue, Sep 19, 2017 at 11:13 PM, Brian Norris <briannorris@xxxxxxxxxxxx> wrote: >>> Hi, >>> >>> On Tue, Sep 19, 2017 at 11:05:38PM -0700, Shawn N wrote: >>>> This is failing because our EC_CMD_GET_PROTOCOL_INFO host command is >>>> getting messed up, or the reply buffer is getting corrupted somehow. >>>> >>>> ec_dev->proto_version = >>>> min(EC_HOST_REQUEST_VERSION, >>>> fls(proto_info->protocol_versions) - 1); >>>> >> >> Checking this closer, the first host command we send after we boot the >> kernel (EC_CMD_GET_PROTOCOL_INFO) is failing due to protocol error >> (see 'SPI rx bad data' / 'SPI not ready' on the EC console). Since >> this doesn't seem to happen on the Chromium OS nyan_big release >> kernel, I suggest to hook up a logic analyzer and see if the SPI >> master is doing something bad. >> >> The error handling in cros_ec_cmd_xfer_spi() is completely wrong and >> we return -EAGAIN / EC_RES_IN_PROGRESS, which the caller interprets >> "the host command was received by the EC and is currently being >> handled, poll status until completion". So the caller polls status >> with EC_CMD_GET_COMMS_STATUS, sees no host command is in progress >> (which is interpreted to mean "the host command I sent previously has >> now successfully completed"), and returns success. The problem here is >> that the initial host command was never received at all, and no reply >> was ever received, so our reply data is all zero. >> >> Two things need to be fixed here: >> >> 1) Find out why the first host command after boot is failing. Probe >> SPI pins and see what's going on. Yes, I will see if I can look into this. >> 2) Fix error handling so we properly return an error (or properly >> retry the entire command) when a protocol error occurs (I made some >> attempt in https://chromium-review.googlesource.com/385080/, probably >> I should revisit that). > > The below patch will fix error handling and will make things mostly > work on nyan_big, because we'll fall back to V2 protocol after the > initial failure. But we should still investigate why we're getting > errors on the first host command. We aren't seeing these errors when > we send commands from firmware, so I suspect something is wrong in > kernel SPI HW initialization that causes the first command to fail. > > From: Shawn Nematbakhsh <shawnn@xxxxxxxxxxxx> > Date: Mon, 25 Sep 2017 14:32:38 -0700 > Subject: [PATCH] mfd: cros ec: spi: Fix "in progress" error signaling > > For host commands that take a long time to process, cros ec can return > early by signaling a EC_RES_IN_PROGRESS result. The host must then poll > status with EC_CMD_GET_COMMS_STATUS until completion of the command. > > None of the above applies when data link errors are encountered. When > errors such as EC_SPI_PAST_END are encountered during command > transmission, it usually means the command was not received by the EC. > Treating such errors as if they were 'EC_RES_IN_PROGRESS' results is > almost always the wrong decision, and can result in host commands > silently being lost. > > Signed-off-by: Shawn Nematbakhsh <shawnn@xxxxxxxxxxxx> > --- > drivers/mfd/cros_ec_spi.c | 26 ++++++++++++-------------- > 1 file changed, 12 insertions(+), 14 deletions(-) > > diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c > index c9714072e224..d33e3847e11e 100644 > --- a/drivers/mfd/cros_ec_spi.c > +++ b/drivers/mfd/cros_ec_spi.c > @@ -377,6 +377,7 @@ static int cros_ec_pkt_xfer_spi(struct > cros_ec_device *ec_dev, > u8 *ptr; > u8 *rx_buf; > u8 sum; > + u8 rx_byte; > int ret = 0, final_ret; > > len = cros_ec_prepare_tx(ec_dev, ec_msg); > @@ -421,25 +422,22 @@ static int cros_ec_pkt_xfer_spi(struct > cros_ec_device *ec_dev, > if (!ret) { > /* Verify that EC can process command */ > for (i = 0; i < len; i++) { > - switch (rx_buf[i]) { > - case EC_SPI_PAST_END: > - case EC_SPI_RX_BAD_DATA: > - case EC_SPI_NOT_READY: > - ret = -EAGAIN; > - ec_msg->result = EC_RES_IN_PROGRESS; > - default: > + rx_byte = rx_buf[i]; > + if (rx_byte == EC_SPI_PAST_END || > + rx_byte == EC_SPI_RX_BAD_DATA || > + rx_byte == EC_SPI_NOT_READY) { > + ret = -EREMOTEIO; > break; > } > - if (ret) > - break; > } > - if (!ret) > - ret = cros_ec_spi_receive_packet(ec_dev, > - ec_msg->insize + sizeof(*response)); > - } else { > - dev_err(ec_dev->dev, "spi transfer failed: %d\n", ret); > } > > + if (!ret) > + ret = cros_ec_spi_receive_packet(ec_dev, > + ec_msg->insize + sizeof(*response)); > + else > + dev_err(ec_dev->dev, "spi transfer failed: %d\n", ret); > + > final_ret = terminate_request(ec_dev); > > spi_bus_unlock(ec_spi->spi->master); > Thanks! Works for me ... Tested-by: Jon Hunter <jonathanh@xxxxxxxxxx> Cheers Jon -- nvpublic -- To unsubscribe from this list: send the line "unsubscribe linux-tegra" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html