Hi Doug, On 07/11/17 17:22, Doug Anderson wrote: > On Tue, Nov 7, 2017 at 3:28 AM, Jon Hunter <jonathanh@xxxxxxxxxx> wrote: >> On 10/10/17 17:52, Doug Anderson wrote: >> >> ... >> >>>> I'm still not clear on why we see an error only on the first >>>> transaction after boot. In this case, the embedded controller >>>> previously handled host commands from firmware just fine, and the >>>> handoff between firmware and the kernel shouldn't be significant, from >>>> the EC point of view. If host command timing is consistent from the >>>> master, I would expect to see some constant error rate, eg. some >>>> chance any host command will fail, rather than the first host command >>>> always failing. >>> >>> The AP itself is often quite busy at boot and so the timings for >>> everything change. That could easily explain the problems. >> >> Sorry for the delay, but I have finally had some time to look at this a >> bit closer. I have been able to track down where the additional delay is >> really needed and seems to explain what is going on. >> >> For starters, the SPI chip-select is under h/w control and so the >> software delay has no impact on the timing between the chip-select >> going active and the transaction starting as I had first thought. >> >> I found that a delay is needed between completing the probe the Tegra >> SPI device and the first SPI transaction issued to the EC. In the Tegra >> SPI probe the SPI controller is programmed for master mode, but at the >> same time it clears the chip-select inactive polarity bit meaning that >> initially the SPI chip-select default to active-high (rather that low >> which seems odd). I believe that this then drives the chip-select low >> (active for the EC) and until it is then configured when spi_setup() is >> called which configures it as active-low for the EC. >> To get the first transaction to work for the EC there needs to be a >> delay after we program the chip-select polarity when spi_setup() is >> called. For example ... >> >> diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c >> index 584367f3a0ed..c1075c3c60c8 100644 >> --- a/drivers/mfd/cros_ec_spi.c >> +++ b/drivers/mfd/cros_ec_spi.c >> @@ -648,6 +648,8 @@ static int cros_ec_spi_probe(struct spi_device *spi) >> if (err < 0) >> return err; >> >> + udelay(100); >> + > > This isn't totally crazy, but actually you could probably do this: > > ec_spi->last_transfer_ns = ktime_get_ns(); > > ...that will leverage already existing code and constants and also > will avoid doing a delay if it wasn't needed. You could also then get > rid of some "if (ec_spi->last_transfer_ns)" tests in the code. I'd > support landing that. > > >> ec_spi = devm_kzalloc(dev, sizeof(*ec_spi), GFP_KERNEL); >> if (ec_spi == NULL) >> return -ENOMEM; >> OK, yes and that does work well too. I will send a patch with the following for review. diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c index 584367f3a0ed..477f8e81dc34 100644 --- a/drivers/mfd/cros_ec_spi.c +++ b/drivers/mfd/cros_ec_spi.c @@ -671,6 +671,7 @@ static int cros_ec_spi_probe(struct spi_device *spi) sizeof(struct ec_response_get_protocol_info); ec_dev->dout_size = sizeof(struct ec_host_request); + ec_spi->last_transfer_ns = ktime_get_ns(); err = cros_ec_register(ec_dev); if (err) { >> You may say why not put a delay in the tegra_spi_setup() itself, but we >> have some other SPI devices such as flash devices which are also use an >> active-low chip-select which don't have any issues with this to date. >> Furthermore, this delay is also probably device specific. >> >> From an EC perspective, if the chip-select is asserted is there a >> turnaround time before it can be asserted again? Or in this case maybe >> the issue is that the chip-select is asserted but no transaction occurs >> before it is de-asserted and so is causing a problem. Please note that a >> delay of around ~50us above still fails but 100us seems to be ok. > > Really nice detective work! > > >> Finally, this also works-around the problem by avoiding the chip-select >> from being pulsed low by defaulting all chip-selects to active-low but >> maybe this just masks the problem. >> >> diff --git a/drivers/spi/spi-tegra114.c b/drivers/spi/spi-tegra114.c >> index a76acedd7e2f..7c18204e61d9 100644 >> --- a/drivers/spi/spi-tegra114.c >> +++ b/drivers/spi/spi-tegra114.c >> @@ -1117,7 +1117,7 @@ static int tegra_spi_probe(struct platform_device >> goto exit_pm_disable; >> } >> - tspi->def_command1_reg = SPI_M_S; >> + tspi->def_command1_reg = SPI_M_S | SPI_CS_POL_INACTIVE_MASK; > > Note that even though I'd support adding some sort of delay to the > cros_ec driver, I'd also say that it _might_ make sense to mess with > the SPI driver too, just to avoid glitching the lines at bootup. As > you said, you shouldn't just willy nilly change the default but it > seems like it could make sense to define an initial (board level) > pinctrl state that's in effect until the first call to setup_bus(). I am thinking about making that above change as well, because the reset value of the chip-select polarity bits default to active-low. For active-high devices I don't think that you can ever avoid the chip-select asserting for a period after reset as that is the default setting, but I don't see why we are clearing these bits in probe. I will do a bit more testing with this to avoid any regressions, but both changes seem worthwhile IMO. Cheers Jon -- nvpublic -- To unsubscribe from this list: send the line "unsubscribe linux-tegra" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html