On Sat, 22 Apr 2023 00:46:40 +0300, jarkko@xxxxxxxxxx wrote: > On Fri, Mar 31, 2023 at 02:56:25PM +0800, shaopeijie@xxxxxxxx wrote: >> From: Peijie Shao <shaopeijie@xxxxxxxx> >> >> The TPM's chip select will leave active after spi_bus_unlock when >> flow control timeout, and may interfere other chips sharing the same >> spi bus, or may damage them dule to level conflict on MISO pin. >> >> So the patch deactives the chip select by sending an empty message >> with cs_change=0 if flow control fails. >> >> The reason why flow control timeout for me is unfortunately I got a >> damaged TPM chip. It always pull MISO low during cs active(this can >> be easily emulated by wire MISO to the ground), not responding anything, >> and dmesg shows below: >> ... >> [ 311.150725] tpm_tis_spi: probe of spi0.0 failed with error -110 >> ... > > We don't really cease to support damaged hardware but it is true > that the *software* failure paths should probably be robust enough > to deativate chip select. > > I would rewrite this as > > "The failure paths in tpm_tis_spi_transfer() do not deactivate > chip select. Send an empty message (cs_select == 0) to overcome > this." > > That's all there needs to be. We do not care about broken hardware. > I agree. The patch is to robust *software* path, but not support broken hardware. >> >> Signed-off-by: Peijie Shao <shaopeijie@xxxxxxxx> >> --- >> drivers/char/tpm/tpm_tis_spi_main.c | 13 ++++++++++++- >> 1 file changed, 12 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/char/tpm/tpm_tis_spi_main.c b/drivers/char/tpm/tpm_tis_spi_main.c >> index a0963a3e92bd..5c8ff343761f 100644 >> --- a/drivers/char/tpm/tpm_tis_spi_main.c >> +++ b/drivers/char/tpm/tpm_tis_spi_main.c >> @@ -105,8 +105,19 @@ int tpm_tis_spi_transfer(struct tpm_tis_data *data, u32 addr, u16 len, >> /* Flow control transfers are receive only */ >> spi_xfer.tx_buf = NULL; >> ret = phy->flow_control(phy, &spi_xfer); >> - if (ret < 0) >> + if (ret < 0) { >> + /* >> + * Release cs pin if the device is not responding, regardless of the reason. >> + * Notice cs may alreadly been released if the failure was caused inside >> + * spi_sync_locked called by flow_control, in this situation, a pluse may be >> + * generated on cs. >> + */ > > Please replace above comment with: > > /* Deactivate chip select: */ > I agree. >> + memset(&spi_xfer, 0, sizeof(spi_xfer)); >> + spi_message_init(&m); >> + spi_message_add_tail(&spi_xfer, &m); >> + spi_sync_locked(phy->spi_device, &m); >> goto exit; >> + } >> >> spi_xfer.cs_change = 0; >> spi_xfer.len = transfer_len; >> -- >> 2.39.1 >> >> >> > > There's three call sites, why are you taking care of only one > of them? > > I'd consider instead: > > return 0; > > exit: > memset(&spi_xfer, 0, sizeof(spi_xfer)); > spi_message_init(&m); > spi_message_add_tail(&spi_xfer, &m); > spi_sync_locked(phy->spi_device, &m); > spi_bus_unlock(phy->spi_device->master); > return ret; > } I found that spi_sync_locked() will deactivate cs if any failure is generated inside it. spi_sync_locked() ... spi_transfer_one_message() ... if (ret != 0 || !keep_cs) spi_set_cs(msg->spi, false, false); spi_transfer_one_message() is the default transfer method, I think other spi controllers who implements .transfer_one_message should have the same behaviour. Sending an empty message when cs is already deactivated may have a small side effect: cs will go activate and deactivate in a very short time, means a pulse will be generated on cs pin. This may also happen when failure is generated by spi_sync_locked() which called inside ->flow_control(). So to reduce this, I prefer deactivating cs only when phy->flow_control() fails. If the side effect is totaly acceptable, your advice maybe better. Waiting for your suggestions. > > The the rollback would apply to all call sites. > > BR, Jarkko Thanks! Peijie, Shao