Re: [PATCH] spi: spi-mem: add statistics support to ->exec_op() calls

Théo Lebrun <theo.lebrun@xxxxxxxxxxx> · Wed, 14 Feb 2024 11:59:49 +0100

On Wed Feb 14, 2024 at 10:29 AM CET, Tudor Ambarus wrote:
> On 2/14/24 08:51, Théo Lebrun wrote:
> > On Wed Feb 14, 2024 at 9:00 AM CET, Tudor Ambarus wrote:
> >> On 2/13/24 15:00, Théo Lebrun wrote:
> >>> On Tue Feb 13, 2024 at 1:39 PM CET, Tudor Ambarus wrote:
> >>>>>  /**
> >>>>>   * spi_mem_exec_op() - Execute a memory operation
> >>>>>   * @mem: the SPI memory
> >>>>> @@ -339,8 +383,12 @@ int spi_mem_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
> >>>>>  		 * read path) and expect the core to use the regular SPI
> >>>>>  		 * interface in other cases.
> >>>>>  		 */
> >>>>> -		if (!ret || ret != -ENOTSUPP || ret != -EOPNOTSUPP)
> >>>>> +		if (!ret || ret != -ENOTSUPP || ret != -EOPNOTSUPP) {
> >>>>> +			spi_mem_add_op_stats(ctlr->pcpu_statistics, op, ret);
> >>>>> +			spi_mem_add_op_stats(mem->spi->pcpu_statistics, op, ret);
> >>>>> +
> >>>>
> >>>> It would be good to be able to opt out of the statistics if one wants to.
> >>>>
> >>>> A single write op to a SPI NOR can write at most page_size bytes, which
> >>>> is typically 256 bytes. And since there are SPI NORs that can run at 400
> >>>> MHz, I guess some performance penalty can't be ruled out.
> >>>
> >>> I did my testing on a 40 MHz octal SPI NOR with most reads being much
> >>> bigger than 256 bytes, so I probably didn't have the fastest setup
> >>> indeed.
> >>
> >> yeah, reads are bigger, the entire flash can be read with a single read op.
> >>
> >>>
> >>> What shape would that take? A spi-mem DT prop? New field in the SPI
> >>> statistics sysfs directory?
> >>>
> >>
> >> I think I'd go with a sysfs entry, it provides flexibility. But I guess
> >> we can worry about this if we have some numbers, and I don't have, so
> >> you're fine even without the opt-out option.
> > 
> > Some ftrace numbers:
> > - 48002 calls to spi_mem_add_op_stats();
> > - min 1.053000µs;
> > - avg 1.175652µs;
> > - max 16.272000µs.
> > 
> > Platform is Mobileye EyeQ5. Cores are Imagination Technologies I6500-F. I
> > don't know the precision of our timer but we might be getting close to
> > what is measurable.
> > 
> Thanks.
>
> I took a random SPI NOR flash [1], its page program typical time is 64µs
> according to its SFDP data. We'll have to add here the delay the
> software handling takes.
>
> If you want to play a bit more, you can write the entire flash then
> compare the ftrace numbers of spi_mem_add_op_stats() with spi_nor_write().

It is unclear to me why you are focusing on writes. Won't reads be much
faster in the common case, and therefore be where the stats overhead
shows the most? For cadence-qspi, issuing only command reads (reads below
8 bytes) would be a sort of pathological case.
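
For a rough sense of scale on the write side, here is a back-of-the-envelope
sketch using only the numbers quoted above. It assumes both
spi_mem_add_op_stats() calls from the patch run once per op, and it takes
the 64µs typical page program time as the whole op duration (transfer and
software time are ignored, and counting them would only lower the ratio).
The helper names are made up for illustration and are not part of the
patch.

```c
#include <assert.h>
#include <stdio.h>

/* Figures quoted in this thread. */
#define STATS_CALL_AVG_US	1.175652 /* ftrace avg per spi_mem_add_op_stats() */
#define STATS_CALL_MAX_US	16.272	 /* ftrace max per spi_mem_add_op_stats() */
#define PAGE_PROGRAM_TYP_US	64.0	 /* typical page program time (SFDP) */

/* Assumption: one call for the controller, one for the device, per op. */
#define STATS_CALLS_PER_OP	2

/* Hypothetical helper: stats cost for one op, in microseconds. */
static double stats_cost_per_op_us(double per_call_us)
{
	return STATS_CALLS_PER_OP * per_call_us;
}

/* Hypothetical helper: stats cost as a percentage of the op duration. */
static double stats_overhead_pct(double per_call_us, double op_time_us)
{
	assert(op_time_us > 0.0);
	return 100.0 * stats_cost_per_op_us(per_call_us) / op_time_us;
}

/* Print the average and worst-case ratios for a page program. */
void print_stats_overhead(void)
{
	printf("avg: %.2f us per op, %.2f%% of a page program\n",
	       stats_cost_per_op_us(STATS_CALL_AVG_US),
	       stats_overhead_pct(STATS_CALL_AVG_US, PAGE_PROGRAM_TYP_US));
	printf("max: %.2f us per op, %.2f%% of a page program\n",
	       stats_cost_per_op_us(STATS_CALL_MAX_US),
	       stats_overhead_pct(STATS_CALL_MAX_US, PAGE_PROGRAM_TYP_US));
}
```

Calling print_stats_overhead() from a main() gives the two ratios. The
same helpers can be fed a read duration instead of the page program time
once such a measurement is available.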

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com