Re: [PATCH] spi: spi-mem: add statistics support to ->exec_op() calls

Tudor Ambarus <tudor.ambarus@xxxxxxxxxx> · Wed, 14 Feb 2024 12:59:31 +0000

On 2/14/24 10:59, Théo Lebrun wrote:
> On Wed Feb 14, 2024 at 10:29 AM CET, Tudor Ambarus wrote:
>> On 2/14/24 08:51, Théo Lebrun wrote:
>>> On Wed Feb 14, 2024 at 9:00 AM CET, Tudor Ambarus wrote:
>>>> On 2/13/24 15:00, Théo Lebrun wrote:
>>>>> On Tue Feb 13, 2024 at 1:39 PM CET, Tudor Ambarus wrote:
>>>>>>>  /**
>>>>>>>   * spi_mem_exec_op() - Execute a memory operation
>>>>>>>   * @mem: the SPI memory
>>>>>>> @@ -339,8 +383,12 @@ int spi_mem_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
>>>>>>>  		 * read path) and expect the core to use the regular SPI
>>>>>>>  		 * interface in other cases.
>>>>>>>  		 */
>>>>>>> -		if (!ret || ret != -ENOTSUPP || ret != -EOPNOTSUPP)
>>>>>>> +		if (!ret || ret != -ENOTSUPP || ret != -EOPNOTSUPP) {
>>>>>>> +			spi_mem_add_op_stats(ctlr->pcpu_statistics, op, ret);
>>>>>>> +			spi_mem_add_op_stats(mem->spi->pcpu_statistics, op, ret);
>>>>>>> +
>>>>>>
>>>>>> Would be good to be able to opt out the statistics if one wants it.
>>>>>>
>>>>>> SPI NORs can write with a single write op maximum page_size bytes, which
>>>>>> is typically 256 bytes. And since there are SPI NORs that can run at 400
>>>>>> MHz, I guess some performance penalty shouldn't be excluded.
>>>>>
>>>>> I did my testing on a 40 MHz octal SPI NOR with most reads being much
>>>>> bigger than 256 bytes, so I probably didn't have the fastest setup
>>>>> indeed.
>>>>
>>>> yeah, reads are bigger, the entire flash can be read with a single read op.
>>>>
>>>>>
>>>>> What shape would that take? A spi-mem DT prop? New field in the SPI
>>>>> statistics sysfs directory?
>>>>>
>>>>
>>>> I think I'd go with a sysfs entry, it provides flexibility. But I guess
>>>> we can worry about this if we have some numbers, and I don't have, so
>>>> you're fine even without the opt-out option.
>>>
>>> Some ftrace numbers:
>>> - 48002 calls to spi_mem_add_op_stats();
>>> - min 1.053000µs;
>>> - avg 1.175652µs;
>>> - max 16.272000µs.
>>>
>>> Platform is Mobileye EyeQ5. Cores are Imagine Technologies I6500-F. I
>>> don't know the precision of our timer but we might be getting close to
>>> what is measurable.
>>>
>> Thanks.
>>
>> I took a random SPI NOR flash [1], its page program typical time is 64µs
>> according to its SFDP data. We'll have to add here the delay the
>> software handling takes.
>>
>> If you want to play a bit more, you can write the entire flash then
>> compare the ftrace numbers of spi_mem_add_op_stats() with spi_nor_write().
> 
> It is unclear to me why you are focusing on writes? Won't reads be much

It's easier to test as the SPI NOR core will issue a new page program
operation, thus a new exec_op() call, for each page size.

> faster in the common case, and therefore where stats overhead would
> show the most? For cadence-qspi, only issuing command reads (reads below
> 8 bytes) would be a sort of pathological case.

If you can serialize the reads and do small 8 bytes requests, then yes,
small reads are the critical case. Not sure how common it is and how to
test it.

Again, opting out is not a hard requirement from my side, you're fine
with how the patch is now. But if you want to measure the impact, you
can either compare with small reads, as you suggested, and with full
flash write, where the exec_op() calls will be split by the core on a
page_size basis.

Cheers,
ta