Re: [PATCH V4 4/5] soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 12/5/24 18:09, Cristian Marussi wrote:
On Thu, Dec 05, 2024 at 04:33:05PM +0530, Sibi Sankar wrote:


On 11/29/24 15:27, Shivnandan Kumar wrote:



Hi Sibi,

some rants down below :P

Hey Cristian,

Thanks for taking time to put out your thoughts!

On 10/7/2024 11:40 AM, Sibi Sankar wrote:
Introduce a client driver that uses the memlat algorithm string
hosted on QCOM SCMI Generic Extension Protocol to detect memory
latency workloads and control frequency/level of the various
memory buses (DDR/LLCC/DDR_QOS).


[snip]

+    /* Set the effective cpu frequency calculation method */
+    ret = ops->set_param(ph, &cpucp_freq_method,
sizeof(cpucp_freq_method), MEMLAT_ALGO_STR,
+                 MEMLAT_SET_EFFECTIVE_FREQ_METHOD);
+    if (ret < 0)
+        return dev_err_probe(&sdev->dev, ret,
+                     "failed to set effective frequency calc method\n");
+

Hi Sibi,

Hey Shiv,

Thanks for taking time to review the series!

Since the MEMLAT_SET_EFFECTIVE_FREQ_METHOD command is not supported in
the legacy CPUCP firmware, it should be kept optional. This way, if the
legacy firmware is used, the driver will not return an error when the
CPUCP firmware returns -EOPNOTSUPP.

The vendor protocol with the current major/minor version is
expected to work as is on x1e platforms. What legacy firmware
are you referring to? All future SoCs that plan to adhere to
it are expected to maintain this abi and can decide to make
use of alternate mechanisms to calculating frequency based
on additional dt properties set.


Normally in the SCMI world you could leverage protocol versioning and
the standard PROTOCOL_MESSAGE_ATTRIBUTES(0x2) command to let the agent
investigate if the SCMI server it is speaking to implements or NOT a
specific command and which specific version of that command is understood
(with possibly varying size and fields)...

...BUT since your vendor protocol is 'Generic' and, as it stands, it
basically piggybacks any kind of binary payload (i.e. subcommands of
some kind of subprotocols of yours) into the same 4 get/set/start/stop
'Generic' ops, without even specifying the transmitted/received payload
sizes into the message itself....all of the possible SCMI versioning
autodiscovery and backward-compatibility capabilities happily go out of
the window because:

- your versioning refers to the generic protocol and you cannot possibly
   describe all the possible future subcommands (opaque payloads) variations
   and/or subcommands addition/removals purely on the major/minor version, AND
   even if you did that, NONE of such future variations will be documented
   anywhere since you are hiding all of this inside a bunch of binary blobs

- you dont even specify the payload sizes of the tx/rx 'Generic' payload
   subcommands so it becomes even difficult for both the server and the
   client to safely handle your 'Generic' subcommand message payloads

- you cannot issue a PROTOCOL_MESSAGE_ATTRIBUTE() to see if a specific
   subcommand is supported, because your subcommand is NOT really a protocol
   command but it is just one of the payloads of one of the 'Generic' protocol:
   you commmands are only set/get/start/stop (maybe some sort of hack
   could be doen around these...bit all will be even more flaky...)

- you dont implement NEGOTIATE_PROTOCOL_VERSION and so you cannot even
   check if the SCMI server that you are speaking to will agree to
   downgrade and 'speak' your Kernel SCMI agent (possibly older) 'Generic'
   protocol version

All of this basically defeats all of the SCMI general capabilities
around versioning and auto-discovery when it comes to your 'Generic' vendor
protocol, with the end result that you will have to be EXTREMELY confident
to perfectly match and keep in sync at all times your SCMI Server(FW) and
Client(Kernel) sides, without any possibility of coping with a such a mismatch
on the field by using some of the fallback/downgrade mechanism that you
threw out of the window...

Even though we listed some of the background behind the generic
you have to understand it was done in a vacuum where QCOM might
have assumed that the entire vendor ID space was to be shared
across all vendors.

But your suggestions for adding another messages like NEGOTIATE
PROTOCOL_VERSION in a future major version upgrade would help
solve some of the problems you are listing out here.

Since there are devices that are out in the wild already running this
firmware, the first version of the generic vendor protocol is limited in
what it can accommodate. But like we already said we are open to changes
that will help review/maintain this for future SoCs without breakage.

(...and sorry but looking at the above xchange about 'legacy firmware' I am
  a bit skeptic about this as of now :D)

Some people within just used the wrong terminology here. There
are no "legacy" firmware since this is the first version of it
landing upstream and future versions have the options to implement
additional messages and ensure they do things in a way that is
easily reviewable/maintainable.

-Sibi


...as I said initially when reviewing this series, you can do whatever
you want within your Vendor protocol, but abusing the SCMI Vendor extensions
capabilities in such a way could end up bringing to the table more cons
(the above) than pros (some supposed 'simplification')

Thanks,
Cristian




[Index of Archives]     [Device Tree Compilter]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux PCI Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Yosemite Backpacking]


  Powered by Linux