Re: [External] [LSF/MM/BPF TOPIC] CXL Fabric Manager (FM) architecture

> On Feb 9, 2023, at 2:10 PM, Adam Manzanares <a.manzanares@xxxxxxxxxxx> wrote:
> 
> On Wed, Feb 08, 2023 at 10:03:57AM -0800, Viacheslav A.Dubeyko wrote:
>> 
>> 
>>> On Feb 8, 2023, at 8:38 AM, Adam Manzanares <a.manzanares@xxxxxxxxxxx> wrote:
>>> 
>>> On Thu, Feb 02, 2023 at 09:54:02AM +0000, Jonathan Cameron wrote:
>>>> On Wed, 1 Feb 2023 12:04:56 -0800
>>>> "Viacheslav A.Dubeyko" <viacheslav.dubeyko@xxxxxxxxxxxxx> wrote:
>>>> 
>>>>>> 
>> 
>> <skipped>
>> 
>>>>> 
>>>>> Most probably, we will have multiple FM implementations in firmware.
>>>>> Yes, FM on host could be important for debug and to verify correctness of
>>>>> firmware-based implementations. But FM daemon on host could also be important
>>>>> to receive notifications and react somehow to these events. Also, journalling
>>>>> of events/messages could be an important responsibility of the FM daemon
>>>>> on host.
>>>> 
>>>> I agree with an FM daemon somewhere (potentially running on the BMC type chip
>>>> that also has the lower level FM-API access).  I think it is somewhat
>>>> separate from the rest of this on basis it may well just be talking redfish
>>>> to the FM and there are lots of tools for that sort of handling already.
>>>> 
>>> 
>>> I would be interested in participating in a BOF about this topic. I wonder what
>>> happens when we have multiple switches with multiple FMs each on a separate BMC.
>>> In this case, does it make more sense to have the owner of the global FM state
>>> be a user space application? Is this the job of the orchestrator?
>>> 
>>> The BMC based FM seems to have scalability issues, but will we hit them in
>>> practice any time soon?
>> 
>> I had a discussion recently and it looks like there are some interesting points:
>> (1) If we have multiple CXL switches (especially with a complex hierarchy), then
>> managing them is a very compute-intensive activity. So, potentially, an FM on the
>> firmware side may not be able to handle all of its responsibilities without
>> performance degradation.
>> (2) However, if we have the FM on the host side, then there are security concerns
>> because the FM sees everything and all details of multiple hosts and subsystems.
>> (3) Technically speaking, there is one potential capability: a user-space FM daemon
>> can run on the host side as well as on the CXL switch side. I mean here that if we
>> implement a user-space FM daemon, then it could be used to execute FM functionality
>> on the CXL switch side (maybe????). :)
>> 
>> <skipped>
>> 
>>>>>>>  - Manage surprise removal of devices  
>>>>>> 
>>>>>> Likewise, beyond reporting I wouldn't expect the FM daemon to have any idea
>>>>>> what to do in the way of managing this.  Scream loudly?
>>>>>> 
>>>>> 
>>>>> Maybe it could require notifying application(s). Let’s imagine that an application
>>>>> uses some resources from the removed device. Maybe FM can manage kernel-space
>>>>> metadata correction and help to manage application requests to entities that
>>>>> no longer exist.
>>>> 
>>>> Notifications for the host are likely to come via inband means - so type3 driver
>>>> handling rather than related to FM.  As far as the host is concerned this is the
>>>> same as case where there is no FM and someone ripped a device out.
>>>> 
>>>> There might indeed be meta data to manage, but doubt it will have anything to
>>>> do with kernel.
>>>> 
>>> 
>>> I've also had similar thoughts, I think the OS responds to notifications that
>>> are generated in-band after changes to the state of the FM are made through 
>>> OOB means.
>>> 
>>> I envision the host sends REDFISH requests to a switch BMC that has an FM
>>> implementation. Once the changes are implemented by the FM it would show up
>>> as changes to the PCIe hierarchy on a host, which is capable of responding to
>>> such changes.
>>> 
>> 
>> I think I do not completely follow your point. :) First of all, I assume that if the
>> host sends a REDFISH request, then it will expect confirmation of the request's
>> execution. For me, it means that the host needs to receive some packet that informs it
>> whether the request executed successfully or failed. It means that some subsystem or
>> application requested this change, and only after receiving the confirmation can the
>> requested capabilities be used. And if the FM is on the CXL switch side, then how will
>> the FM show the changes? It sounds to me that some FM subsystem should be on the host
>> side to receive the confirmation/notification and to execute the real changes in the
>> PCIe hierarchy. Am I missing something here?
> 
> Hopefully I have a point ;). I do expect a host to receive a response for a
> given REDFISH request, but the request/response would be OOB. I would go back
> to the example of hot plugging a PCIe based device. For example, if an nvme
> SSD is hot plugged, then the OS is notified by HW that a new PCIe device has been
> added. Going back to changes made by the FM, if the changes impact the CXL
> hierarchy that is visible to a host, it is my expectation that the host OS will
> be informed of the changes requested of the FM when the host HW becomes aware
> of the changes (the in-band change).
> 

You are right if we talk about hardware directly connected to the host. It means that
the CPU (or any other hardware subsystem) can receive an interrupt and the kernel can
process this hardware change. But the FM can be remote and be shared by multiple hosts.
In such a case, we need to have some software subsystem on the host(s) side that can
either poll the FM or wait for a network packet with the notification or confirmation
of the change. Or we need to have some hardware subsystem on every host that
can interact with the remote FM in the background and issue an interrupt locally
with the goal of refreshing kernel metadata.
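
To make the polling variant a bit more concrete, here is a rough sketch of what
such a host-side software subsystem could look like. Everything in it is an
assumption for illustration only: the "generation counter" that the FM would
increase on every change, the fetch_generation callable (a stand-in for
whatever OOB query the host would really send to the remote FM), and the
on_change callback (a stand-in for whatever refreshes the host's view) are all
hypothetical and are not part of any existing FM API or REDFISH schema.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FabricChangePoller:
    """Host-side agent that polls a remote FM and reacts to reported changes."""

    # Hypothetical: returns a counter that the remote FM bumps on every change.
    fetch_generation: Callable[[], int]
    # Hypothetical: called with (old, new) so the host can refresh its state.
    on_change: Callable[[int, int], None]
    # Last generation value observed; None until the first successful poll.
    last_seen: Optional[int] = None

    def poll_once(self) -> bool:
        """Query the FM once; return True if a change was detected."""
        current = self.fetch_generation()
        if self.last_seen is None:
            # The first poll only records a baseline, there is nothing to compare.
            self.last_seen = current
            return False
        if current == self.last_seen:
            return False
        previous, self.last_seen = self.last_seen, current
        self.on_change(previous, current)
        return True
```

A daemon would call poll_once() periodically (for example from a timer loop)
and let on_change trigger the refresh of the host-side metadata. The
notification-based variant would replace the periodic call with a blocking
wait for a packet from the FM, but the bookkeeping would stay the same.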

Thanks,
Slava.





