Hi Dragan Stancevic. Thank you for your interest and for joining the discussion.

>On 2/20/23 19:41, Kyungsan Kim wrote:
>> CXL is a promising technology that leads to fundamental changes in computing architecture.
>> To facilitate the adoption and spread of CXL memory, we are developing a memory tiering solution called SMDK[1][2].
>> Using SMDK and CXL RAM devices, our team has been working with industry and academic partners over the last year.
>> Also, thanks to many researchers' efforts, the CXL adoption stage is gradually moving forward from basic enablement to real-world composite use cases.
>> At this moment, based on the research and experience gained working on SMDK, we would like to suggest a session at LSF/MM/BPF this year
>> to propose possible Linux MM changes along with a brief overview of SMDK.
>>
>> Adam Manzanares kindly advised me that it is preferred to discuss implementation details of a given problem, and to build consensus, at LSF/MM/BPF.
>> Considering the adoption stage of CXL technology, however, let me suggest a design-level discussion on the two MM expansions of SMDK this year.
>> When we reach design consensus with participants, we hope to continue with follow-up discussions covering additional implementation details.
>>
>>
>> 1. A new zone, ZONE_EXMEM
>> We added ZONE_EXMEM to manage CXL RAM device(s), separated from ZONE_NORMAL for usual DRAM, for the three reasons below.
>
>Hi Kyungsan-
>
>I read through your links and I am very interested in this
>talk/discussion from the perspective of cloud/virtualization hypervisor
>loads.
>
>The problem that I am starting to tackle is clustering of hypervisors
>over cxl.mem for high availability of virtual machines, or live
>migration of virtual machines between hypervisors using cxl.mem [1].
>
>
>So I was wondering, with regard to ZONE_EXMEM, has any thought been
>given to shared memory across virtual hierarchies [2], where you
>have cxl.mem access over CXL switches by multiple VH connections? It
>seems to me that there might be a need for differentiation of direct
>cxl.mem and switched cxl.mem, at least from the point of view where you
>have multiple hypervisors sharing the memory over a switch, where they
>would potentially have to synchronize state/metadata about the memory.

First of all, in general we have thought that more SW layers (bare metal,
virtualization, orchestration) will become involved as the CXL topology
progresses (direct-attached, switch/multi-level switch,
rack-scale/inter-rack-scale with fabric).
We think ZONE_EXMEM can serve as a static CXL identifier in the
interaction between hypervisor and host OS for memory inflation/deflation,
the transcendent memory interface (frontswap/cleancache)[1], and isolation.
Rough sketches of both points follow at the end of this mail.

[1] https://lwn.net/Articles/454795

>[1] A high-level explanation is at http://nil-migration.org/
>[2] Compute Express Link Specification r3.0, v1.0 8/1/22, Page 51,
>figure 1-4, black color scheme circle(3) and bars.
>
>
>--
>Peace can only come as a natural consequence
>of universal enlightenment -Dr. Nikola Tesla
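
To make the ZONE_EXMEM point a bit more concrete, here is a minimal
sketch of where such a zone could sit in the zone enumeration and how a
caller could opt in, by analogy with the existing zone selectors. This is
an illustration for discussion only: the placement, CONFIG_ZONE_EXMEM,
and the __GFP_EXMEM flag are assumptions, not the actual SMDK
implementation.

/* Illustrative sketch (cf. include/linux/mmzone.h); ZONE_EXMEM and its
 * placement here are assumptions, not the SMDK patch itself. */
enum zone_type {
#ifdef CONFIG_ZONE_DMA
	ZONE_DMA,
#endif
#ifdef CONFIG_ZONE_DMA32
	ZONE_DMA32,
#endif
	ZONE_NORMAL,		/* usual DRAM */
#ifdef CONFIG_HIGHMEM
	ZONE_HIGHMEM,
#endif
	ZONE_MOVABLE,
#ifdef CONFIG_ZONE_EXMEM
	ZONE_EXMEM,		/* CXL RAM, managed apart from ZONE_NORMAL */
#endif
#ifdef CONFIG_ZONE_DEVICE
	ZONE_DEVICE,
#endif
	__MAX_NR_ZONES
};

/* A caller could then request CXL-backed pages explicitly, by analogy
 * with __GFP_DMA/__GFP_HIGHMEM (the __GFP_EXMEM flag is hypothetical):
 *
 *	struct page *page = alloc_pages(GFP_KERNEL | __GFP_EXMEM, 0);
 */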
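
On the transcendent memory side, this is roughly the shape a frontswap
backend steering swapped-out pages toward CXL memory could take. The
callback signatures follow include/linux/frontswap.h around v6.2; the
exmem_fs_* stubs are hypothetical placeholders marking where a
CXL-backed page store would plug in.

#include <linux/frontswap.h>
#include <linux/module.h>

/* Every callback below is a stub; a real backend would copy pages
 * to/from a CXL-backed store. Returning -1 from store/load means the
 * page is rejected and falls back to the regular swap path. */

static void exmem_fs_init(unsigned type)
{
	/* a swap area of this type was just swapon'ed */
}

static int exmem_fs_store(unsigned type, pgoff_t offset, struct page *page)
{
	/* copy @page into the CXL-backed store; 0 on success */
	return -1;
}

static int exmem_fs_load(unsigned type, pgoff_t offset, struct page *page)
{
	/* fill @page from the CXL-backed store; 0 on success */
	return -1;
}

static void exmem_fs_invalidate_page(unsigned type, pgoff_t offset)
{
	/* the copy at @offset is no longer needed */
}

static void exmem_fs_invalidate_area(unsigned type)
{
	/* the whole swap area was swapoff'ed */
}

static const struct frontswap_ops exmem_frontswap_ops = {
	.init		 = exmem_fs_init,
	.store		 = exmem_fs_store,
	.load		 = exmem_fs_load,
	.invalidate_page = exmem_fs_invalidate_page,
	.invalidate_area = exmem_fs_invalidate_area,
};

static int __init exmem_frontswap_init(void)
{
	return frontswap_register_ops(&exmem_frontswap_ops);
}
module_init(exmem_frontswap_init);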