Re: Re: Re: RE(2): FW: [LSF/MM/BPF TOPIC] SMDK inspired MM changes for CXL

Matthew Wilcox wrote:
> On Tue, Apr 04, 2023 at 09:48:41PM -0700, Dan Williams wrote:
> > Kyungsan Kim wrote:
> > > We know the situation. When a CXL DRAM channel is located under ZONE_NORMAL,
> > > a random kernel object allocation through kmalloc() and its siblings makes the entire CXL DRAM unremovable.
> > > Also, not all kernel objects can be allocated from ZONE_MOVABLE.
> > > 
> > > ZONE_EXMEM does not fix a movability attribute (movable or unmovable); rather, it lets the calling context decide.
> > > In that respect it is the same as ZONE_NORMAL, except that ZONE_EXMEM applies to extended memory devices.
> > > This does not mean ZONE_EXMEM supports both movability and kernel object allocation at the same time.
> > > When multiple CXL DRAM channels are connected, we think a memory consumer could dedicate a channel to either movable or unmovable use.
> > > 
> > 
> > I want to clarify that I expect the number of people doing physical CXL
> > hotplug of whole devices to be small compared to dynamic capacity
> > devices (DCD). DCD is a new feature of the CXL 3.0 specification where a
> > device maps 1 or more thinly provisioned memory regions whose
> > individual extents get populated and depopulated by a fabric manager.
> > 
> > In that scenario the fabric manager hands out 100GB to a host and later
> > asks for it back; it is within the protocol for the host to say
> > "I can give 97GB back now, come back and ask again if you
> > need that last 3GB".
> 
> Presumably it can't give back arbitrary chunks of that 100GB?  There's
> some granularity that's preferred; maybe on 1GB boundaries or something?

The device picks the granularity, which can be tiny per the spec, but
tracking small extents makes the hardware more expensive, so I expect
something reasonable like 1GB. Time will tell once actual devices
start showing up.
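
As a rough illustration of why small extents cost more to track, here is
a back-of-the-envelope sketch. The granule sizes are made up for the
example; they are not taken from the spec or from any device, and
extents_to_track() is a hypothetical helper, not a kernel or CXL API.

```python
GIB = 1 << 30
MIB = 1 << 20

def extents_to_track(capacity, granule):
    """Number of granule-sized extents needed to cover `capacity` bytes."""
    return -(-capacity // granule)  # ceiling division

capacity = 100 * GIB
# Hypothetical granule sizes, chosen only to show the scaling.
for granule in (GIB, 256 * MIB, 2 * MIB):
    print(granule // MIB, "MiB granule:", extents_to_track(capacity, granule), "extents")
```

The count grows inversely with the granule size, so whatever per-extent
state the device keeps grows with it.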

> > In other words, even pinned pages in ZONE_MOVABLE are not fatal to the
> > flow. Alternatively, if a deployment needs a 100% guarantee that the host
> > will return all the memory it was assigned when asked, there is always
> > the option to keep that memory out of the page allocator and just access
> > it via a device. That's the role device-dax plays for "dedicated" memory
> > that needs to be set aside from kernel allocations.
> > 
> > This is to say something like ZONE_PREFER_MOVABLE semantics can be
> > handled within the DCD protocol, where 100% unpluggability is not
> > necessary and 97% is good enough.
> 
> This certainly makes life better (and rather more like hypervisor
> shrinking than like DIMM hotplug), but I think fragmentation may well
> mean that "only 3GB of 100GB allocated" results in being able to
> return less than 50% of the memory, depending on granule size and
> exactly how the allocations got chunked.

Agree.
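
To put numbers on that, here is a minimal sketch. It assumes a 1GiB
granule and 4KiB pages, and it assumes that a single pinned page blocks
the return of its whole granule. returnable_bytes() and both layouts
are hypothetical, for illustration only.

```python
GIB = 1 << 30
PAGE = 4096

def returnable_bytes(total, granule, pinned_offsets):
    """Bytes that can be handed back if any pinned page blocks its whole granule."""
    blocked = {off // granule for off in pinned_offsets}
    return (total // granule - len(blocked)) * granule

# Best case: 3GiB of pinned pages packed into the first three granules.
packed = range(0, 3 * GIB, PAGE)
# Bad case: a single pinned page in each of 51 different granules.
scattered = [i * GIB for i in range(51)]

print(returnable_bytes(100 * GIB, GIB, packed) // GIB)     # 97
print(returnable_bytes(100 * GIB, GIB, scattered) // GIB)  # 49
```

In the scattered layout only 51 pages (204KiB) are pinned, yet less than
half of the 100GiB can be returned. A smaller granule shrinks the damage
done by each pinned page.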


