On Fri, Feb 23, 2024 at 11:41:44AM -0600, John Groves wrote:
> This patch set introduces famfs[1] - a special-purpose fs-dax file system
> for sharable disaggregated or fabric-attached memory (FAM). Famfs is not
> CXL-specific in anyway way.
>
> * Famfs creates a simple access method for storing and sharing data in
>   sharable memory. The memory is exposed and accessed as memory-mappable
>   dax files.
> * Famfs supports multiple hosts mounting the same file system from the
>   same memory (something existing fs-dax file systems don't do).
> * A famfs file system can be created on either a /dev/pmem device in fs-dax
>   mode, or a /dev/dax device in devdax mode (the latter depending on
>   patches 2-6 of this series).
>
> The famfs kernel file system is part the famfs framework; additional
> components in user space[2] handle metadata and direct the famfs kernel
> module to instantiate files that map to specific memory. The famfs user
> space has documentation and a reasonably thorough test suite.
>
> The famfs kernel module never accesses the shared memory directly (either
> data or metadata). Because of this, shared memory managed by the famfs
> framework does not create a RAS "blast radius" problem that should be able
> to crash or de-stabilize the kernel. Poison or timeouts in famfs memory
> can be expected to kill apps via SIGBUS and cause mounts to be disabled
> due to memory failure notifications.
>
> Famfs does not attempt to solve concurrency or coherency problems for apps,
> although it does solve these problems in regard to its own data structures.
> Apps may encounter hard concurrency problems, but there are use cases that
> are imminently useful and uncomplicated from a concurrency perspective:
> serial sharing is one (only one host at a time has access), and read-only
> concurrent sharing is another (all hosts can read-cache without worry).

Can you do me a favor? I'm curious if you can run a test like this:

fio -name=ten-1g-per-thread --nrfiles=10 -bs=2M -ioengine=io_uring \
    -direct=1 --group_reporting=1 --alloc-size=1048576 --filesize=1GiB \
    --readwrite=write --fallocate=none --numjobs=$(nproc) --create_on_open=1 \
    --directory=/mnt

What do you get for throughput?

The larger the system and its capacity, the better.

  Luis