Re: [LSF/MM/BPF TOPIC] Design challenges for a new file system that needs to support multiple billions of files

On 2/4/25 2:47 AM, Dave Chinner wrote:
> On Mon, Feb 03, 2025 at 05:18:48PM +0100, Ric Wheeler wrote:
>> On 2/3/25 4:22 PM, Amir Goldstein wrote:
>>> On Sun, Feb 2, 2025 at 10:40 PM Ric Wheeler <ricwheeler@xxxxxxxxx> wrote:
>>>> I have always been super interested in how much we can push the
>>>> scalability limits of file systems and for the workloads we need to
>>>> support, we need to scale up to supporting absolutely ridiculously large
>>>> numbers of files (a few billion files doesn't meet the need of the
>>>> largest customers we support).
>>>
>>> Hi Ric,
>>>
>>> Since LSFMM is not about presentations, it would be better if the topic to
>>> discuss was trying to address specific technical questions that developers
>>> could discuss.
>>
>> Totally agree - from the ancient history of LSF (before MM or BPF!) we also
>> pushed for discussions over talks.
>>
>>> If a topic cannot generate a discussion on the list, it is not very
>>> likely that it will generate a discussion on-prem.
>>>
>>> Where does the scaling with the number of files in a filesystem affect
>>> existing filesystems? What are the limitations that you need to overcome?
>>
>> Local file systems like xfs running on "scale up" giant systems (think of
>> the old super sized HP Superdomes and the like) would be likely to handle
>> this well.
>
> We don't need "Big Iron" hardware to scale up to tens of billions of
> files in a single filesystem these days. A cheap server with 32p and
> a couple of hundred GB of RAM and a few NVMe SSDs is all that is
> really needed. We recently had an XFS user report over 16 billion
> files in a relatively small filesystem (a few tens of TB), most of
> which were reflink copied files (backup/archival storage farm).
>
> So, yeah, large file counts (i.e. tens of billions) in production
> systems aren't a big deal these days. There shouldn't be any
> specific issues at the OS/VFS layers supporting filesystems with
> inode counts in the billions - most of the problems with this are
> internal filesystem implementation issues. If there are any specific
> VFS level scalability issues you've come across, I'm all ears...
>
> -Dave.
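
To put those numbers in perspective, a rough back-of-envelope - assuming
XFS's default 512-byte inodes, since the actual geometry of that user's
filesystem wasn't stated:

    16e9 inodes * 512 bytes/inode ~= 8 TB of inode space alone,
    before directory blocks, refcount/reflink btrees and free space metadata.

So once most of the files are reflink copies, a "few tens of TB" filesystem
is dominated by metadata rather than data - the file count, not the data,
is what you end up sizing for.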

I remember fondly torturing xfs (and ext4 and btrfs) many years back with a billion small (empty) files on a SATA drive :)

For our workload though, we have a couple of requirements that prevent most customers from using a single server.

The first requirement is the need to keep a scary number of large tape drives/robots running at line rate - keeping all of those busy normally requires on the order of 5 servers with our existing stack, but larger systems can need more.
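
To give a sense of scale (illustrative numbers only, not a specific
deployment): a modern LTO-9 drive streams at roughly 400 MB/s native, so

    40 drives * 400 MB/s ~= 16 GB/s sustained, i.e.
    16 GB/s / 5 servers  ~= 3-4 GB/s that each server has to feed

and if a server falls below a drive's minimum streaming rate, the drive has
to stop and reposition ("shoe-shining"), which is exactly what we have to
avoid.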

The second requirement is the need for high availability - that led us to using a shared-disk backed file system (scoutfs), but others in this space have used CXFS and similar non-open-source file systems. The shared-disk/cluster file systems are where coarse-grained locking comes into conflict with concurrency.
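
The contention pattern is easy to sketch. The toy below is not scoutfs or
CXFS code - just a hand-wavy illustration of why one coarse lock covering a
whole directory serializes creates that are logically independent, while
finer-grained (per-entry) locks let them proceed in parallel:

    # Toy model only: "dir_lock" stands in for a cluster-wide lock on a
    # whole directory; "entry_locks" stands in for finer-grained locks.
    import threading, time

    dir_lock = threading.Lock()
    entry_locks = [threading.Lock() for _ in range(16)]

    def create_coarse(name):
        with dir_lock:                      # every create contends on one lock
            time.sleep(0.001)               # stand-in for the metadata update

    def create_fine(name):
        with entry_locks[hash(name) % 16]:  # only same-bucket names contend
            time.sleep(0.001)

    def run(create):
        threads = [threading.Thread(target=create, args=("file%d" % i,))
                   for i in range(64)]
        start = time.time()
        for t in threads: t.start()
        for t in threads: t.join()
        return time.time() - start

    print("coarse: %.3fs  fine: %.3fs" % (run(create_coarse), run(create_fine)))

In a real cluster the lock's ownership also has to bounce between nodes with
cache invalidation on every transfer, so the serialization cost is far worse
than a local mutex - but the shape of the problem is the same.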

What ngnfs is driving towards is being able to sustain that bandwidth requirement for the backend archival workflow and to support many billions of file objects in a high-availability system built from today's cutting-edge components.  Zach will jump in once he gets back, but my hand-wavy way of thinking about this is that ngnfs as a distributed file system is closer in design to how xfs would run on a huge system with coherence between NUMA zones.

regards,

Ric
