On Thu, May 25, 2006 at 02:44:50PM -0700, Ric Wheeler wrote:
> The obvious alternative to this is to break up these big disks into
> multiple small file systems, but there again we hit several issues.
>
> As an example, in one of the boxes that I work with we have 4 drives,
> each 500GB, with limited memory and CPU resources. To address the
> issues above, we break each drive into 100GB chunks, which gives us 20
> (reiserfs) file systems per box. The set of new problems that arises
> from this includes:
>
> (1) no forced unmount - if one file system goes down, you have to
> reboot the box to recover.
> (2) worst-case memory consumption for the journal scales linearly
> with the number of file systems (32MB per file system).
> (3) we take away the ability of the file system to do intelligent
> head movement on the drives (i.e., I end up begging the application team
> to please only use one file system per drive at a time for ingest ;-)).
> The same goes for allocation - we basically have to push this up to the
> application to use the capacity in an even way.
> (4) the pain of administering multiple file systems.
>
> I know that other file systems deal with scale better, but the question
> is really how to move the mass of Linux users onto these large and
> increasingly common storage devices in a way that handles these challenges.

How do you handle the inode number space? Do you partition it across the
sub-filesystems, or do you prohibit hardlinks between the sub-fses?
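
For concreteness, here is a minimal sketch of the first option, assuming
the combined namespace exposes 64-bit inode numbers and at most 256
sub-filesystems (both numbers are assumptions, and the helpers below are
hypothetical, not taken from any existing code): reserve the top bits of
the inode number for a sub-fs index and keep the per-fs inode number in
the remaining bits.

	#include <stdint.h>

	#define SUBFS_BITS	8			/* assumed limit: 256 sub-filesystems */
	#define INO_BITS	(64 - SUBFS_BITS)
	#define INO_MASK	((UINT64_C(1) << INO_BITS) - 1)

	/* Build a namespace-wide inode number from a sub-fs index and
	 * that sub-fs's local inode number. */
	static inline uint64_t global_ino(uint32_t subfs, uint64_t local_ino)
	{
		return ((uint64_t)subfs << INO_BITS) | (local_ino & INO_MASK);
	}

	/* Recover the sub-fs index from a namespace-wide inode number. */
	static inline uint32_t ino_to_subfs(uint64_t ino)
	{
		return (uint32_t)(ino >> INO_BITS);
	}

	/* Recover the local (per-sub-fs) inode number. */
	static inline uint64_t ino_to_local(uint64_t ino)
	{
		return ino & INO_MASK;
	}

With the other option - prohibiting cross-sub-fs hardlinks - link()
across the boundary would presumably just fail with EXDEV, the same way
it already does across ordinary mount points.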