Re: [xfs_check Out of memory: ]

On 12/28/2013 5:39 PM, Arkadiusz Miśkiewicz wrote:
> On Saturday 28 of December 2013, Stan Hoeppner wrote:
>> On 12/27/2013 5:20 PM, Arkadiusz Miśkiewicz wrote:
>> ...
>>
>>> - can't add more RAM easily; the machine is at a remote location, uses
>>> obsolete DDR2, has no more RAM slots, and so on
>>
>> ...
>>
>>> So it looks like my future backup servers will need 64GB, 128GB or maybe
>>> even more RAM that will be there only for xfs_repair usage. That's a
>>> gigantic waste of resources. And there are modern processors that don't
>>> work with more than 32GB of RAM - like the "Intel Xeon E3-1220v2" (
>>> http://tnij.org/tkqas9e ). So adding RAM means replacing the CPU, and
>>> likely the mainboard. Fun :)
>>
>> ..
>>
>>> IMO RAM usage is a real problem for xfs_repair and there has to be some
>>> upstream solution other than the "buy more" (and waste more) approach.
>>
>> The problem isn't xfs_repair.  
> 
> This problem is fully solvable on the xfs_repair side (if disk space
> outside of the broken xfs fs is available).
> 
>> The problem is that you expect this tool
>> to handle an infinite number of inodes while using a finite amount of
>> memory, or at least somewhat less memory than you have installed.  We
>> don't see your problem reported very often, which seems to indicate your
>> situation is a corner case, or that others simply
> 
> It's not that common. It happens from time to time, judging by the
> questions on #xfs.
> 
>> size their systems
>> properly without complaint.
> 
> I guess having millions of tiny files (a few kB each) is simply not that
> common, rather than a matter of "properly sizing systems".
> 
>> If you'd actually like advice on how to solve this, today, with
>> realistic solutions, in lieu of the devs recoding xfs_repair for the
>> single goal of using less memory, then here are your options:
>>
>> 1.  Rewrite or redo your workload to not create so many small files,
>>     so many inodes, i.e. use a database
> 
> It's a backup copy that needs to be directly accessible (so you could run
> production directly from the backup server, for example).  That solution
> won't work.

So it's an rsnapshot server and you have many millions of hardlinks.
The obvious solution here is to simply use a greater number of smaller
XFS filesystems with fewer hardlinks in each.  This is by far the best
way to avoid the xfs_repair memory consumption issue caused by a massive
inode count.
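
If the backup volume sits on LVM, carving it up is cheap.  A rough
sketch, where the volume group name, sizes and mount points are only
placeholders for whatever you actually use:

$ lvcreate -L 2T -n backup1 vg_backup
$ lvcreate -L 2T -n backup2 vg_backup
$ mkfs.xfs /dev/vg_backup/backup1
$ mkfs.xfs /dev/vg_backup/backup2
$ mount /dev/vg_backup/backup1 /backup/set1
$ mount /dev/vg_backup/backup2 /backup/set2

Point each rsnapshot config at its own filesystem, and xfs_repair only
ever has to walk one set's inodes at a time.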

You might even be able to accomplish this using sparse files.  That
would preclude the need to repartition your storage for more
filesystems, and would allow better utilization of the space.  Dave is
the sparse file expert, so I'll defer to him on whether this is
possible, or applicable to your workload.
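
A rough sketch of the sparse file idea, assuming loop mounts are
acceptable here (the paths and sizes are only placeholders, and I
haven't tested this against an rsnapshot workload):

$ truncate -s 2T /backup/images/set1.img   # sparse, takes no space up front
$ mkfs.xfs /backup/images/set1.img
$ mount -o loop /backup/images/set1.img /backup/set1

Each image only consumes blocks as data is written into it, so you can
overcommit the underlying storage, and repairing one image never has to
look at the inodes in the others.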

>> 2.  Add more RAM to the system
> 
>> 3.  Add an SSD of sufficient size/speed for swap duty to handle
>>     xfs_repair requirements for filesystems with arbitrarily high
>>     inode counts
> 
> That would work... if the server was locally available.
> 
> Right now my working "solution" is:
> - add 40GB of swap space
> - stop all other services
> - run xfs_repair, leave it for 1-2 days
> 
> Adding an SSD is my only long-term option, it seems.

It's not a perfect solution by any means, and the SSD you choose matters
greatly, which is why I recommended the Samsung 840 Pro.  More RAM would
be the best fit for your current setup, but it isn't available for your
system.  Using more filesystems with fewer inodes in each is by far the
best option WRT xfs_repair and limited memory.
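
For the record, once an SSD is in the box, pressing it into swap duty
for a repair run is trivial (the device name is a placeholder):

$ mkswap /dev/sdX
$ swapon -p 32767 /dev/sdX   # highest priority, used before any disk swap
$ swapon -s                  # verify it is active

You can swapoff it again after xfs_repair finishes if you want the SSD
for something else the rest of the time.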

>> The fact that the systems are remote and that you have no more DIMM
>> slots is not a good argument to make in this context.  Every system
>> will require some type of hardware addition/replacement/maintenance.
>> And this is not the first software "problem" that requires more hardware
>> to solve.  If your application that creates these millions of files
>> needed twice as much RAM, forcing an upgrade, would you be complaining
>> this way on their mailing list?
> 
> If that application could do its job without requiring 2x the RAM, then
> surely I would write about it on their mailing list.
> 
>> If so, I'd suggest the problem lies
>> somewhere other than xfs_repair and that application.
> 
> IMO this problem could be solved on the xfs_repair side, but well...
> someone would have to write the patches, and that's unlikely to happen.
> 
> So now a more important question: how do you actually estimate these
> things? Example: a 10TB xfs filesystem on a web server, filled with files
> of ~10kB each (html pages, images, etc). How much RAM would my server need
> for the repair to succeed?

One method is to simply ask xfs_repair how much memory it needs to
repair the filesystem.  Usage:

$ umount /mount/point
$ xfs_repair -n -m 1 -vv /dev/device   # the block device, not the mount point
$ mount /mount/point

e.g.

$ umount /dev/sda7
$ xfs_repair -n -m 1 -vv /dev/sda7
Phase 1 - find and verify superblock...
        - max_mem = 1024, icount = 85440, imem = 333, dblock = 24414775, dmem = 11921
Required memory for repair is greater that the maximum specified with
the -m option. Please increase it to at least 60.
$ mount /dev/sda7

This is a 100GB inode32 test filesystem with 83K inodes.  xfs_repair
tells us it requires at least 60MB of memory to repair this filesystem.
That is a minimum; the actual repair may require more, but the figure
given should be pretty close.
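
To get the number for your 10TB web-server case, run the same dry run
against that filesystem's device while it is unmounted and keep the
output (the device and mount point are placeholders for whatever you
actually use):

$ umount /backup
$ xfs_repair -n -m 1 -vv /dev/sdb1 2>&1 | tee /root/repair-estimate.txt
$ mount /backup

The -m 1 ceiling is deliberately too small, so xfs_repair should just
print the estimate and stop rather than attempt a repair, and with -n
it won't modify the filesystem anyway.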

-- 
Stan





