On Thu, 2010-09-23 at 15:29 -0700, Jeremy Cole wrote:
> 1. Is it plausible that Linux for whatever reason needs memory to be
> in Node 0, and chooses to page out used memory to make room, rather
> than choosing to drop some of the cache in Node 1 and use that memory?
> I think this is true, but maybe I've missed something important.

Your situation sounds pretty familiar.  It happens a lot when
applications are moved over to a NUMA system for the first time.  Your
interleaving solution is a decent one, although teaching the database
about NUMA is a much better long-term approach.

As for the decisions about running reclaim or swapping versus going to
another node for an allocation, take a look at the "zone_reclaim_mode"
bits in Documentation/sysctl/vm.txt.  It does a decent job of
explaining what we do.

Most users new to NUMA systems just prefer to "echo 0 >
zone_reclaim_mode", and I've run into a fair number of "tuning" guides
that say to do this as well.  It makes the allocator behave much as it
would if NUMA weren't there: it is less _optimized_ for NUMA locality,
but it does tend to let you allocate memory more freely.

-- Dave
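
A minimal sketch of the knobs discussed above, assuming the standard
/proc/sys and sysctl interfaces; the numactl line only illustrates the
interleaving workaround, and the mysqld path is hypothetical:

    # Check the current zone reclaim policy.
    cat /proc/sys/vm/zone_reclaim_mode

    # Disable zone reclaim so the allocator falls back to other nodes
    # instead of reclaiming or swapping on the local node.
    echo 0 > /proc/sys/vm/zone_reclaim_mode

    # Equivalent via sysctl; add "vm.zone_reclaim_mode = 0" to
    # /etc/sysctl.conf to make it persistent across reboots.
    sysctl -w vm.zone_reclaim_mode=0

    # Illustrative interleaving workaround: start the database with its
    # allocations spread round-robin across all NUMA nodes.
    numactl --interleave=all /usr/sbin/mysqld ...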