896MB address limit

cpw@xxxxxxx (Cliff Wickman) · Thu, 27 Sep 2012 18:07:12 -0500

On Tue, Sep 25, 2012 at 08:10:04AM -0700, Eric W. Biederman wrote:
> Cliff Wickman <cpw at sgi.com> writes:
> 
> > Hi Eric, and all,
> >
> > On Mon, Sep 24, 2012 at 08:11:12PM -0700, Eric W. Biederman wrote:
> >> Cliff Wickman <cpw at sgi.com> writes:
> >> 
> >> > Gentlemen,
> >> >
> >> > In dumping very large memories we are running up against the 896MB 
> >> > limit in SLES11SP2 (3.0.38 kernel).
> >> 
> >> Odd.  That limit should be the maximum address in memory to load the
> >> crash kernel.  That limit should have nothing to do with the dump process
> >> itself.
> >> 
> >> Are you saying you need more than 512MiB reserved for the crash kernel
> >> to be able to dump all of the memory in your system?
> >> 
> >> Eric
> >
> > As I noted to Eric privately, yes we need to bump up to crashkernel=1G
> > or more for some very large memories.
> >
> > As an experiment I bumped
> > +++ linux/arch/x86/kernel/setup.c
> > @@ -528,7 +528,7 @@ static inline unsigned long long get_tot
> >  #ifdef CONFIG_X86_32
> >  # define CRASH_KERNEL_ADDR_MAX (512 << 20)
> >  #else
> > -# define CRASH_KERNEL_ADDR_MAX (896 << 20)
> > +# define CRASH_KERNEL_ADDR_MAX (1700 << 20)
> >
> > And that seems to work.  i.e. I'm currently dumping a system where
> > crashkernel=1G and it seems to be working.
> >
> > Am I just living dangerously? 
> 
> So fundamentally this should work.  However there have been a lot of
> kinks and silly limitations in the x86 boot protocol.
> 
> So it used to be that the bootloader protocol variable ramdisk_max was
> set to 896M for 32bit kernels.  Because the ramdisk could not be located
> in high memory.
> 
> Looking today it appears that ramdisk_max has been upped to 4G.
> 
> I will let you look through the /sbin/kexec source code.
> 
> As for testing I would up the limit to 4G on x86_64 and see how far
> you get.
> 
> The practical question is whether the system still works with
> crashkernel=32M when you have raised the limit much higher.
> 
> So I would test with crashkernel=1G at 2G and see if that works.  If that
> works I figure that in practice all of the bugs are historical and we
> can forget them.  But a sweep through the /sbin/kexec code for the magic
> number 896 might not be out of order.
> 
> Eric

I did try setting the limit to 8G. The crashkernel did get loaded there
but it would not execute there.

It works fine on a UV to set the limit to 4G and use a
crashkernel=1280M.  We have a hole of almost 2G there.

The memory at 2G is already in use so I can't explicitly place it there.

The kernel patch looks like this:
Index: linux/arch/x86/kernel/setup.c
===================================================================

--- linux.orig/arch/x86/kernel/setup.c
+++ linux/arch/x86/kernel/setup.c
@@ -522,13 +522,12 @@ static inline unsigned long long get_tot
 /*
  * Keep the crash kernel below this limit.  On 32 bits earlier kernels
  * would limit the kernel to the low 512 MiB due to mapping
  * restrictions.
- * On 64 bits, kexec-tools currently limits us to 896 MiB; increase this
- * limit once kexec-tools are fixed.
+ * On 64 bits, the boot protocol limits us to 4G.
  */
 #ifdef CONFIG_X86_32
 # define CRASH_KERNEL_ADDR_MAX (512 << 20)
 #else
-# define CRASH_KERNEL_ADDR_MAX (896 << 20)
+# define CRASH_KERNEL_ADDR_MAX (1UL << 32)
 #endif

 static void __init reserve_crashkernel(void)

-Cliff
-- 
Cliff Wickman
SGI
cpw at sgi.com
(651) 683-3824