Vivek Goyal wrote:
> On Wed, Jul 16, 2008 at 11:25:44AM -0400, Neil Horman wrote:
>> On Wed, Jul 16, 2008 at 11:12:40AM -0400, Vivek Goyal wrote:
>>> On Tue, Jul 15, 2008 at 06:07:40PM -0700, Jay Lan wrote:
>>>> Are there known problems if you boot up the kdump kernel with
>>>> multiple cpus?
>>>>
>>> I had run into one issue and that was that some systems would get reset and
>>> jump to the BIOS.
>>>
>>> The reason was that the kdump kernel can boot on a non-boot cpu. When it
>>> tries to bring up the other cpus it sends INIT, and a non-boot cpu sending
>>> INIT to the "boot" cpu is not acceptable (as per Intel documentation), so
>>> it re-initialized the system.
>>>
>>> I am not sure how many systems are affected by this behavior. Hence
>>> the reason for using maxcpus=1.
>>>
>> +1, there are a number of multi-cpu issues with kdump. I've seen some systems
>> where you simply can't re-initialize a halted cpu from software, which causes
>> problems/hangs.
>>
>>>> It takes an unacceptably long time to run makedumpfile when
>>>> saving a dump on a huge memory system. In my testing it
>>>> took 16hr25min to run create_dump_bitmap() on a 1TB system.
>>>> Pfn's are processed sequentially with a single cpu. We
>>>> certainly can use multiple cpus here ;)
>>> This is certainly a very long time. How much memory have you reserved for
>>> the kdump kernel?
>>>
>>> I had run some tests on an x86_64 128GB RAM system and it took me 4 minutes
>>> to filter and save the core (maximum filtering level of 31). I had
>>> reserved 128MB of memory for the kdump kernel.
>>>
>>> I think something else is seriously wrong here. 1 TB is almost 10 times
>>> 128GB, and even if the time scales linearly it should not take more than
>>> 40 minutes.
>>>
>>> You need to dive deeper to find out what is taking so much time.
>>>
>>> CCing Kenichi.
>>>
>> You know, we might be able to get speedups in makedumpfile without the use of
>> additional cpus. One of the things that concerned me when I read this was the
>> use of dump targets that need to be sequential, i.e. multiple processes writing
>> to a local disk make good sense, but not so much if you're dumping over an scp
>> connection (don't want to re-order those writes). The makedumpfile work cycle,
>> from 30,000 feet, goes something like:
>>
>> 1) Inspect a page
>> 2) Decide to filter the page
>> 3) if (2) goto 1
>> 4) else compress page
>> 5) write page to target
>
> I thought that it first creates the bitmap. So in the first pass it just
> decides which pages are to be dumped or filtered out and marks these
> in the bitmap.
>
> Then in the second pass it dumps all the pages sequentially along
> with metadata, if any.
>
>> I'm sure 4 is going to be the most cpu-intensive task, but I bet we spend a lot
>> of idle time waiting for I/O to complete (since I'm sure we'll fill up the pagecache
>> quickly). What if makedumpfile used AIO to write out prepared pages to the dump
>> target? That way we could at least free up some cpu cycles to work more quickly
>> on steps 2, 3, and 4.
>>
>
> If the above assumption is right, then AIO probably might not help, as once we
> have marked the pages we have nothing to do but wait for completion.
>
> DIO might help a bit, because we need not fill the page cache as we are
> not going to need the vmcore pages again.
>
> In Jay's case, it looks like creating the bitmaps itself took a long time.

Yep. Most of the time was spent on creating the bitmaps itself.
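For reference, here is a rough sketch of what that first pass conceptually does.
This is not makedumpfile's actual code; page_is_dumpable() is a hypothetical
stand-in for the real mem_map filtering logic. The point is that every pfn is
visited exactly once, on a single cpu, so the pass scales with total memory
rather than with the amount of data finally written out.

#include <stdint.h>
#include <string.h>

#define BITS_PER_BYTE 8

static inline void set_bit_on_bitmap(unsigned char *bitmap, uint64_t pfn)
{
	bitmap[pfn / BITS_PER_BYTE] |= 1 << (pfn % BITS_PER_BYTE);
}

/* Hypothetical stand-in for the real filtering logic, which reads the
 * crashed kernel's mem_map to classify each page (free, cache, zero,
 * user, ...) against the dump level. */
static int page_is_dumpable(uint64_t pfn, int dump_level)
{
	(void)pfn;
	(void)dump_level;
	return 1;	/* keep every page in this toy sketch */
}

/* First pass: mark in the bitmap every pfn that should be written out.
 * With N total pfns this is O(N) work on one cpu, which is why it
 * dominates on a 1TB machine. */
static void create_dump_bitmap_sketch(unsigned char *bitmap, uint64_t max_pfn,
				      int dump_level)
{
	uint64_t pfn;

	memset(bitmap, 0, (max_pfn + BITS_PER_BYTE - 1) / BITS_PER_BYTE);

	/* one strictly sequential pass over every pfn */
	for (pfn = 0; pfn < max_pfn; pfn++) {
		if (page_is_dumpable(pfn, dump_level))
			set_bit_on_bitmap(bitmap, pfn);
	}
}

int main(void)
{
	enum { MAX_PFN = 1024 };	/* toy range; a 1TB box has ~256M pfns */
	unsigned char bitmap[MAX_PFN / BITS_PER_BYTE];

	create_dump_bitmap_sketch(bitmap, MAX_PFN, 31);
	return 0;
}

Splitting that pfn range across cpus is the obvious parallelization point,
which is why the maxcpus=1 restriction hurts here.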
I was running makedumpfile version 1.2.6, and the time consumed broke down as follows:

  create_dump_bitmap            16hr 25min
  excluding unnecessary pages        28min
  write_kdump_pages                   2min
  Copying data                       19min

I reserved 3960M of memory for the kdump kernel.

Regards,
- jay

>
> Vivek
>
>> Thoughts?
>>
>> Neil
>>
>> --
>> /***************************************************
>> *Neil Horman
>> *Senior Software Engineer
>> *Red Hat, Inc.
>> *nhorman at redhat.com
>> *gpg keyid: 1024D / 0x92A74FA1
>> *http://pgp.mit.edu
>> ***************************************************/
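P.S. On the DIO point Vivek raised above: a minimal sketch of what opening the
dump target with O_DIRECT could look like, assuming a local file on Linux. The
path, the 4096-byte alignment and the helper names are purely illustrative and
not makedumpfile code; O_DIRECT also needs a filesystem that supports it
(tmpfs does not) and block-aligned buffers and write lengths.

#define _GNU_SOURCE		/* O_DIRECT */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define ALIGN_SIZE 4096		/* assumed block alignment for O_DIRECT */

/* Open the dump file so writes bypass the page cache entirely. */
static int open_dump_target_direct(const char *path)
{
	return open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
}

/* buf must come from posix_memalign(&p, ALIGN_SIZE, len) and len must be
 * a multiple of ALIGN_SIZE, or the write fails with EINVAL. */
static ssize_t write_dump_chunk(int fd, const void *buf, size_t len)
{
	return write(fd, buf, len);
}

int main(void)
{
	void *buf;
	int fd;

	if (posix_memalign(&buf, ALIGN_SIZE, ALIGN_SIZE))
		return 1;
	memset(buf, 0, ALIGN_SIZE);	/* stand-in for a prepared/compressed page */

	fd = open_dump_target_direct("./vmcore-odirect.test");
	if (fd < 0)
		return 1;
	if (write_dump_chunk(fd, buf, ALIGN_SIZE) != ALIGN_SIZE)
		return 1;
	close(fd);
	free(buf);
	return 0;
}

The win would simply be that vmcore data never fills the page cache of the
kdump kernel, whose reserved memory is small to begin with.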