On Wed, Jul 16, 2008 at 12:23:43PM -0400, Vivek Goyal wrote:
> On Wed, Jul 16, 2008 at 11:25:44AM -0400, Neil Horman wrote:
> > On Wed, Jul 16, 2008 at 11:12:40AM -0400, Vivek Goyal wrote:
> > > On Tue, Jul 15, 2008 at 06:07:40PM -0700, Jay Lan wrote:
> > > > Are there known problems if you boot up the kdump kernel with
> > > > multiple cpus?
> > > >
> > >
> > > I had run into one issue, and that was that some systems would get
> > > reset and jump to the BIOS.
> > >
> > > The reason was that the kdump kernel can boot on a non-boot cpu. When
> > > it tries to bring up other cpus it sends INIT, and a non-boot cpu
> > > sending INIT to the "boot" cpu was not acceptable (as per Intel
> > > documentation), so it re-initialized the system.
> > >
> > > I am not sure how many systems are affected by this behavior. Hence
> > > the reason for using maxcpus=1.
> > >
> > +1, there are a number of multi-cpu issues with kdump. I've seen some
> > systems where you simply can't re-initialize a halted cpu from
> > software, which causes problems/hangs.
> >
> > > > It takes an unacceptably long time to run makedumpfile when
> > > > saving a dump on a huge memory system. In my testing it
> > > > took 16hr25min to run create_dump_bitmap() on a 1TB system.
> > > > Pfns are processed sequentially with a single cpu. We
> > > > certainly can use multiple cpus here ;)
> > >
> > > This is certainly a very long time. How much memory have you reserved
> > > for the kdump kernel?
> > >
> > > I had run some tests on an x86_64 128GB RAM system and it took me 4
> > > minutes to filter and save the core (maximum filtering level of 31).
> > > I had reserved 128MB of memory for the kdump kernel.
> > >
> > > I think something else is seriously wrong here. 1TB is almost 10
> > > times 128GB, and even if the time scales linearly it should not take
> > > more than 40mins.
> > >
> > > You need to dive deeper to find out what is taking so much time.
> > >
> > > CCing kenichi.
> > >
> > You know, we might be able to get speedups in makedumpfile without the
> > use of additional cpus. One of the things that concerned me when I read
> > this was the use of dump targets that need to be sequential, i.e.
> > multiple processes writing to a local disk make good sense, but not so
> > much if you're dumping over an scp connection (we don't want to
> > re-order those writes). The makedumpfile work cycle, from 30,000 feet,
> > goes something like:
> >
> > 1) Inspect a page
> > 2) Decide to filter the page
> > 3) if (2) goto 1
> > 4) else compress page
> > 5) write page to target
>
> I thought that it first creates the bitmap. So in the first pass it just
> decides which pages are to be dumped or filtered out and marks these in
> the bitmap.
>
> Then in the second pass it dumps all the pages sequentially, along with
> metadata, if any.
>
It might, but I don't think that's overly relevant, as I expect the major
cpu usage comes during compression and the major wall-clock time loss
occurs during I/O.

> >
> > I'm sure (4) is going to be the most cpu-intensive task, but I bet we
> > spend a lot of idle time waiting for I/O to complete (since I'm sure
> > we'll fill up the pagecache quickly). What if makedumpfile used AIO to
> > write out prepared pages to the dump target? That way we could at least
> > free up some cpu cycles to work more quickly on steps 2, 3, and 4.
> >
> If the above assumption is right, then AIO probably won't help, as once
> we have marked the pages we have nothing to do but wait for completion.
>
I assume that we interleave page compression with I/O (i.e. compress a page
from the bitmap, write the page to disk, repeat). If that's the case, then
AIO would help, because the kernel (or another thread) can wait on I/O
completion while we continue on and compress another page.

It will also help if a single context is unable to fill the I/O pipeline.
IIRC multiple AIO requests can be in flight at the same time, maximizing
I/O bandwidth. And we can decide at the application level whether our dump
target will allow parallel I/O.
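To make that concrete, here's a rough, untested sketch of the sort of thing
this could look like. It is not makedumpfile's real code; next_dumpable_pfn(),
read_page() and compress_page() are made-up stand-ins for whatever the tool
actually does. It just double-buffers the compressed output with POSIX AIO
(aio_write/aio_suspend, link with -lrt) so we keep compressing one page while
the previous page's write is still in flight:

#include <aio.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

#define PAGE_SZ 4096

/* Hypothetical helpers, standing in for the real makedumpfile logic. */
extern long next_dumpable_pfn(void);         /* returns -1 when bitmap is exhausted */
extern void read_page(long pfn, char *buf);  /* copy a page out of /proc/vmcore */
extern size_t compress_page(const char *in, char *out);  /* e.g. zlib compress2() */

static int wait_for(struct aiocb *cb)
{
	const struct aiocb *list[1] = { cb };

	while (aio_error(cb) == EINPROGRESS)
		aio_suspend(list, 1, NULL);
	return aio_return(cb) < 0 ? -1 : 0;
}

int dump_pages(int outfd)
{
	static char cbuf[2][PAGE_SZ * 2];   /* two compression output buffers */
	char raw[PAGE_SZ];
	struct aiocb cb[2];
	int inflight[2] = { 0, 0 };
	off_t off = 0;
	int cur = 0;
	long pfn;

	memset(cb, 0, sizeof(cb));

	while ((pfn = next_dumpable_pfn()) != -1) {
		/* Before reusing a buffer, wait for its previous write. */
		if (inflight[cur]) {
			if (wait_for(&cb[cur]) < 0)
				return -1;
			inflight[cur] = 0;
		}

		/* CPU work: the filter decision was already made via the bitmap. */
		read_page(pfn, raw);
		size_t clen = compress_page(raw, cbuf[cur]);

		/* Queue the write and immediately go compress the next page. */
		cb[cur].aio_fildes = outfd;
		cb[cur].aio_buf    = cbuf[cur];
		cb[cur].aio_nbytes = clen;
		cb[cur].aio_offset = off;
		if (aio_write(&cb[cur]) < 0)
			return -1;
		inflight[cur] = 1;
		off += clen;
		cur ^= 1;                   /* flip to the other buffer */
	}

	/* Drain whatever is still in flight before returning. */
	for (cur = 0; cur < 2; cur++)
		if (inflight[cur] && wait_for(&cb[cur]) < 0)
			return -1;
	return 0;
}

A writer thread or libaio (io_submit) would accomplish the same thing, and
more than two buffers would let several requests sit in flight at once; the
point is only that the compressor never sits idle waiting on the disk.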
> DIO might help a bit because we need not fill the page cache, as we are
> not going to need the vmcore pages again.
>
We currently do something similar to this in RHEL. The kdump initrd reduces
dirty_ratio to almost zero, effectively creating a DIO environment. Numbers
from there would give us an idea of how that performs.

> In case of Jay, it looks like creating the bitmaps itself took a long
> time.
>
Do you have data for this? I've not seen it.
Neil

> Vivek
>
> > Thoughts?
> >
> > Neil
> >
> > --
> > /***************************************************
> > *Neil Horman
> > *Senior Software Engineer
> > *Red Hat, Inc.
> > *nhorman at redhat.com
> > *gpg keyid: 1024D / 0x92A74FA1
> > *http://pgp.mit.edu
> > ***************************************************/

--
/***************************************************
*Neil Horman
*Senior Software Engineer
*Red Hat, Inc.
*nhorman at redhat.com
*gpg keyid: 1024D / 0x92A74FA1
*http://pgp.mit.edu
***************************************************/