Hi Zhou Wenjian and Kumagai,

Following Zhou Wenjian's advice I ran some tests, and with "-c",
makedumpfile 1.5.9 does perform better than with "-l". I have done more
tests on a machine with 128G of memory: with "-d 0" and "-d 3",
makedumpfile 1.5.9 performs well, but with "--num-threads 1" it needs
more time than without "--num-threads".

Here are my results (makedumpfile -c):

"-d 0" (vmcore size: 2.6G):

  --num-threads    time (seconds)
        0               556
        1              1186
        4               307
        8               186
       12               131
       16               123

"-d 3" (vmcore size: 1.3G):

  --num-threads    time (seconds)
        0               141
        1               262
        2               137
        4                91
        8               121
       16               137

So I think makedumpfile 1.5.9 can save time with "-c", as long as
neither "-d 31" nor "--num-threads 1" is used.

----- Original Message -----
> From: "Wenjian Zhou" <zhouwj-fnst at cn.fujitsu.com>
> To: "Atsushi Kumagai" <ats-kumagai at wm.jp.nec.com>
> Cc: kexec at lists.infradead.org
> Sent: Friday, December 4, 2015 11:33:36 AM
> Subject: Re: [PATCH RFC 00/11] makedumpfile: parallel processing
>
> Hello Kumagai,
>
> On 12/04/2015 10:30 AM, Atsushi Kumagai wrote:
> > Hello, Zhou
> >
> >> On 12/02/2015 03:24 PM, Dave Young wrote:
> >>> Hi,
> >>>
> >>> On 12/02/15 at 01:29pm, "Zhou, Wenjian" wrote:
> >>>> I think there is no problem if the other test results are as
> >>>> expected.
> >>>>
> >>>> --num-threads mainly reduces the compression time.
> >>>> So for lzo, it can't help much most of the time.
> >>>
> >>> The help text of --num-threads does not seem to say this exactly:
> >>>
> >>>   [--num-threads THREADNUM]:
> >>>       Using multiple threads to read and compress the data of each
> >>>       page in parallel will reduce the time for saving DUMPFILE.
> >>>       This feature only supports creating DUMPFILE in
> >>>       kdump-compressed format from VMCORE in kdump-compressed
> >>>       format or elf format.
> >>>
> >>> lzo is also a compression method; it should be mentioned that
> >>> --num-threads only supports zlib-compressed vmcores.
> >>>
> >>
> >> Sorry, it seems that something I said was not clear.
> >> lzo is also supported. But since lzo compresses data at high speed,
> >> the performance improvement is usually not very noticeable.
> >>
> >>> It is also worth mentioning the recommended -d value for this
> >>> feature.
> >>>
> >>
> >> Yes, I think it's worth mentioning. I forgot it.
> >
> > I saw your patch, but I think I should confirm what the problem is
> > first.
> >
> >> However, when "-d 31" is specified, it will be worse.
> >> Fewer than 50 buffers are used to cache the compressed pages,
> >> and even a page that has been filtered out takes a buffer.
> >> So if "-d 31" is specified, the filtered pages will use a lot
> >> of buffers, and then the pages which need to be compressed can't
> >> be compressed in parallel.
> >
> > Could you explain in more detail why compression will not be
> > parallel? The buffers being used for filtered pages as well sounds
> > inefficient, but I don't understand why it prevents parallel
> > compression.
> >
>
> Consider this: on a machine with huge memory, most of the pages will
> be filtered, and suppose we have 5 buffers:
>
>   page1       page2     page3     page4     page5     page6       page7 ...
>   [buffer1]   [2]       [3]       [4]       [5]
>   unfiltered  filtered  filtered  filtered  filtered  unfiltered  filtered
>
> Since a filtered page also takes a buffer, page6 can't be compressed
> while page1 is being compressed.
> That's why it prevents parallel compression.
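If I read the above correctly, the behavior can be simulated like this
(a minimal sketch in C; NR_BUFFERS, is_filtered and the other names are
invented for illustration, not the actual makedumpfile identifiers, and
the real code is of course threaded):

  #include <stdio.h>
  #include <stdbool.h>

  #define NR_BUFFERS 5   /* buffers caching pages, as in the diagram */
  #define NR_PAGES   7

  /* In the example only page1 and page6 need compression. */
  static bool is_filtered(int page) { return page != 1 && page != 6; }

  int main(void)
  {
      int in_flight = 0;      /* pages currently holding a buffer   */
      int compressible = 0;   /* how many of those need compression */

      for (int page = 1; page <= NR_PAGES; page++) {
          if (in_flight == NR_BUFFERS) {
              printf("page%d must wait: all %d buffers busy, only %d "
                     "of them hold pages that need compression\n",
                     page, NR_BUFFERS, compressible);
              break;
          }
          in_flight++;        /* filtered or not, a slot is consumed */
          if (!is_filtered(page))
              compressible++;
          printf("page%d -> buffer%d (%s)\n", page, in_flight,
                 is_filtered(page) ? "filtered" : "needs compression");
      }
      return 0;
  }

Pages 1-5 fill all five buffers, so page6 has to wait although only
page1 keeps a compression thread busy; the other threads sit idle.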
> > Further, according to Chao's benchmark, there is a big performance
> > degradation even if the number of threads is 1 (58s vs 240s).
> > The current implementation seems to have some problems; we should
> > solve them.
> >
>
> If "-d 31" is specified, then on the one hand we can't save time
> through parallel compression, and on the other hand "--num-threads"
> introduces some extra work. So it is obvious that there will be some
> performance degradation.
>
> I'm not so sure whether the degradation being this big is itself a
> problem. If the other cases work as expected, I think this is not a
> problem (or not a problem that needs to be fixed), since the
> degradation exists in theory anyway.
>
> Otherwise, the current implementation should be replaced by a new
> algorithm. For example: we can add an array recording whether each
> page is filtered or not, so that only unfiltered pages take a buffer.
>
> But I'm not sure it is worth it.
> Since "-l -d 31" is already fast enough, the new algorithm can't help
> much there either.
>
> --
> Thanks
> Zhou
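Zhou's proposed bookkeeping would, as I read it, look roughly like this
(again only a sketch with invented names, not a patch against
makedumpfile):

  #include <stdio.h>
  #include <stdbool.h>

  #define NR_BUFFERS 5
  #define NR_PAGES   7

  static bool is_filtered(int page) { return page != 1 && page != 6; }

  int main(void)
  {
      /* The proposed array: one flag per page, set while filtering. */
      bool filtered[NR_PAGES + 1];
      int in_flight = 0;

      for (int page = 1; page <= NR_PAGES; page++) {
          filtered[page] = is_filtered(page);
          if (filtered[page])
              continue;           /* filtered pages take no buffer */
          if (in_flight < NR_BUFFERS) {
              in_flight++;
              printf("page%d -> buffer%d (compressed in parallel)\n",
                     page, in_flight);
          }
      }
      return 0;
  }

With the same input as before, page1 and page6 now hold buffer1 and
buffer2 at the same time, so two compression threads can run in
parallel; the writer would still consult filtered[] to emit the pages
in their original order.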