On 02/15/2016 10:15 AM, "Zhou, Wenjian/???" wrote: > Hello Usui, > > Thanks very much for your comments. > And sorry for the late reply. > > See below. > > On 02/08/2016 01:00 PM, Minoru Usui wrote: >> Hello, >> >>> -----Original Message----- >>> From: kexec [mailto:kexec-bounces at lists.infradead.org] On Behalf Of Minoru Usui >>> Sent: Thursday, February 04, 2016 8:52 AM >>> To: Zhou Wenjian <zhouwj-fnst at cn.fujitsu.com>; kexec at lists.infradead.org >>> Subject: Re: [PATCH v1] Improve the performance of --num-threads -d 31 >>> >>> Hi, Zhou >>> >>> I have some comments. >>> I'm sorry if I have misunderstood your code. >>> >>>> -----Original Message----- >>>> From: kexec [mailto:kexec-bounces at lists.infradead.org] On Behalf Of Zhou Wenjian >>>> Sent: Monday, February 01, 2016 3:22 PM >>>> To: kexec at lists.infradead.org >>>> Subject: [PATCH v1] Improve the performance of --num-threads -d 31 >>>> >>>> v1: >>>> 1. change page_flag.ready's value to enum >>>> 2. change the patch description >>>> 3. cleanup some codes >>>> 4. fix a bug in cyclic mode >>>> >>>> multi-threads implementation will introduce extra cost when handling >>>> each page. The origin implementation will also do the extra work for >>>> filtered pages. So there is a big performance degradation in >>>> --num-threads -d 31. >>>> The new implementation won't do the extra work for filtered pages any >>>> more. So the performance of -d 31 is close to that of serial processing. >>>> >>>> The new implementation is just like the following: >>>> * The basic idea is producer producing page and consumer writing page. >>>> * Each producer have a page_flag_buf list which is used for storing >>>> page's description. >>>> * The size of page_flag_buf is little so it won't take too much memory. >>>> * And all producers will share a page_data_buf array which is >>>> used for storing page's compressed data. >>>> * The main thread is the consumer. It will find the next pfn and write >>>> it into file. >>>> * The next pfn is smallest pfn in all page_flag_buf. >>>> >>>> Signed-off-by: Zhou Wenjian <zhouwj-fnst at cn.fujitsu.com> >>>> --- >>>> makedumpfile.c | 258 ++++++++++++++++++++++++++++++++++++++------------------- >>>> makedumpfile.h | 31 ++++--- >>>> 2 files changed, 193 insertions(+), 96 deletions(-) >>>> >>>> diff --git a/makedumpfile.c b/makedumpfile.c >>>> index fa0b779..0ecd065 100644 >>>> --- a/makedumpfile.c >>>> +++ b/makedumpfile.c >>>> @@ -3483,7 +3483,8 @@ initial_for_parallel() >>>> unsigned long page_data_buf_size; >>>> unsigned long limit_size; >>>> int page_data_num; >>>> - int i; >>>> + struct page_flag *current; >>>> + int i, j; >>>> >>>> len_buf_out = calculate_len_buf_out(info->page_size); >>>> >>>> @@ -3562,8 +3563,10 @@ initial_for_parallel() >>>> - MAP_REGION * info->num_threads) * 0.6; >>>> >>>> page_data_num = limit_size / page_data_buf_size; >>>> + info->num_buffers = 3 * info->num_threads; >>>> >>>> - info->num_buffers = MIN(NUM_BUFFERS, page_data_num); >>>> + info->num_buffers = MAX(info->num_buffers, NUM_BUFFERS); >>>> + info->num_buffers = MIN(info->num_buffers, page_data_num); >>>> >>>> DEBUG_MSG("Number of struct page_data for produce/consume: %d\n", >>>> info->num_buffers); >>>> @@ -3588,6 +3591,36 @@ initial_for_parallel() >>>> } >>>> >>>> /* >>>> + * initial page_flag for each thread >>>> + */ >>>> + if ((info->page_flag_buf = malloc(sizeof(void *) * info->num_threads)) >>>> + == NULL) { >>>> + MSG("Can't allocate memory for page_flag_buf. %s\n", >>>> + strerror(errno)); >>>> + return FALSE; >>>> + } >>>> + memset(info->page_flag_buf, 0, sizeof(void *) * info->num_threads); >>>> + >>>> + for (i = 0; i < info->num_threads; i++) { >>>> + if ((info->page_flag_buf[i] = malloc(sizeof(struct page_flag))) == NULL) { >>> >>> Fist element of struct page_flag in circular list is allocated by malloc(), >>> but other elements are allocated by calloc().(see below) >>> I think both elements should be allocated by calloc(). >>> > > Yes, you are right. > I have made a mistake. > >>>> + MSG("Can't allocate memory for page_flag_buf. %s\n", >>>> + strerror(errno)); >>>> + return FALSE; >>>> + } >>>> + current = info->page_flag_buf[i]; >>>> + >>>> + for (j = 1; j < NUM_BUFFERS; j++) { >>>> + if ((current->next = calloc(0, sizeof(struct page_flag))) == NULL) { >>>> + MSG("Can't allocate memory for data of page_data_buf. %s\n", >>>> + strerror(errno)); >>>> + return FALSE; >>>> + } >>> >>> >>> First argument of calloc() should be 1, not 0. >>> And there is typo in error message. >>> Allocated element is not page_data_buf. >>> > > I agree. > >>>> + current = current->next; >>>> + } >>>> + current->next = info->page_flag_buf[i]; >>>> + } >>>> + >>>> + /* >>>> * initial fd_memory for threads >>>> */ >>>> for (i = 0; i < info->num_threads; i++) { >>>> @@ -3612,7 +3645,8 @@ initial_for_parallel() >>>> void >>>> free_for_parallel() >>>> { >>>> - int i; >>>> + int i, j; >>>> + struct page_flag *current; >>>> >>>> if (info->threads != NULL) { >>>> for (i = 0; i < info->num_threads; i++) { >>>> @@ -3655,6 +3689,19 @@ free_for_parallel() >>>> free(info->page_data_buf); >>>> } >>>> >>>> + if (info->page_flag_buf != NULL) { >>>> + for (i = 0; i < info->num_threads; i++) { >>>> + for (j = 0; j < NUM_BUFFERS; j++) { >>>> + if (info->page_flag_buf[i] != NULL) { >>>> + current = info->page_flag_buf[i]; >>>> + info->page_flag_buf[i] = current->next; >>>> + free(current); >>>> + } >>>> + } >>>> + } >>>> + free(info->page_flag_buf); >>>> + } >>>> + >>>> if (info->parallel_info == NULL) >>>> return; >>>> >>>> @@ -7076,10 +7123,10 @@ kdump_thread_function_cyclic(void *arg) { >>>> void *retval = PTHREAD_FAIL; >>>> struct thread_args *kdump_thread_args = (struct thread_args *)arg; >>>> struct page_data *page_data_buf = kdump_thread_args->page_data_buf; >>>> + struct page_flag *page_flag_buf = kdump_thread_args->page_flag_buf; >>>> struct cycle *cycle = kdump_thread_args->cycle; >>>> - int page_data_num = kdump_thread_args->page_data_num; >>>> mdf_pfn_t pfn; >>>> - int index; >>>> + int index = kdump_thread_args->thread_num; >>>> int buf_ready; >>>> int dumpable; >>>> int fd_memory = 0; >>>> @@ -7125,47 +7172,47 @@ kdump_thread_function_cyclic(void *arg) { >>>> kdump_thread_args->thread_num); >>>> } >>>> >>>> - while (1) { >>>> - /* get next pfn */ >>>> - pthread_mutex_lock(&info->current_pfn_mutex); >>>> - pfn = info->current_pfn; >>>> - info->current_pfn++; >>>> - pthread_mutex_unlock(&info->current_pfn_mutex); >>>> + /* >>>> + * filtered page won't take anything >>>> + * unfiltered zero page will only take a page_flag_buf >>>> + * unfiltered non-zero page will take a page_flag_buf and a page_data_buf >>>> + */ >>>> + while (page_flag_buf->pfn < kdump_thread_args->end_pfn) { >>> >>> At first, page_flag_buf->pfn is not initialized. >>> I think this block should be replaced with the following code. >>> >>> === >>> do { >>> : >>> } while(page_flag_buf->pfn < kdump_thread_args->end_pfn) >>> === >> >> I'm sorry, above suggestion is meaningless in terms of page_flag_buf->pfn is uninitialized. >> It should be replaced like following. >> >> === >> while (1) { >> : >> while (buf_ready == FALSE) { >> : >> if (pfn >= kdump_thread_args->end_pfn) { >> : >> goto finish; >> } >> : >> } >> : >> } >> finish: >> === >> > > page_flag_buf is allocated by calloc(). > The page_flag_buf->pfn's value is 0. > So I think it is not necessary to modify the code. > >> Thanks, >> Minoru Usui >> >> >>>> + buf_ready = FALSE; >>>> >>>> - if (pfn >= kdump_thread_args->end_pfn) >>>> - break; >>>> + while (page_data_buf[index].used != 0 || >>>> + pthread_mutex_trylock(&page_data_buf[index].mutex) != 0) >>>> + index = (index + 1) % info->num_buffers; >>>> >>>> - index = -1; >>>> - buf_ready = FALSE; >>>> + page_data_buf[index].used = 1; >>> >>> "1" is a magic number. >>> It should be changed TRUE or FALSE. >>> > > I see. > >>>> while (buf_ready == FALSE) { >>>> pthread_testcancel(); >>>> - >>>> - index = pfn % page_data_num; >>>> - >>>> - if (pfn - info->consumed_pfn > info->num_buffers) >>>> + if (page_flag_buf->ready == FLAG_READY) >>>> continue; >>> >>> At first, page_flag_buf->ready is uninitialized, too. >>> Should it be initialized in head part of this function, even if FLAG_UNUSED is defined 0? >>> >>> > > The same topic as the page_flag_buf is allocated by calloc(). > >>>> >>>> - if (page_data_buf[index].ready != 0) >>>> - continue; >>>> - >>>> - pthread_mutex_lock(&page_data_buf[index].mutex); >>>> + /* get next pfn */ >>>> + pthread_mutex_lock(&info->current_pfn_mutex); >>>> + pfn = info->current_pfn; >>>> + info->current_pfn++; >>>> + page_flag_buf->ready = FLAG_FILLING; >>>> + pthread_mutex_unlock(&info->current_pfn_mutex); >>>> >>>> - if (page_data_buf[index].ready != 0) >>>> - goto unlock; >>>> + page_flag_buf->pfn = pfn; >>> >>> It set FLAG_FILLING to page_flag_buf->ready before setting pfn to page_flag_buf->pfn. >>> But consumer gets page_flag_buf->pfn after checking page_flag_buf->ready != FLAG_UNUSED >>> in getting minimum pfn of each thread block. >>> Should it set page_flag_buf->pfn first? >>> > > Have you noticed the following code in the consumer? > <cut> > if (current_pfn == info->page_flag_buf[consuming]->pfn) > break; > <cut> > > The consumer will check if the pfn is changed after the page_flag_buf->ready turns to be FLAG_READY. > So it's not important whether setting page_flag_buf->pfn first or not. > > In the other hand, even setting page_flag_buf->pfn first, if the pfn is not dumpable, the producer > will also reset the page_flag_buf->pfn. > >>>> >>>> - buf_ready = TRUE; >>>> - >>>> - page_data_buf[index].pfn = pfn; >>>> - page_data_buf[index].ready = 1; >>>> + if (pfn >= kdump_thread_args->end_pfn) { >>>> + page_data_buf[index].used = 0; >>>> + page_flag_buf->ready = FLAG_READY; >>>> + info->current_pfn--; >>>> + break; >>>> + } >>> >>> This block decrements info->current_pfn without info->current_pfn_mutex. >>> I think this block should be moved into previous pthread_mutex_lock(info->current_pfn_mutex) block, so it can remove. >>> > > Why do you think it should have current_pfn_mutex? > > If pfn >= kdump_thread_args->end_pfn, info->current_pfn will always larger than > kdump_thread_args->end_pfn. info->current_pfn-- won't affect anything. > > The decrement operation is for cyclic mode. > Sorry, it seems I was wrong. It can't work well in cyclic mode. I will fix it in the next version. -- Thanks Zhou