There have been local complaints that filtering out only zero pages
(dump level 1) is slow. I found that is_zero_page() was inefficient:
it checks whether the page contains any non-zero bytes one byte at a
time.

Improve performance by checking for non-zero data 64 bits at a time.
Also, unroll the loop for additional performance.

Tested in x86_64 mode on an Intel Xeon X5560 system with 18GB RAM.
Executed:

    time makedumpfile -d 1 /proc/vmcore <destination>

The amount of time taken in user space was reduced by 75%.
The total time to dump memory was reduced by 28%.

Signed-off-by: Marc Milgram <mmilgram at redhat.com>
---
diff --git a/makedumpfile.h b/makedumpfile.h
index 3d270c6..0f211c4 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1634,10 +1634,27 @@ static inline int
 is_zero_page(unsigned char *buf, long page_size)
 {
 	size_t i;
+	unsigned long long *vect = (unsigned long long *) buf;
+	long page_len = page_size / (sizeof(unsigned long long));
 
-	for (i = 0; i < page_size; i++)
-		if (buf[i])
+	for (i = 0; i < page_len; i+=8) {
+		if (vect[i])
 			return FALSE;
+		if (vect[i+1])
+			return FALSE;
+		if (vect[i+2])
+			return FALSE;
+		if (vect[i+3])
+			return FALSE;
+		if (vect[i+4])
+			return FALSE;
+		if (vect[i+5])
+			return FALSE;
+		if (vect[i+6])
+			return FALSE;
+		if (vect[i+7])
+			return FALSE;
+	}
 	return TRUE;
 }
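
For anyone who wants to sanity-check the word-at-a-time idea outside of
makedumpfile, below is a minimal standalone sketch. It is not part of the
patch. The names is_zero_page_bytewise() and is_zero_page_wordwise() are
hypothetical, and TRUE/FALSE are defined locally. Unlike the patch, the
sketch loads each word with memcpy(), so it does not assume that buf is
8-byte aligned, and it scans any leftover tail bytes one at a time. The
unrolled loop in the patch reads eight 64-bit words per iteration, so it
relies on page_size being a multiple of 64 bytes, which holds for the usual
page sizes.

```c
#include <string.h>

#define TRUE  1
#define FALSE 0

/* Baseline: the byte-at-a-time scan that the patch replaces. */
static int
is_zero_page_bytewise(const unsigned char *buf, long page_size)
{
	long i;

	for (i = 0; i < page_size; i++)
		if (buf[i])
			return FALSE;
	return TRUE;
}

/*
 * Sketch of the 64-bit-at-a-time check (hypothetical helper, not the
 * patched function). memcpy() into a local word avoids any alignment
 * assumption about buf; compilers typically turn it into a plain load.
 */
static int
is_zero_page_wordwise(const unsigned char *buf, long page_size)
{
	unsigned long long word;
	long i = 0;

	/* Full 8-byte words. */
	for (; i + (long)sizeof(word) <= page_size; i += (long)sizeof(word)) {
		memcpy(&word, buf + i, sizeof(word));
		if (word)
			return FALSE;
	}

	/* Tail bytes, if page_size is not a multiple of 8. */
	for (; i < page_size; i++)
		if (buf[i])
			return FALSE;
	return TRUE;
}
```

Both helpers should agree on every input; comparing them on an all-zero
buffer and on buffers with a single non-zero byte at the first, last, and
an unaligned offset is a quick way to check that.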